[strongSwan] parallel ipsec processing / strongswan performance
Ruslan Kalakutsky
r.kalakutsky at gmail.com
Sun Mar 20 16:42:33 CET 2016
Hello,
I would be grateful if someone could explain to me how to enable
parallel ipsec processing and improve strongswan performance.
I have a testing system on AWS:
CentOS 7
$ uname -r
3.10.0-327.10.1.el7.x86_64
$ yum info
Installed Packages
Name : strongswan
Arch : x86_64
Version : 5.3.2
Release : 1.el7
Size : 2.9 M
Repo : installed
From repo : epel
$ strongswan --version
Linux strongSwan U5.3.2/K3.10.0-327.10.1.el7.x86_64
On this server the load-tester responder is configured:
/etc/strongswan/ipsec.conf:
config setup
charondebug=3
strictcrlpolicy=yes
uniqueids = never
conn %default
# default 3h
ikelifetime=120m
# IPsec SA expires, default 1h
keylife=60m
# Time before SA expiry the rekeying should start
rekeymargin=3m
# how many attempts, default 3
keyingtries=1
conn ikev2-with-eap-loadtest
keyexchange=ikev2
leftsubnet=0.0.0.0/0
leftfirewall=yes
leftid="CN=srv, OU=load-test, O=strongSwan"
leftauth=pubkey
leftcert=resp.pem
right=%any
rightsourceip=10.0.0.0/9
rightsendcert=never
rightauth=eap-mschapv2
eap_identity=%any
auto=add
/etc/strongswan/strongswan.d/charon.conf:
charon {
dos_protection = no
half_open_timeout = 30
ikesa_table_segments = 2560
ikesa_table_size = 32
reuse_ikesa = no
threads = 5120
processor {
priority_threads {
high = 2
medium = 4
}
}
}
...
As you can see, the IKE_SA table is tuned (it is supposed to support
50K connections), DoS protection is disabled, the number of threads is
increased, etc., as advised at:
* https://wiki.strongswan.org/projects/strongswan/wiki/IkeSaTable
* https://wiki.strongswan.org/projects/strongswan/wiki/JobPriority
* https://wiki.strongswan.org/projects/strongswan/wiki/ExpiryRekey
On the client side (a few Debian boxes with strongSwan
U5.1.2/K3.13.0-74-generic):
/etc/strongswan.d/charon/load-tester.conf
load-tester {
child_rekey = 1200
delay = 500
delete_after_established = no
dpd_delay = 0
eap_password = testpwd
enable = yes
ike_rekey = 0
init_limit = 500000
initiator_auth = eap-mschap
initiator_id = loadtest-%d
issuer_cert = /etc/ipsec.d/cacerts/cacert.pem
ca_dir = /etc/ipsec.d/cacerts/
load = yes
mode = tunnel
proposal = aes128-sha1-modp2048
request_virtual_ip = yes
responder = 172.31.128.6
responder_auth = pubkey
shutdown_when_complete = yes
version = 0
addrs {
}
}
charon.conf:
charon {
threads = 10240 (
...
and the tests are run like this:
# ipsec load-tester initiate 7000 200
I've reviewed this document:
https://www.strongswan.org/docs/Steffen_Klassert_Parallelizing_IPsec.pdf
and tried the pcrypt module (from here:
https://wiki.strongswan.org/projects/strongswan/wiki/Pcrypt):
$ sudo modprobe pcrypt
$ sudo modprobe tcrypt alg="pcrypt(authenc(hmac(sha1),cbc(aes)))" type=3
The last command fails (as mentioned at the link above).
I believe it corresponds to the proposal in the load-tester
configuration: `aes128-sha1-modp2048`.
I also did some network tuning:
$ sudo ip link set dev eth0 txqueuelen 2000
$ sudo sysctl -w net.core.somaxconn=2048
$ sudo sysctl -w net.core.rmem_max=16777216
$ sudo sysctl -w net.core.netdev_budget=600
So the situation is: I run load-tester initiators on 2-4 client
servers and look at the server statistics.
1) All interrupts are handled by one CPU:
$ cat /proc/interrupts | grep -E 'CPU|eth0'
         CPU0  CPU1  CPU2  CPU3  CPU4  CPU5  CPU6  CPU7
 98:   246344     0     0     0     0     0     0     0   xen-dyn-event   eth0
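To make the imbalance easier to track between test runs, the per-CPU counts could be summed with a small helper. This is only a sketch: the function name `irq_per_cpu` and its device argument are made up for illustration, and it assumes the usual /proc/interrupts layout (a header row of CPU names, then one row per IRQ).

```shell
#!/bin/sh
# Hypothetical helper: sum interrupt counts per CPU for every IRQ row
# whose text contains the given device name.
# Reads /proc/interrupts-formatted text from stdin.
irq_per_cpu() {
    awk -v dev="$1" '
        # Header row: one field per CPU (CPU0, CPU1, ...).
        NR == 1 { ncpu = NF; for (i = 1; i <= NF; i++) name[i] = $i; next }
        # IRQ row: field 1 is the IRQ label, fields 2..ncpu+1 are the counts.
        index($0, dev) { for (i = 1; i <= ncpu; i++) sum[i] += $(i + 1) }
        END { for (i = 1; i <= ncpu; i++) printf "%s %d\n", name[i], sum[i] }
    '
}
```

It could be run as `irq_per_cpu eth0 < /proc/interrupts`; with the output above it would report 246344 for CPU0 and 0 for every other CPU.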
2) htop shows that only one charon thread, on one CPU, is busy (100%);
all the others are idle.
3) I've experimented with the number of threads (from 128 up to 5120),
the interval between SA initiations, the backend where users are
stored (from a plaintext ipsec.secrets to RADIUS with SQL), and the
instance type of the server, but the results are all much the same:
only one CPU is working and, regardless of the settings, at about
**3500** concurrent connections the queue of half-open SAs starts to
grow and the clients start to see retransmissions. On the server side
it looks like this:
$ sudo swanctl -S
uptime: 18 minutes, since Mar 20 14:40:34 2016
worker threads: 5120 total, 0 idle, working: 6/2/2558/2554
job queues: 0/0/56/1732
jobs scheduled: 19203
IKE_SAs: 9764 total, 2845 half-open
mallinfo: sbrk 272531456, mmap 528384, used 238963392, free 33568064
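The share of half-open SAs can be pulled out of this output to watch when it starts to grow. A minimal sketch, assuming the `IKE_SAs:` line keeps the format shown above; the function name `half_open_ratio` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report the half-open share of IKE_SAs.
# Reads `swanctl -S` output from stdin and expects a line of the form
#   IKE_SAs: <total> total, <half-open> half-open
half_open_ratio() {
    awk '$1 == "IKE_SAs:" {
        total = $2; half = $4
        if (total > 0)
            printf "%d/%d half-open (%.1f%%)\n", half, total, 100 * half / total
    }'
}
```

For example, `sudo swanctl -S | half_open_ratio` on the output above would print `2845/9764 half-open (29.1%)`.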
$ sudo netstat -s
Ip:
180188 total packets received
0 forwarded
0 incoming packets discarded
180187 incoming packets delivered
90058 requests sent out
16 dropped because of missing route
...
Udp:
173971 packets received
1235 packets to unknown port received.
443 packet receive errors
86006 packets sent
0 receive buffer errors
0 send buffer errors
$ ip -s link
...
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast
state UP mode DEFAULT qlen 2000
link/ether 06:47:5b:74:9c:65 brd ff:ff:ff:ff:ff:ff
RX: bytes packets errors dropped overrun mcast
43746734 180568 0 0 0 0
TX: bytes packets errors dropped carrier collsns
33184525 90883 0 0 0 0
and in the log I have messages like this:
Mar 20 15:18:49 ip-172-31-128-6 charon: 4739[MGR] ignoring request with ID 6, already processing
Mar 20 15:18:49 ip-172-31-128-6 charon: 4739[MGR] ignoring request with ID 6, already processing
Mar 20 15:18:49 ip-172-31-128-6 charon: 4739[MGR] ignoring request with ID 6, already processing
Mar 20 15:18:49 ip-172-31-128-6 charon: 4739[MGR] ignoring request with ID 6, already processing
Mar 20 15:18:49 ip-172-31-128-6 charon: 4739[MGR] ignoring request with ID 6, already processing
Mar 20 15:18:49 ip-172-31-128-6 charon: 4739[MGR] ignoring request with ID 6, already processing
Mar 20 15:18:49 ip-172-31-128-6 charon: 4739[MGR] ignoring request with ID 6, already processing
Mar 20 15:18:49 ip-172-31-128-6 charon: 4739[MGR] ignoring request with ID 6, already processing
Mar 20 15:18:49 ip-172-31-128-6 charon: 4739[MGR] ignoring request with ID 6, already processing
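These messages presumably correspond to the client retransmissions mentioned above, so counting them per message ID gives a rough measure of how far the server is falling behind. A sketch only, assuming each log entry is on a single line in the actual log file; the function name `count_ignored_requests` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count "ignoring request ... already processing"
# log entries per IKE message ID. Reads log lines from stdin.
count_ignored_requests() {
    awk '/ignoring request with ID [0-9]+, already processing/ {
        # The message ID is the field right after the literal "ID".
        for (i = 1; i < NF; i++) {
            if ($i == "ID") {
                id = $(i + 1)
                sub(/,$/, "", id)   # drop the trailing comma
                count[id]++
                break
            }
        }
    }
    END { for (id in count) printf "ID %s: %d\n", id, count[id] }' | sort
}
```

It could be fed with the relevant log, e.g. `count_ignored_requests < /var/log/messages` (the log path is an assumption and depends on the syslog setup).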
So the questions are: how do I enable multi-CPU support, and how do I
break this 3.5K user limit? I mean, I can reach e.g. 12K users, but
from 3-4K users onwards there are significant performance issues,
while the server isn't really loaded.
Best regards,
Ruslan Kalakutsky