[strongSwan] Maximizing throughput / kernel bottlenecks

Wed Mar 16 00:38:52 CET 2016

I've recently migrated from KAME / racoon on Debian Stable and the
latest Ubuntu Server LTS to strongSwan, and like it a lot more in terms
of configuration and diagnostics. However, I haven't been able to eke
out much more in terms of throughput, which makes sense as both just
use the kernel crypto stack. Still, I'm wondering what speeds people
are getting in general.

To keep crypto overhead down, the IPsec tunnels are built in transport
mode with aes128/sha1 for IKE and aes128/md5 for IPsec; GRE is layered
on top of that to handle OSPF and BGP internally. On machines connected
via gig-e I'm getting between 150 and 200 Mbit/s on average over the
tunnels (900+ Mbit/s unencrypted). One of those machines also has a
tunnel over to my home via a consumer internet connection (10 Mbit/s
up, 50 Mbit/s down), but I'm getting relatively slow speeds: ~20 Mbit/s
through the tunnel. Unencrypted, transfers from the same host reach the
full 50 Mbit/s.
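
For reference, here is a rough sketch of how much per-packet overhead
this layering should add. It assumes IPv4 without options, AES-CBC with
a 16-byte IV, HMAC-MD5 truncated to 96 bits, a basic 4-byte GRE header
(no key or checksum), a 1500-byte MTU, and no NAT-T UDP encapsulation.
The function names are made up for illustration, and none of the
numbers are measurements.

```python
# Rough per-packet size for GRE inside ESP transport mode over IPv4.
# All header sizes are assumptions about the setup, not measurements.

OUTER_IP = 20     # IPv4 header without options
ESP_HDR = 8       # SPI (4) + sequence number (4)
AES_CBC_IV = 16   # per-packet IV for AES-CBC
AES_BLOCK = 16    # ciphertext is padded to a multiple of this
ESP_TRAILER = 2   # pad length (1) + next header (1)
ICV = 12          # HMAC-MD5-96 integrity check value
GRE_HDR = 4       # basic GRE header, no key/checksum/sequence


def outer_packet_len(inner_len):
    """IP-layer size of one inner IP packet after GRE + ESP transport."""
    plaintext = GRE_HDR + inner_len + ESP_TRAILER
    ciphertext = -(-plaintext // AES_BLOCK) * AES_BLOCK  # round up to a block
    return OUTER_IP + ESP_HDR + AES_CBC_IV + ciphertext + ICV


def max_inner_len(mtu=1500):
    """Largest inner packet whose encrypted form still fits in mtu."""
    room = mtu - OUTER_IP - ESP_HDR - AES_CBC_IV - ICV
    ciphertext = room // AES_BLOCK * AES_BLOCK  # round down to a block
    return ciphertext - ESP_TRAILER - GRE_HDR


def efficiency(inner_len):
    """Fraction of bytes on the wire that are inner-packet bytes."""
    return inner_len / outer_packet_len(inner_len)


if __name__ == "__main__":
    inner = max_inner_len()
    print("max inner packet:", inner, "bytes")
    print("outer packet:", outer_packet_len(inner), "bytes")
    print("efficiency:", round(efficiency(inner), 3))
    print("ceiling on a 50 Mbit/s link:", round(50 * efficiency(inner), 1))
```

Under those assumptions the header overhead for full-size packets is
only around 4%, so on its own it would not account for a drop from
50 Mbit/s to ~20 Mbit/s.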

The endpoint on the home internet connection is the router itself, not a
device behind a router. The router isn't beefy by any stretch, but it's
not under any load (load hovers around 0 99% of the time): an AMD A6400
with 8 GB of RAM, of which 226 MB is in use and another 700 MB is in
buffers. The AES-NI module is loaded, though I don't think it matters
either way. I've tested throughput multiple times with iperf, FTP, and
plain netcat, both inside and outside the GRE tunnel, with no other
traffic running at the same time.
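
On the AES-NI point, one way to check which implementation the kernel
is likely to use is to read /proc/crypto and compare priorities. The
sketch below assumes the usual "key : value" layout with a blank line
between entries, and that the highest-priority driver for a given
algorithm name is the one selected by default. The helper names are
hypothetical.

```python
def parse_proc_crypto(text):
    """Split /proc/crypto text into one dict per registered algorithm."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # skip lines that have no "key : value" shape
                entry[key.strip()] = value.strip()
        if entry:
            entries.append(entry)
    return entries


def best_driver(entries, name):
    """Return the driver with the highest priority for algorithm `name`."""
    candidates = [e for e in entries if e.get("name") == name]
    if not candidates:
        return None
    best = max(candidates, key=lambda e: int(e.get("priority", 0)))
    return best.get("driver")


if __name__ == "__main__":
    try:
        with open("/proc/crypto") as f:
            table = parse_proc_crypto(f.read())
    except OSError:
        table = []  # not on Linux, or the file is not readable
    # Algorithms matching the aes128/md5 ESP proposal (assumed names).
    for algo in ("cbc(aes)", "hmac(md5)"):
        print(algo, "->", best_driver(table, algo))
```

An algorithm only shows up once it has been registered, so a result of
None may just mean the SA has not been set up yet or the relevant
module is not loaded.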

Any idea why the throughput is cut so badly to the home internet router?
I expect some overhead, but it seems odd not to get at least 40 Mbit/s.
Also, is there any way to improve the throughput between the two gig-e
connected machines? The drop seems drastic compared to the unencrypted
throughput. The only thing I haven't done is migrate to IKEv2, which is
on the roadmap but not implemented yet due to some legacy requirements.
I can't imagine that would actually affect throughput, though, as this
seems to be a kernel bottleneck.

hose