[strongSwan] Maximizing throughput / kernel bottlenecks

Hose hose+strongswan at bluemaggottowel.com
Mon Apr 11 20:00:01 CEST 2016


What you say...Martin Willi (martin at strongswan.org):

> Hi,
> 
> > There is no appreciable load on any of the systems
> > during throughput testing.
> 
> Please note that IPsec is usually processed in soft IRQ, so have a look
> at the "si" field in top. If you are CPU bound, "perf" is very powerful
> in analyzing the bottleneck on productive systems. If you are not CPU
> bound, something else is probably wrong (packet loss, etc.).

I'll have to look into perf; my si figure isn't really going that high.
Its maximum during the tests is 13.4%. That doesn't seem excessive, though it's not great either.
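
For the record, this is roughly how I've been watching the soft IRQ load
and where I plan to start with perf; nothing exotic, just the standard
tools (the 30 second sample length is arbitrary):

    # per-core soft IRQ load; %soft is the same thing top calls "si",
    # and top's summary line averages it over all CPUs unless you press '1',
    # so a single saturated core can hide behind a low aggregate number
    mpstat -P ALL 1

    # sample the whole system while traffic is flowing through the tunnel,
    # then see where the CPU time actually goes (crypto, xfrm, NIC driver, ...)
    perf record -a -g -- sleep 30
    perf report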

> > I've read that aes-gcm has been built to scale to 10ge and 40ge,
> 
> It has, but saturating such links definitely requires hardware support.
> 
> > Does anyone else have experience with higher throughput on
> > their IPsec tunnels, whether or not utilizing aes-gcm?
> 
> If your CPU has AESNI/CLMUL support, depending on your CPU you should
> at least get close to saturating a Gigabit link, even if using a single
> core only.
> 
> If you have multiple tunnels, a NIC with multiple hardware queues can
> share the load to more cores; if not pcrypt is an option.
> 
> With traditional algorithms you should achieve around 200-400Mbit, so
> you should go for AES-GCM if your hardware supports it (make sure to
> have rfc4106-gcm-aesni in /proc/crypto). Alternatively, you might give
> the newer chacha20poly1305 AEAD a try; it provides good performance in
> software, and even better performance with SSE2/AVX2 (since Linux 4.3).
> 
> Regards
> Martin

I did switch to aes-gcm, though I didn't get any performance benefit out
of it. That, plus the fact that the three systems with tunnels between
them (one of which is quite old, a dual-core NetBurst P4 at 2.8GHz; the
other two are VMs on decent hardware, and none of them are under any
load) hit a wall at about 300Mbit/s encrypted but reach 980Mbit/s
unencrypted, leads me to suspect some kind of kernel bottleneck. The two
VMs have AES-NI support, or at least the aesni CPU flag is being passed
through the hypervisor to them.
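
In case anyone wants to check my working, this is roughly how I'm
verifying the AES-NI path on the VMs and what the proposal looks like
(the conn name is just a placeholder from my config):

    # is the aes CPU flag visible inside the guest?
    grep -m1 -o -w aes /proc/cpuinfo

    # is the accelerated GCM implementation registered, as Martin suggests?
    grep rfc4106-gcm-aesni /proc/crypto

    # and is the SA actually using GCM once the tunnel is up?
    ip xfrm state | grep aead

    # ipsec.conf proposal forcing AES-GCM with a 16-byte ICV
    conn site-a
        esp=aes128gcm16-modp2048!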

I do have access to very beefy hardware that I could run bare metal, but
I'd like to use that only as a last resort for testing. At this point I'd
love to have one working 700+ Mbit/s tunnel so I could use it as a
reference while troubleshooting the other tunnels and hardware. I'd
expect a bog-standard VM with no resource contention and no local load
to push at least 500Mbit/s without much tweaking.
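
If it turns out that a single core's worth of crypto is the ceiling, the
pcrypt route Martin mentioned is what I'll try next. I haven't tested it
yet; as far as I understand from the original pcrypt commit message and
the usual guides, instantiating it for the GCM AEAD looks roughly like
this:

    modprobe pcrypt
    # tcrypt is only (ab)used here to instantiate the pcrypt template;
    # the modprobe itself is expected to exit with an error, that's normal
    modprobe tcrypt alg="pcrypt(rfc4106(gcm(aes)))" type=3

    # the parallel instance should now show up with a higher priority
    grep -B1 -A2 pcrypt /proc/crypto

Either way I'll baseline each path the same way (iperf3 -s on one end,
iperf3 -c <peer address> on the other, with and without the tunnel up)
so I'm at least comparing like with like.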

hose

