[strongSwan] Throughput on high BDP networks

Sat Jun 6 04:07:54 CEST 2015

> On June 5, 2015 at 3:14 PM "Michael C. Cambria" <mcc at fid4.com> wrote:
>
> On 06/04/2015 11:28 AM, jsullivan at opensourcedevel.com wrote:
>
> [deleted]
>
> >>
> >> <snip>
> >> We appear to be chasing a compound problem, perhaps also involving
> >> problems with GRE. As we try to isolate components, one issue we see is
> >> TCP window size. For some reason, even though rmem_max/wmem_max and the
> >> tcp_rmem/tcp_wmem maximums are over 16M, we are not achieving a TCP
> >> window size much larger than 4M when we add IPsec to the mix. This is
> >> the case when we use IPsec only. If we instead use a GRE tunnel only
> >> (to make it a little easier to do a packet trace), the TCP window size
> >> goes to the full 16M (but we have a problem with packet drops). When we
> >> add IPsec (GRE/IPsec), the packet drops magically go away (perhaps due
> >> to the lower throughput) but the TCP window size stays stuck at that
> >> 4M level.
> >>
> >> What would cause this and how do we get the full sized TCP Window inside
> >> an IPSec transport stream? Thanks - John
> >>
> >> <snip>
> > I suppose this might imply that the receiving station cannot drain the queue
> > faster than 421 Mbps but I do not see the bottleneck. There are no drops in
> > the NIC ring buffers:
> > root@lcppeppr-labc02:~# ethtool -S eth5 | grep drop
> > rx_dropped: 0
> > tx_dropped: 0
> > rx_fcoe_dropped: 0
> > root@lcppeppr-labc02:~# ethtool -S eth7 | grep drop
> > rx_dropped: 0
> > tx_dropped: 0
> > rx_fcoe_dropped: 0
> > There are no drops at the IP level:
> >
> > Plenty of receive buffer space:
> > net.core.netdev_max_backlog = 10000
> > net.core.rmem_max = 16782080
> > net.core.wmem_max = 16777216
> > net.ipv4.tcp_rmem = 8960 89600 16782080
> > net.ipv4.tcp_wmem = 4096 65536 16777216
> >
> > CPUs are not overloaded nor software interrupts excessive:
> > top - 11:27:02 up 16:58, 1 user, load average: 1.00, 0.56, 0.29
> > Tasks: 189 total, 3 running, 186 sleeping, 0 stopped, 0 zombie
> > %Cpu0 : 0.1 us, 3.4 sy, 0.0 ni, 96.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > %Cpu1 : 0.0 us, 2.0 sy, 0.0 ni, 98.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > %Cpu2 : 0.0 us, 4.1 sy, 0.0 ni, 95.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > %Cpu3 : 0.1 us, 15.7 sy, 0.0 ni, 84.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > %Cpu4 : 0.3 us, 12.6 sy, 0.0 ni, 87.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > %Cpu5 : 0.0 us, 9.3 sy, 0.0 ni, 80.9 id, 0.0 wa, 0.0 hi, 9.9 si, 0.0 st
> > %Cpu6 : 0.0 us, 2.5 sy, 0.0 ni, 97.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > %Cpu7 : 0.0 us, 2.6 sy, 0.0 ni, 97.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > %Cpu8 : 0.0 us, 29.2 sy, 0.0 ni, 70.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > %Cpu9 : 0.0 us, 19.8 sy, 0.0 ni, 75.9 id, 0.0 wa, 0.1 hi, 4.2 si, 0.0 st
> > %Cpu10 : 0.0 us, 3.1 sy, 0.0 ni, 96.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > %Cpu11 : 0.0 us, 3.7 sy, 0.0 ni, 96.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> > KiB Mem: 32985376 total, 404108 used, 32581268 free, 33604 buffers
> > KiB Swap: 7836604 total, 0 used, 7836604 free, 83868 cached
> >
> >   PID USER PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
> >  7545 root 20   0  6280 1492 1288 S   30  0.0 0:14.05 nuttcp
> >    49 root 20   0     0    0    0 S   24  0.0 0:51.33 kworker/8:0
> >  7523 root 20   0     0    0    0 S   22  0.0 0:37.08 kworker/9:0
> >  7526 root 20   0     0    0    0 S    6  0.0 0:10.22 kworker/5:2
> >  1441 root 20   0     0    0    0 S    4  0.0 0:13.14 kworker/11:2
> >  7527 root 20   0     0    0    0 S    4  0.0 0:07.07 kworker/2:1
> >  7458 root 20   0     0    0    0 S    4  0.0 0:06.89 kworker/8:2
> >    33 root 20   0     0    0    0 S    4  0.0 1:12.46 ksoftirqd/5
> >  7528 root 20   0     0    0    0 S    4  0.0 0:06.40 kworker/10:2
> >  7524 root 20   0     0    0    0 S    3  0.0 0:05.57 kworker/1:2
> >  7525 root 20   0     0    0    0 R    3  0.0 0:05.61 kworker/0:0
> >  6131 root 20   0     0    0    0 S    3  0.0 0:40.66 kworker/7:2
> >  7531 root 20   0     0    0    0 S    3  0.0 0:06.51 kworker/8:1
> >  7519 root 20   0     0    0    0 R    3  0.0 0:01.50 kworker/6:2
> >    89 root 20   0     0    0    0 S    3  0.0 0:16.06 kworker/3:1
> >  1972 root 20   0     0    0    0 S    2  0.0 0:22.64 kworker/4:2
> >  6828 root 20   0     0    0    0 S    2  0.0 0:03.84 kworker/3:2
> >  6047 root 20   0     0    0    0 S    2  0.0 0:23.63 kworker/9:1
> >  7123 root 20   0     0    0    0 S    2  0.0 0:03.58 kworker/9:2
> >  7300 root 20   0     0    0    0 S    2  0.0 0:03.18 kworker/4:0
> >  4632 root  0 -20 15900 4828 1492 S    0  0.0 2:33.04 conntrackd
> >  7337 root 20   0     0    0    0 S    0  0.0 0:00.14 kworker/10:0
> >  7529 root 20   0     0    0    0 S    0  0.0 0:05.59 kworker/11:0
> >    91 root 20   0     0    0    0 S    0  0.0 0:39.70 kworker/5:1
> >  4520 root 20   0  4740 1780 1088 S    0  0.0 0:14.17 haveged
> >  7221 root 20   0     0    0    0 S    0  0.0 0:00.52 kworker/5:0
> >  7112 root 20   0     0    0    0 S    0  0.0 0:00.09 kworker/11:1
> >  7543 root 20   0     0    0    0 S    0  0.0 0:00.05 kworker/u24:2
> >     1 root 20   0 10660 1704 1564 S    0  0.0 0:02.38 init
> >     2 root 20   0     0    0    0 S    0  0.0 0:00.01 kthreadd
> >     3 root 20   0     0    0    0 S    0  0.0 0:00.46 ksoftirqd/0
> >     5 root  0 -20     0    0    0 S    0  0.0 0:00.00 kworker/0:0H
> >
> > Where would the bottleneck be? Thanks - John
>
> Just a thought: could the ESP replay window be getting in the way?
>
> IIRC, IKE would need to re-key before the 32-bit sequence number space
> wraps. The default for charon.replay_window is 32 packets (not 32 bits),
> which would also throttle, unless you already thought of this and
> changed ipsec.conf/strongswan.conf. If so, sorry for the noise.
>
> MikeC
>
Thanks, Mike. We played with it, trying to disable it, but the value stayed at 32.
Are you saying we should increase it? If so, how do we determine the optimal
value? - John
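
For a sense of scale, here is a rough Python sketch of the window/throughput arithmetic discussed in this thread. It is only back-of-the-envelope reasoning, not a diagnosis. It assumes the 421 Mbps figure is limited purely by the 4M TCP window, which the thread does not establish, and it assumes a 1500-byte packet size, which is hypothetical. The function names are made up for illustration. The packets-in-flight number shows how small 32 packets is next to one window of data; it is not a recommended replay_window value, since the replay window concerns how far out of order packets may arrive, which this sketch does not measure.

```python
# Back-of-the-envelope TCP window arithmetic (a sketch under stated assumptions).

def max_throughput_bps(window_bytes, rtt_s):
    """Upper bound on throughput when one window is sent per round trip."""
    return window_bytes * 8 / rtt_s

def implied_rtt_s(window_bytes, throughput_bps):
    """RTT that would make throughput_bps exactly window-limited."""
    return window_bytes * 8 / throughput_bps

def packets_in_flight(window_bytes, packet_bytes):
    """Number of packets needed to carry one full window (rounded up)."""
    return -(-window_bytes // packet_bytes)

WINDOW_4M = 4 * 1024 * 1024      # window size observed with IPsec
WINDOW_16M = 16 * 1024 * 1024    # window size observed with GRE only
OBSERVED_BPS = 421e6             # throughput quoted in the thread
ASSUMED_PACKET_BYTES = 1500      # hypothetical; not stated in the thread

# Assumption: the observed rate is purely window-limited at 4M.
rtt = implied_rtt_s(WINDOW_4M, OBSERVED_BPS)

print("implied RTT (ms):", round(rtt * 1000, 1))
print("ceiling with 16M window (Mbps):",
      round(max_throughput_bps(WINDOW_16M, rtt) / 1e6))
print("packets per 4M window:",
      packets_in_flight(WINDOW_4M, ASSUMED_PACKET_BYTES))
```

If the real round-trip time is known, passing it to max_throughput_bps in place of the implied value gives a more meaningful ceiling for each window size.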