[strongSwan] Throughput on high BDP networks
Michael C. Cambria
mcc at fid4.com
Fri Jun 5 21:14:12 CEST 2015
On 06/04/2015 11:28 AM, jsullivan at opensourcedevel.com wrote:
[deleted]
>>
>> <snip>
>> We appear to be chasing a compound problem, perhaps also involving
>> problems with GRE. As we try to isolate components, one issue we see is
>> TCP window size. For some reason, even though net.core.rmem_max/wmem_max
>> and the tcp_rmem/tcp_wmem maximums are all over 16M, we are not achieving
>> a TCP window size much larger than 4M when we add IPsec to the mix. Not
>> only does this seem to be the case when we are using IPsec alone but, if
>> we add a GRE tunnel (to make it a little easier to do a packet trace),
>> with GRE only we see the TCP window size grow to the full 16M (though we
>> have a problem with packet drops). When we add IPsec (GRE/IPsec), the
>> packet drops magically go away (perhaps due to the lower throughput) but
>> the TCP window size stays stuck at that 4M level.
>>
>> What would cause this, and how do we get the full-sized TCP window inside
>> an IPsec transport stream? Thanks - John
>>
>> <snip>
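
One quick sanity check on the window question: it is worth confirming what
the receiver is actually advertising, not just what the sysctls permit.
Something along these lines should do it (the exact field names vary a bit
between iproute2 versions, and <sender-ip> is a placeholder):

    # On the receiver, while the transfer is running:
    ss -tmi dst <sender-ip>
    # 'wscale:x,y' confirms window scaling was negotiated on the flow;
    # 'rcv_space' and the skmem counters show how far receive autotuning
    # has actually grown the window toward the tcp_rmem maximum.

If scaling was never negotiated (e.g. the connection predates a sysctl
change), the window is capped at 64K regardless of buffer sizes.
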
> I suppose this might imply that the receiving station cannot drain the
> queue faster than 421 Mbps, but I do not see the bottleneck. There are
> no drops in the NIC ring buffers:
> root at lcppeppr-labc02:~# ethtool -S eth5 | grep drop
> rx_dropped: 0
> tx_dropped: 0
> rx_fcoe_dropped: 0
> root at lcppeppr-labc02:~# ethtool -S eth7 | grep drop
> rx_dropped: 0
> tx_dropped: 0
> rx_fcoe_dropped: 0
> There are no drops at the IP level:
>
> Plenty of receive buffer space:
> net.core.netdev_max_backlog = 10000
> net.core.rmem_max = 16782080
> net.core.wmem_max = 16777216
> net.ipv4.tcp_rmem = 8960 89600 16782080
> net.ipv4.tcp_wmem = 4096 65536 16777216
>
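
One assumption worth ruling out about how the test was run: if nuttcp is
being given an explicit window (its -w option, if memory serves), the
socket buffer is set directly and the kernel's receive autotuning is
bypassed, so the window gets pinned near that value no matter what
tcp_rmem allows. Autotuning and window scaling both need to be on:

    # Both should report 1 on sender and receiver
    sysctl net.ipv4.tcp_window_scaling
    sysctl net.ipv4.tcp_moderate_rcvbuf

A 4M ceiling that ignores a 16M tcp_rmem maximum often points at one of
those two rather than at IPsec itself.
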
> CPUs are not overloaded nor software interrupts excessive:
> top - 11:27:02 up 16:58, 1 user, load average: 1.00, 0.56, 0.29
> Tasks: 189 total, 3 running, 186 sleeping, 0 stopped, 0 zombie
> %Cpu0 : 0.1 us, 3.4 sy, 0.0 ni, 96.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu1 : 0.0 us, 2.0 sy, 0.0 ni, 98.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu2 : 0.0 us, 4.1 sy, 0.0 ni, 95.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu3 : 0.1 us, 15.7 sy, 0.0 ni, 84.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu4 : 0.3 us, 12.6 sy, 0.0 ni, 87.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu5 : 0.0 us, 9.3 sy, 0.0 ni, 80.9 id, 0.0 wa, 0.0 hi, 9.9 si, 0.0 st
> %Cpu6 : 0.0 us, 2.5 sy, 0.0 ni, 97.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu7 : 0.0 us, 2.6 sy, 0.0 ni, 97.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu8 : 0.0 us, 29.2 sy, 0.0 ni, 70.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu9 : 0.0 us, 19.8 sy, 0.0 ni, 75.9 id, 0.0 wa, 0.1 hi, 4.2 si, 0.0 st
> %Cpu10 : 0.0 us, 3.1 sy, 0.0 ni, 96.9 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> %Cpu11 : 0.0 us, 3.7 sy, 0.0 ni, 96.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> KiB Mem: 32985376 total, 404108 used, 32581268 free, 33604 buffers
> KiB Swap: 7836604 total, 0 used, 7836604 free, 83868 cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 7545 root 20 0 6280 1492 1288 S 30 0.0 0:14.05 nuttcp
> 49 root 20 0 0 0 0 S 24 0.0 0:51.33 kworker/8:0
> 7523 root 20 0 0 0 0 S 22 0.0 0:37.08 kworker/9:0
> 7526 root 20 0 0 0 0 S 6 0.0 0:10.22 kworker/5:2
> 1441 root 20 0 0 0 0 S 4 0.0 0:13.14 kworker/11:2
> 7527 root 20 0 0 0 0 S 4 0.0 0:07.07 kworker/2:1
> 7458 root 20 0 0 0 0 S 4 0.0 0:06.89 kworker/8:2
> 33 root 20 0 0 0 0 S 4 0.0 1:12.46 ksoftirqd/5
> 7528 root 20 0 0 0 0 S 4 0.0 0:06.40 kworker/10:2
> 7524 root 20 0 0 0 0 S 3 0.0 0:05.57 kworker/1:2
> 7525 root 20 0 0 0 0 R 3 0.0 0:05.61 kworker/0:0
> 6131 root 20 0 0 0 0 S 3 0.0 0:40.66 kworker/7:2
> 7531 root 20 0 0 0 0 S 3 0.0 0:06.51 kworker/8:1
> 7519 root 20 0 0 0 0 R 3 0.0 0:01.50 kworker/6:2
> 89 root 20 0 0 0 0 S 3 0.0 0:16.06 kworker/3:1
> 1972 root 20 0 0 0 0 S 2 0.0 0:22.64 kworker/4:2
> 6828 root 20 0 0 0 0 S 2 0.0 0:03.84 kworker/3:2
> 6047 root 20 0 0 0 0 S 2 0.0 0:23.63 kworker/9:1
> 7123 root 20 0 0 0 0 S 2 0.0 0:03.58 kworker/9:2
> 7300 root 20 0 0 0 0 S 2 0.0 0:03.18 kworker/4:0
> 4632 root 0 -20 15900 4828 1492 S 0 0.0 2:33.04 conntrackd
> 7337 root 20 0 0 0 0 S 0 0.0 0:00.14 kworker/10:0
> 7529 root 20 0 0 0 0 S 0 0.0 0:05.59 kworker/11:0
> 91 root 20 0 0 0 0 S 0 0.0 0:39.70 kworker/5:1
> 4520 root 20 0 4740 1780 1088 S 0 0.0 0:14.17 haveged
> 7221 root 20 0 0 0 0 S 0 0.0 0:00.52 kworker/5:0
> 7112 root 20 0 0 0 0 S 0 0.0 0:00.09 kworker/11:1
> 7543 root 20 0 0 0 0 S 0 0.0 0:00.05 kworker/u24:2
> 1 root 20 0 10660 1704 1564 S 0 0.0 0:02.38 init
> 2 root 20 0 0 0 0 S 0 0.0 0:00.01 kthreadd
> 3 root 20 0 0 0 0 S 0 0.0 0:00.46 ksoftirqd/0
> 5 root 0 -20 0 0 0 S 0 0.0 0:00.00 kworker/0:0H
>
> Where would the bottleneck be? Thanks - John
Just a thought: could the ESP replay window be getting in the way?
IIRC, IKE would need to re-key before the 32-bit sequence number space
wraps. The default for charon.replay_window is 32 packets (not 32 bits),
which would also throttle throughput, unless you already thought of this
and changed ipsec.conf/strongswan.conf. If so, sorry for the noise.
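
If you want to try it, replay_window lives in strongswan.conf (the charon
section), not ipsec.conf. A minimal sketch, with 1024 picked arbitrarily
as something comfortably larger than the reordering depth of a
parallelized crypto path:

    # /etc/strongswan.conf
    charon {
        # Default is 32 packets: ESP packets arriving more than this many
        # sequence numbers behind the highest seen so far are dropped,
        # which shows up as silent loss at high rates.
        replay_window = 1024
    }

The SAs have to be re-established (a re-key, or ipsec restart) before the
new window applies. The sequence-number wrap itself shouldn't be the
limiter here: at ~421 Mbps and ~1400-byte packets that is roughly 37k
packets/s, so exhausting the 32-bit space takes on the order of a day,
well inside normal re-key lifetimes.
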
MikeC