[strongSwan] Performance (latency) in a Hub and Spoke setup

Martin Sand dborn at gmx.net
Mon Jan 1 16:21:56 CET 2018


Thanks Noel and Thomas.

I did a lot of investigation over the weekend, and it seems these 
error messages are specific to traceroute and tracepath.
A post on serverfault explains the background [1], so I will 
not investigate this further.
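
To make the traceroute explanation concrete, here is a minimal Python sketch that splits the logged ICMP error quoted further down in this thread into the outer ICMP header and the embedded original packet. The function names `parse_fields` and `split_icmp_error` are made up for illustration, and treating an embedded TTL of 1 as a traceroute-style probe is only a heuristic, not proof.

```python
import re

# Log line copied from the iptables LOG output quoted later in this thread
# (syslog timestamp and kernel prefix omitted).
LOG_LINE = (
    "IN= OUT=eth0 SRC=210.211.212.213 DST=192.168.2.135 LEN=88 TOS=0x00 "
    "PREC=0xC0 TTL=64 ID=38805 PROTO=ICMP TYPE=11 CODE=0 "
    "[SRC=192.168.2.135 DST=192.168.1.130 LEN=60 TOS=0x00 PREC=0x00 TTL=1 "
    "ID=63979 DF PROTO=TCP SPT=47511 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0 ]"
)


def parse_fields(text):
    """Return the KEY=VALUE pairs of one header as a dict.

    Bare flags such as DF or SYN carry no '=' and are skipped.
    """
    return dict(re.findall(r"(\w+)=(\S*)", text))


def split_icmp_error(line):
    """Split a LOG line into (outer ICMP header, embedded original packet)."""
    match = re.search(r"\[(.*)\]", line)
    if match is None:
        return parse_fields(line), {}
    return parse_fields(line[: match.start()]), parse_fields(match.group(1))


outer, inner = split_icmp_error(LOG_LINE)

# ICMP type 11 is "time exceeded".
is_time_exceeded = outer.get("PROTO") == "ICMP" and outer.get("TYPE") == "11"

# Heuristic (an assumption, not a rule): an embedded packet that arrived with
# TTL 1 was most likely sent with a deliberately small TTL, as traceroute does.
low_ttl_probe = is_time_exceeded and int(inner.get("TTL", "255")) <= 1

print(outer["SRC"], "->", outer["DST"], "time exceeded:", is_time_exceeded)
print("embedded packet:", inner["SRC"], "->", inner["DST"],
      "TTL", inner["TTL"], "DPT", inner["DPT"])
print("looks like a traceroute-style probe:", low_ttl_probe)
```

For the logged line this reports an ICMP type 11 (time exceeded) message about a TCP SYN to port 80 that arrived with TTL 1, which is consistent with a probe sent by `traceroute -T`, whose default destination port is 80.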

So I think I cannot improve the performance any further. It is limited by 
the upload speed of the spoke routers.
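
As a rough back-of-the-envelope check of that conclusion, the following Python sketch compares the time spent on round trips with the time needed to push a page through the upload link. The 10 MBit/s rate and the 39 ms ping time come from the first mail in this thread. The page size and the number of round trips are purely hypothetical values chosen for illustration, and protocol overhead (ESP, TCP/IP headers, slow start) is ignored.

```python
def transfer_time_ms(payload_bytes, rate_mbit_s):
    """Time in ms to send payload_bytes over a link of rate_mbit_s (no overhead)."""
    return payload_bytes * 8 * 1000 / (rate_mbit_s * 1_000_000)


def page_load_estimate_ms(payload_bytes, rate_mbit_s, rtt_ms, round_trips):
    """Very rough estimate: latency cost of the round trips plus transfer time."""
    return round_trips * rtt_ms + transfer_time_ms(payload_bytes, rate_mbit_s)


UPLOAD_MBIT_S = 10    # upload rate reported in the thread
RTT_MS = 39           # average ping time reported in the thread
PAGE_BYTES = 500_000  # hypothetical page size (assumption)
ROUND_TRIPS = 3       # hypothetical number of round trips (assumption)

latency_part = ROUND_TRIPS * RTT_MS
transfer_part = transfer_time_ms(PAGE_BYTES, UPLOAD_MBIT_S)
total = page_load_estimate_ms(PAGE_BYTES, UPLOAD_MBIT_S, RTT_MS, ROUND_TRIPS)

print(f"round trips: {latency_part:.0f} ms")
print(f"transfer:    {transfer_part:.0f} ms")
print(f"total:       {total:.0f} ms")
```

Under these assumptions the transfer time (400 ms for 500 kB at 10 MBit/s) is larger than the time spent on round trips, which fits the picture of the upload link, rather than the ping latency, being the limiting factor.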

Happy New Year and best regards
Martin


[1] https://serverfault.com/questions/623996/how-to-enable-traceroute-in-linux-machine



On 30.12.2017 23:03, Noel Kuntze wrote:
> Hi Martin,
>
> That can be relevant.
>
> That is an ICMP message from the router or recipient 210.211.212.213 to 192.168.2.135 complaining that the TTL [ of the TCP packet from 192.168.2.135 to 192.168.1.130 with the ID 63979 ] reached 0. Under the strong assumption
> that a standard TTL is used (meaning you didn't change it to some low value), that means you have a routing loop somewhere in your network and that the packet in question got caught in it.
>
> TL;DR: You likely got a routing loop. You need to find and fix it.
>
> Kind regards
>
> Noel
>
> On 30.12.2017 22:47, Martin Sand wrote:
>> Hi Noel
>>
>> Thanks for the advice. I installed tcpdump and wireshark and added a rule to log ICMP errors.
>> This is an excerpt from the log file. I assume this line shows something is sent to port 80 but I cannot find the corresponding iptables entry.
>>
>> Dec 30 21:42:11 localhost kernel: [1423944.393321] IN= OUT=eth0 SRC=210.211.212.213 DST=192.168.2.135 LEN=88 TOS=0x00 PREC=0xC0 TTL=64 ID=38805 PROTO=ICMP TYPE=11 CODE=0 [SRC=192.168.2.135 DST=192.168.1.130 LEN=60 TOS=0x00 PREC=0x00 TTL=1 ID=63979 DF PROTO=TCP SPT=47511 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0 ]
>>
>> Best regards
>> Martin
>>
>>
>> On 28.12.2017 01:43, Noel Kuntze wrote:
>>> Hi,
>>>
>>> Looks like your firewall rules on the hub are broken and cause the problems or you need to configure an additional CHILD_SA to tunnel ICMP errors from the hub, because it has no IP in the local TS.
>>> Check both those suspicions.
>>>
>>> Kind regards
>>>
>>> Noel
>>>
>>> On 27.12.2017 23:00, Martin Sand wrote:
>>>> Thanks again Noel.
>>>>
>>>> I have executed `traceroute -T --mtu <destination>` and `mtr -rw <destination>` on machines at both locations.
>>>> I did not do further investigation on the MSS yet since I have this strange packet loss.
>>>> Based on the route, I assume this happens at the hub which is in between the two routers?
>>>> Could this be the root cause I need to further investigate?
>>>>
>>>> Kind regards
>>>> Martin
>>>>
>>>> traceroute -T --mtu pi-frankfurt
>>>> traceroute to pi-frankfurt (192.168.2.135), 30 hops max, 60 byte packets
>>>>    1  router-freiburg (192.168.1.1)  0.263 ms  0.179 ms  0.172 ms
>>>>    2  * * *
>>>>    3  router-frankfurt (192.168.2.1)  41.762 ms  41.182 ms  36.716 ms
>>>>    4  pi-frankfurt (192.168.2.135)  36.693 ms  43.629 ms  37.051 ms
>>>>
>>>> traceroute -T --mtu pi-freiburg
>>>> traceroute to pi-freiburg (192.168.1.130), 30 hops max, 60 byte packets
>>>>    1  router-frankfurt (192.168.2.1)  0.489 ms  0.381 ms  0.287 ms
>>>>    2  * * *
>>>>    3  router-freiburg (192.168.1.1)  38.368 ms  47.673 ms  35.441 ms
>>>>    4  pi-freiburg (192.168.1.130)  39.456 ms  54.566 ms  36.117 ms
>>>>
>>>> mtr -rw pi-frankfurt
>>>> Start: 2017-12-27T22:57:40+0100
>>>> HOST: workstation         Loss%   Snt   Last   Avg  Best  Wrst StDev
>>>>   1.|-- router-freiburg    0.0%    10    0.2   0.2   0.2   0.3   0.0
>>>>   2.|-- ???               100.0    10    0.0   0.0   0.0   0.0   0.0
>>>>   3.|-- router-frankfurt   0.0%    10   33.3  35.5  32.5  42.0   2.7
>>>>   4.|-- pi-frankfurt       0.0%    10   33.5  34.4  32.7  36.7   1.5
>>>>
>>>>
>>>> On 27.12.2017 21:08, Noel Kuntze wrote:
>>>>> Hi,
>>>>>
>>>>> You can test the convergence speed using `traceroute -T --mtu <destination>`, but that only gives you the MTU. You need to manually discover the MSS
>>>>> using `traceroute -T -O mss=<mss> <destination>`.
>>>>>
>>>>> The best way to check if the problem continues is to just run tcpdump/wireshark and check for ICMP Fragmentation needed packets and TCP errors or timeouts.
>>>>>
>>>>> Kind regards
>>>>>
>>>>> Noel
>>>>>
>>>>> On 27.12.2017 17:12, Martin Sand wrote:
>>>>>> Thanks Noel. Sorry, I had to travel to the other location (350 km).
>>>>>>
>>>>>> I adapted the iptables rules. It improved, but I have the impression it only improved a bit.
>>>>>> Is there a way to measure MTU discovery time?
>>>>>>
>>>>>> Kind regards
>>>>>> Martin
>>>>>>
>>>>>>
>>>>>> On 14.12.2017 13:51, Noel Kuntze wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>> VPN internal http requests to a web server of another spoke take some time until the page is rendered.
>>>>>>>> I assume this is due to the latency.
>>>>>>> Nah. It's extremely more likely that the path MTU discovery takes some time (maybe due to some missing/wrong firewall rules on some host(s) in your network topology).
>>>>>>> Try lowering the MTU and MSS of the tunneled traffic[1].
>>>>>>>
>>>>>>> Kind regards
>>>>>>>
>>>>>>> Noel
>>>>>>>
>>>>>>> [1] https://wiki.strongswan.org/projects/strongswan/wiki/ForwardingAndSplitTunneling#MTUMSS-issues
>>>>>>>
>>>>>>> On 14.12.2017 13:41, Martin Sand wrote:
>>>>>>>> Hi all
>>>>>>>>
>>>>>>>> I have a Hub and Spoke setup. Connections are working perfectly fine.
>>>>>>>> Throughput is almost reaching the maximum rate of the upload channel speed, 10 MBit/s.
>>>>>>>>
>>>>>>>> Unfortunately the latency is not fulfilling my objectives. I have an average ping time of 39 ms (see below) when pinging clients on other spokes.
>>>>>>>> VPN internal http requests to a web server of another spoke take some time until the page is rendered.
>>>>>>>> I assume this is due to the latency.
>>>>>>>>
>>>>>>>> Is there any chance to improve the latency? Or is the latency perfectly good?
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>> Martin
>>>>>>>>
>>>>>>>> Hub internet address
>>>>>>>> 64 bytes from vpn.example.com (217.122.5.6): icmp_seq=1 ttl=57 time=15.2 ms
>>>>>>>>
>>>>>>>> Internal address of Hub
>>>>>>>> PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
>>>>>>>> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=62 time=40.4 ms
>>>>>>>>
>>>>>>>> Client on another spoke
>>>>>>>> PING 192.168.1.130 (192.168.1.130) 56(84) bytes of data.
>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=1 ttl=61 time=108 ms
>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=2 ttl=61 time=41.8 ms
>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=3 ttl=61 time=38.0 ms
>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=4 ttl=61 time=35.2 ms
>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=5 ttl=61 time=36.4 ms
>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=6 ttl=61 time=39.1 ms
>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=7 ttl=61 time=38.1 ms
>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=8 ttl=61 time=41.6 ms
>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=9 ttl=61 time=36.0 ms
>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=10 ttl=61 time=36.7 ms
>>>>>>>>
>>>>>>>> --- 192.168.1.130 ping statistics ---
>>>>>>>> 10 packets transmitted, 10 received, 0% packet loss, time 9013ms
>>>>>>>> rtt min/avg/max/mdev = 35.295/45.159/108.281/21.146 ms
>>>>>>>>


