[strongSwan] Performance (latency) in a Hub and Spoke setup

Martin Sand dborn at gmx.net
Sat Jan 20 09:51:17 CET 2018


I am at the other location right now.

Where should I capture the traffic - on the hub, the spoke router, the 
spoke HTTP server, or my client?

Best regards
Martin


On 01/03/2018 11:17 PM, Noel Kuntze wrote:
> Hi,
>
> If you used tracepath -T, then the message you posted earlier could indeed be caused by tracepath itself and not indicate the actual problem.
>
> Did you actually test that? What is the upload speed of the router? I strongly doubt the HTTP latency is caused by a throughput problem.
> Could you possibly provide a tcpdump of traffic when the problem occurs?
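>
> For example, something along these lines (a sketch; interface, host and filter are placeholders):
>
> tcpdump -ni eth0 -w problem.pcap 'host 192.168.1.130 and (tcp port 80 or icmp)'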
>
> Kind regards
>
> Noel
>
> On 01.01.2018 16:21, Martin Sand wrote:
>> Thanks Noel and Thomas.
>>
>> I did a lot of investigation over the weekend, and it seems these error messages are traceroute- and tracepath-specific issues.
>> There is a post on serverfault explaining the background [1], so I will not investigate this further.
>>
>> So I think I cannot improve the performance any further; it is limited by the upload speed of the spoke routers.
>>
>> Happy New Year and best regards
>> Martin
>>
>>
>> [1] https://serverfault.com/questions/623996/how-to-enable-traceroute-in-linux-machine
>>
>>
>>
>> On 30.12.2017 23:03, Noel Kuntze wrote:
>>> Hi Martin,
>>>
>>> That can be relevant.
>>>
>>> That is an ICMP message from the router or recipient 210.211.212.213 to 192.168.2.135, complaining that the TTL [ of the TCP packet from 192.168.2.135 to 192.168.1.130 with the ID 63979 ] reached 0. Under the strong assumption
>>> that a standard TTL is used (meaning you didn't change it to some low value), this means there is a routing loop somewhere in your network that the packet in question got into.
>>>
>>> TL;DR: You likely have a routing loop. You need to find and fix it.
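>>>
>>> To locate it, a plain traceroute to the affected destination usually shows the loop as the same hops repeating; on each router along the path you can also ask the kernel for the next hop (a sketch; the destination is an example):
>>>
>>> traceroute -n 192.168.1.130
>>> # on each router along the path:
>>> ip route get 192.168.1.130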
>>>
>>> Kind regards
>>>
>>> Noel
>>>
>>> On 30.12.2017 22:47, Martin Sand wrote:
>>>> Hi Noel
>>>>
>>>> Thanks for the advice. I installed tcpdump and wireshark and added a rule to log ICMP errors.
>>>> Below is an excerpt from the log file. I assume this line shows that something was sent to port 80, but I cannot find the corresponding iptables entry.
>>>>
>>>> Dec 30 21:42:11 localhost kernel: [1423944.393321] IN= OUT=eth0 SRC=210.211.212.213 DST=192.168.2.135 LEN=88 TOS=0x00 PREC=0xC0 TTL=64 ID=38805 PROTO=ICMP TYPE=11 CODE=0 [SRC=192.168.2.135 DST=192.168.1.130 LEN=60 TOS=0x00 PREC=0x00 TTL=1 ID=63979 DF PROTO=TCP SPT=47511 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0 ]
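>>>>
>>>> For reference, rules of this kind produce such entries (a sketch, assuming iptables; the chain and log prefix are illustrative):
>>>>
>>>> # log ICMP time-exceeded errors generated by or forwarded through this host
>>>> iptables -A OUTPUT  -p icmp --icmp-type time-exceeded -j LOG --log-prefix "ICMP-ERR: "
>>>> iptables -A FORWARD -p icmp --icmp-type time-exceeded -j LOG --log-prefix "ICMP-ERR: "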
>>>>
>>>> Best regards
>>>> Martin
>>>>
>>>>
>>>> On 28.12.2017 01:43, Noel Kuntze wrote:
>>>>> Hi,
>>>>>
>>>>> It looks like either your firewall rules on the hub are broken and cause the problems, or you need to configure an additional CHILD_SA to tunnel ICMP errors from the hub, because the hub has no IP in the local TS (traffic selector).
>>>>> Check both of those suspicions.
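>>>>>
>>>>> For the latter, a sketch in ipsec.conf notation (the conn name and all addresses are placeholders; 10.0.1.1 stands for the hub's own inner IP):
>>>>>
>>>>> conn hub-to-frankfurt
>>>>>     # include the hub's own address in the local TS so ICMP errors
>>>>>     # generated by the hub match the policy and get tunneled
>>>>>     leftsubnet=192.168.1.0/24,10.0.1.1/32
>>>>>     rightsubnet=192.168.2.0/24
>>>>>     auto=route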
>>>>>
>>>>> Kind regards
>>>>>
>>>>> Noel
>>>>>
>>>>> On 27.12.2017 23:00, Martin Sand wrote:
>>>>>> Thanks again Noel.
>>>>>>
>>>>>> I have executed `traceroute -T --mtu <destination>` and `mtr -rw <destination>` on machines at both locations.
>>>>>> I have not investigated the MSS further yet because of this strange packet loss.
>>>>>> Based on the route, I assume it happens at the hub, which sits between the two routers?
>>>>>> Could this be the root cause that I need to investigate further?
>>>>>>
>>>>>> Kind regards
>>>>>> Martin
>>>>>>
>>>>>> traceroute -T --mtu pi-frankfurt
>>>>>> traceroute to pi-frankfurt (192.168.2.135), 30 hops max, 60 byte packets
>>>>>>     1  router-freiburg (192.168.1.1)  0.263 ms  0.179 ms  0.172 ms
>>>>>>     2  * * *
>>>>>>     3  router-frankfurt (192.168.2.1)  41.762 ms  41.182 ms  36.716 ms
>>>>>>     4  pi-frankfurt (192.168.2.135)  36.693 ms  43.629 ms  37.051 ms
>>>>>>
>>>>>> traceroute -T --mtu pi-freiburg
>>>>>> traceroute to pi-freiburg (192.168.1.130), 30 hops max, 60 byte packets
>>>>>>     1  router-frankfurt (192.168.2.1)  0.489 ms  0.381 ms  0.287 ms
>>>>>>     2  * * *
>>>>>>     3  router-freiburg (192.168.1.1)  38.368 ms  47.673 ms  35.441 ms
>>>>>>     4  pi-freiburg (192.168.1.130)  39.456 ms  54.566 ms  36.117 ms
>>>>>>
>>>>>> mtr -rw pi-frankfurt
>>>>>> Start: 2017-12-27T22:57:40+0100
>>>>>> HOST: workstation          Loss%   Snt   Last   Avg  Best  Wrst StDev
>>>>>>   1.|-- router-freiburg      0.0%    10    0.2   0.2   0.2   0.3   0.0
>>>>>>   2.|-- ???                100.0%    10    0.0   0.0   0.0   0.0   0.0
>>>>>>   3.|-- router-frankfurt     0.0%    10   33.3  35.5  32.5  42.0   2.7
>>>>>>   4.|-- pi-frankfurt         0.0%    10   33.5  34.4  32.7  36.7   1.5
>>>>>>
>>>>>>
>>>>>> On 27.12.2017 21:08, Noel Kuntze wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> You can test the convergence speed using `traceroute -T --mtu <destination>`, but that only gives you the MTU. You need to manually discover the MSS
>>>>>>> using `traceroute -T -O mss=<mss> <destination>`.
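>>>>>>>
>>>>>>> For example (values are placeholders): probe with `traceroute -T -O mss=1400 192.168.2.135` and lower the value step by step until the probes get through without errors.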
>>>>>>>
>>>>>>> The best way to check whether the problem persists is simply to run tcpdump/wireshark and look for ICMP "Fragmentation needed" packets and TCP errors or timeouts.
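>>>>>>>
>>>>>>> For example (a sketch; replace eth0 with the relevant interface):
>>>>>>>
>>>>>>> # ICMP type 3 / code 4 is "Fragmentation needed"
>>>>>>> tcpdump -ni eth0 'icmp[icmptype] == 3 and icmp[icmpcode] == 4'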
>>>>>>>
>>>>>>> Kind regards
>>>>>>>
>>>>>>> Noel
>>>>>>>
>>>>>>> On 27.12.2017 17:12, Martin Sand wrote:
>>>>>>>> Thanks Noel. Sorry, I had to travel to the other location (350 km).
>>>>>>>>
>>>>>>>> I adapted the iptables rules. It helped, but I have the impression it only improved a little.
>>>>>>>> Is there a way to measure MTU discovery time?
>>>>>>>>
>>>>>>>> Kind regards
>>>>>>>> Martin
>>>>>>>>
>>>>>>>>
>>>>>>>> On 14.12.2017 13:51, Noel Kuntze wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>> VPN-internal HTTP requests to a web server on another spoke take some time until the page is rendered.
>>>>>>>>>> I assume this is due to the latency.
>>>>>>>>> Nah. It's far more likely that the path MTU discovery takes some time (maybe due to some missing/wrong firewall rules on some host(s) in your network topology).
>>>>>>>>> Try lowering the MTU and MSS of the tunneled traffic [1].
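>>>>>>>>>
>>>>>>>>> For example, clamping the MSS on the gateways (a sketch along the lines of [1]; alternatively use --set-mss with a fixed value):
>>>>>>>>>
>>>>>>>>> iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu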
>>>>>>>>>
>>>>>>>>> Kind regards
>>>>>>>>>
>>>>>>>>> Noel
>>>>>>>>>
>>>>>>>>> [1] https://wiki.strongswan.org/projects/strongswan/wiki/ForwardingAndSplitTunneling#MTUMSS-issues
>>>>>>>>>
>>>>>>>>> On 14.12.2017 13:41, Martin Sand wrote:
>>>>>>>>>> Hi all
>>>>>>>>>>
>>>>>>>>>> I have a Hub and Spoke setup. Connections are working perfectly fine.
>>>>>>>>>> Throughput almost reaches the maximum upload speed of 10 Mbit/s.
>>>>>>>>>>
>>>>>>>>>> Unfortunately, the latency does not meet my objectives. I see an average ping time of 39 ms (see below) when pinging clients on other spokes.
>>>>>>>>>> VPN-internal HTTP requests to a web server on another spoke take some time until the page is rendered.
>>>>>>>>>> I assume this is due to the latency.
>>>>>>>>>>
>>>>>>>>>> Is there any way to improve the latency, or is this latency simply to be expected?
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>> Martin
>>>>>>>>>>
>>>>>>>>>> Hub internet address
>>>>>>>>>> 64 bytes from vpn.example.com (217.122.5.6): icmp_seq=1 ttl=57 time=15.2 ms
>>>>>>>>>>
>>>>>>>>>> Internal address of Hub
>>>>>>>>>> PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
>>>>>>>>>> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=62 time=40.4 ms
>>>>>>>>>>
>>>>>>>>>> Client on another spoke
>>>>>>>>>> PING 192.168.1.130 (192.168.1.130) 56(84) bytes of data.
>>>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=1 ttl=61 time=108 ms
>>>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=2 ttl=61 time=41.8 ms
>>>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=3 ttl=61 time=38.0 ms
>>>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=4 ttl=61 time=35.2 ms
>>>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=5 ttl=61 time=36.4 ms
>>>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=6 ttl=61 time=39.1 ms
>>>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=7 ttl=61 time=38.1 ms
>>>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=8 ttl=61 time=41.6 ms
>>>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=9 ttl=61 time=36.0 ms
>>>>>>>>>> 64 bytes from 192.168.1.130: icmp_seq=10 ttl=61 time=36.7 ms
>>>>>>>>>>
>>>>>>>>>> --- 192.168.1.130 ping statistics ---
>>>>>>>>>> 10 packets transmitted, 10 received, 0% packet loss, time 9013ms
>>>>>>>>>> rtt min/avg/max/mdev = 35.295/45.159/108.281/21.146 ms
>>>>>>>>>>
