[strongSwan] High availability failover problem

Martin Willi martin at strongswan.org
Tue Mar 10 14:22:17 CET 2015


Aleksey,

> when I test failover [...], traffic won't flow through standby
> node until rekey on child SA is done

To me this sounds like an ESP sequence number issue. I assume you have
patched your kernel to include our ClusterIP IPsec extensions, as
discussed at [1]. You may find some newer patches in the ha-*
tags/branches at [2].

Then you should check that ClusterIP works as expected, and that the ESP
packets hit both nodes on both the inbound and the outbound path. If
this is the case, ClusterIP can keep ESP sequence numbers in sync on the
passive node.

If all of that works as expected, try to compare the sequence numbers
before and after failover. Linux silently drops packets with an already
used sequence number, but /proc/net/xfrm_stats (requires
CONFIG_XFRM_STATISTICS) has some counters that can help in analyzing why
packets get dropped.
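
As a rough sketch of how the counters could be compared around a
failover: the helper names below (parse_xfrm_stats, changed_counters,
snapshot) are made up for illustration and are not part of strongSwan
or the kernel. It assumes the file has one "Name value" pair per line,
which is how /proc/net/xfrm_stats usually looks.

```python
from pathlib import Path

# Path named above; only present with CONFIG_XFRM_STATISTICS.
XFRM_STATS = Path("/proc/net/xfrm_stats")


def parse_xfrm_stats(text):
    """Parse 'Name <whitespace> value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip anything that is not a plain name/number pair.
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def changed_counters(before, after):
    """Return only the counters whose value differs, with the delta."""
    return {
        name: after[name] - before.get(name, 0)
        for name in after
        if after[name] != before.get(name, 0)
    }


def snapshot(path=XFRM_STATS):
    """Read and parse the current counters (hypothetical helper)."""
    return parse_xfrm_stats(path.read_text())
```

On the standby node one would call snapshot() before the failover,
again after sending some traffic through the failed-over tunnel, and
pass both results to changed_counters(). If the sequence number theory
is right, I would expect a counter such as XfrmInStateSeqError to
increase, but treat that counter name as an assumption and check what
your kernel actually exposes.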

Regards
Martin

[1]https://wiki.strongswan.org/projects/strongswan/wiki/HighAvailability
[2]http://git.strongswan.org/?p=linux-dumm.git;a=summary


