[strongSwan] unable to install policy ... the same policy for reqid XXX exists

Thu Aug 18 15:32:23 CEST 2016

Hi Andreas,

Thanks for the detailed report.  I was able to reproduce the issue.  The
problem is related to the FWD policies in the outbound direction that
are installed since 5.5.0.  More precisely, it is caused by an
incomplete update of the cached data when adding/removing policies
to/from the kernel, combined with a peculiarity of how reqids are
allocated for policies.

In your scenario the Intersite_1 and Intersite2_2 SAs will both install
duplicate FWD policies between 192.168.0.0/16 and 192.168.3.0/24.  If
only one of them is established, just the inbound FWD policy has a reqid
associated with it, to permit decrypted traffic.  The outbound FWD
policy acts like a passthrough policy in case there is e.g. a default
block policy.  When both connections are established the two FWD
policies both have a reqid assigned as each one acts as inbound FWD
policy for one of the SAs.

If you then terminate one of them the corresponding inbound FWD policy
will again act as outbound FWD policy for the other SA and is updated
accordingly in the kernel (no reqid/template assigned anymore).
However, there was no update of the cached reqid in the kernel-netlink
plugin, which is used to prevent a policy from incorrectly being used by
more than one SA.

Re-establishing the connection then again requires an update of the FWD
policy with a reqid.  However, since the cached reqid wasn't cleared
before, it no longer matches the new one, which causes the error message
in the subject.
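
To illustrate the sequence, here is a minimal Python sketch of a
per-policy reqid cache.  It is a toy model and not the kernel-netlink
code: the class, the method names and the reqid values 1 and 3 are made
up for the example, and the clear_on_downgrade flag merely stands in for
the missing cache update.

```python
class PolicyCache:
    """Toy model of a per-policy reqid cache (not strongSwan code)."""

    def __init__(self, clear_on_downgrade):
        # Hypothetical switch: True models a cache that is updated when
        # a policy loses its reqid, False models the incomplete update.
        self.clear_on_downgrade = clear_on_downgrade
        self.cached_reqid = {}  # policy key -> reqid last cached for it

    def install(self, policy, reqid):
        """Install or update a policy; reqid 0 means no reqid/template."""
        cached = self.cached_reqid.get(policy, 0)
        if reqid and cached and cached != reqid:
            raise RuntimeError(f"the same policy for reqid {cached} exists")
        if reqid:
            self.cached_reqid[policy] = reqid
        elif self.clear_on_downgrade:
            self.cached_reqid.pop(policy, None)


def replay(clear_on_downgrade):
    """Replay establish -> terminate -> re-establish for one FWD policy."""
    cache = PolicyCache(clear_on_downgrade)
    fwd = ("192.168.3.0/24", "192.168.0.0/16", "fwd")
    try:
        cache.install(fwd, 1)  # acts as inbound FWD policy of an SA
        cache.install(fwd, 0)  # SA terminated: plain outbound FWD policy
        cache.install(fwd, 3)  # SA re-established with a newly allocated reqid
    except RuntimeError as err:
        return str(err)
    return "ok"


if __name__ == "__main__":
    print("stale cache:  ", replay(clear_on_downgrade=False))
    print("cleared cache:", replay(clear_on_downgrade=True))
```

With the stale cache the third step fails because the reqid remembered
from the first step is still compared against the new one.  Once the
cache entry is dropped in the second step, the same sequence succeeds.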

Since 5.3.0 reqids are generally kept constant for the same policies.
However, this does not apply to the symmetrical FWD policies used in
this scenario.  The Intersite_1 connection has a local traffic selector
(TS) of 192.168.0.0/16 and a remote TS of 192.168.3.0/24, whereas the
local TS for the Intersite2_2 connection is 192.168.3.0/24 and the
remote TS is 192.168.0.0/16.  That is, the selectors are not the same,
and since there is no conflict for any of the policies (even outbound
FWD policies don't directly conflict as they have no reqid assigned),
the two connections yield different reqids.  But this also means that
when one of the connections is terminated its reqid is released and a
new one is allocated when the connection is reestablished, which
triggers the issue in the first place.
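
The allocation behaviour can be sketched roughly like this in Python.
Again this is only a simplified model under my own assumptions: reqids
are keyed by the (local TS, remote TS) pair, reference counted, and
taken from an ever-increasing counter.  The real allocator differs in
its details, and all names here are hypothetical.

```python
import itertools


class ReqidAllocator:
    """Toy reqid allocator keyed by (local TS, remote TS)."""

    def __init__(self):
        self._next = itertools.count(1)  # assumption: simple counter
        self._in_use = {}  # (local_ts, remote_ts) -> [reqid, refcount]

    def alloc(self, local_ts, remote_ts):
        key = (local_ts, remote_ts)
        if key not in self._in_use:
            self._in_use[key] = [next(self._next), 0]
        self._in_use[key][1] += 1
        return self._in_use[key][0]

    def release(self, local_ts, remote_ts):
        key = (local_ts, remote_ts)
        self._in_use[key][1] -= 1
        if self._in_use[key][1] == 0:
            # The last user is gone, so the reqid is forgotten entirely.
            del self._in_use[key]


def demo():
    allocator = ReqidAllocator()
    # Intersite_1: local 192.168.0.0/16, remote 192.168.3.0/24
    first = allocator.alloc("192.168.0.0/16", "192.168.3.0/24")
    # Intersite2_2: the same selectors, but swapped, so a different key
    second = allocator.alloc("192.168.3.0/24", "192.168.0.0/16")
    # Terminate and re-establish Intersite_1
    allocator.release("192.168.0.0/16", "192.168.3.0/24")
    again = allocator.alloc("192.168.0.0/16", "192.168.3.0/24")
    return first, second, again


if __name__ == "__main__":
    print(demo())
```

In this model the swapped selectors map to two different reqids, and
the re-established connection does not get its old reqid back.  That
changed reqid is what then collides with the stale cached value.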

I pushed a fix to the kernel-netlink-reqid-update branch [1].  As a
workaround you could try using the kernel-pfkey plugin, which does not
cache/compare the assigned reqids, or configure unique but static reqids
for the connections.
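
For the static reqid variant, a fragment along these lines might do.  It
assumes the connections are defined in ipsec.conf under the names from
your report; the values 101 and 102 are arbitrary examples, any numbers
that are unique per connection should work.

```
conn Intersite_1
	# ... existing settings unchanged ...
	reqid=101

conn Intersite2_2
	# ... existing settings unchanged ...
	reqid=102
```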

Regards,
Tobias

[1]
https://git.strongswan.org/?p=strongswan.git;a=shortlog;h=refs/heads/kernel-netlink-reqid-update