[strongSwan-dev] reqid handling

Wed Sep 3 16:20:54 CEST 2014

Hi,

I recently had a discussion on #strongswan how SPD/SA reqid is used for
various lookups inside strongSwan code. This seems to have few rather
unpleasant implications - especially for my dmvpn work.

Technically, in kernel the reqid is specified in SPD, and used to
filter which SA is selected. This means that it's perfectly ok
for multiple SPDs to have same reqid and share SAs. It is also not
reverse mappable as multiple SAs can have same reqid but there can be
still unique or non-unique mapping back to SPDs which may use the SA.

Now, it seems that the strongSwan internally assumes that reqid
uniquely identifies SPD and an IKE_SA. While this more or less holds
true for all current ways one can define connection entries, it is not
true with wildcard transport mode SPDs. It might be that the reqid is
also used to uniquely identify a CHILD_SA which is not true during
rekeying (though, it might be that SPI or additional filtering is used
in all code paths - I did not thoroughly check this).

The other problem is that IKE_SA lookup by reqid is O(n) operation.
This means that rekeying, SA deletion, SA address update (NAT binding
change), and various other operations can are slow with large amount of
SAs. Since this affects rekeying, the overall system algorithmic
complexity is O(n). Where n is num-ike-sa + num-child-sa. This is bad.

I'm wondering if it would be feasibly to delete
ike_sa_manager->checkout_by_id method, and replace it with lookups that
are fast and known to be unique. This probably means one or two
additional hash tables.

There seems to be about eight places where it is checkout_by_id() is
used. Most commonly to lookup CHILD_SA based on kernel notification.
This should probably be replaced by hash based SA
dst+dstport+protocol+spi. Most other code paths seem to be already
enumerating the IKE_SA/CHILD_SA and just passing the matching SAs reqid
to async job to look it up again. That could be replaced by having
internal, real 'unique id'.

The last place not matching above criteria seems to be inactivity_job,
which could be rewritten to be IKE_SA based. It would probably workout
nicely and reduce the amount of inactivity_jobs.

Does this sounds like a feasible plan? Any alternative ideas?

Thanks for considering,
Timo