[strongSwan] Rekey rejects cause issues

maria isabel marifran.isabel0 at gmail.com
Thu Apr 24 09:02:37 CEST 2014


Hello,
  I observe this pattern on my QNX box with strongSwan 4.2.8:

1. A rekey is initiated from the Juniper NE side and is ignored by my box:

Jul 02 09:53:15    3    10     0 H is initiating an IKE_SA

Jul 02 09:53:15    3    10     0 H is initiating an IKE_SA

Jul 02 09:53:15    3    10     0 received proposals inacceptable

Jul 02 09:53:15    3    10     0 received proposals inacceptable
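(For context: "received proposals inacceptable" means the rekey proposal did
not match any proposal configured on our side. A minimal ipsec.conf sketch
of pinning the proposals explicitly so the CREATE_CHILD_SA rekey offers the
same algorithms as the initial exchange; the algorithm names here are
illustrative, not our actual configuration:)

```conf
conn juniper-tunnel
    keyexchange=ikev2
    # Illustrative proposals -- the Juniper side must offer the same
    # algorithms in its rekey as in the initial exchange. The trailing
    # "!" makes the proposal list strict.
    ike=aes128-sha1-modp1024!
    esp=aes128-sha1!
    auto=start
```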

2. The Juniper NE then decides to establish a new connection from scratch:

Jul 03 04:12:06    3    10     0 H is initiating an IKE_SA

Jul 03 04:12:06    3    10     0 H is initiating an IKE_SA

Jul 03 04:12:06    3    10     0 H is initiating an IKE_SA

Jul 03 04:12:06    3    10     0 H is initiating an IKE_SA

Jul 03 04:12:36    3    10     0 H is initiating an IKE_SA

Jul 03 04:12:36    3    10     0 H is initiating an IKE_SA

3. The system is not able to allocate a new SPI. When Juniper initiates the
SA rekey, it sometimes (at random intervals) does not send the clean-up for
the old SAs. This Juniper issue causes the SAD table (maintained in the
fast path/uCode) in my box to consume both slots available for SAD entries
per plane. The next rekey request from Juniper/BTS is then rejected,
because no more entries can be added to the SAD table. So Juniper's failure
to clean up the old SA causes the overall size of the SAD table to grow.

We poll this SAD table at regular intervals to calculate the rekey and
expiry times of the SAs. However, the current framework is not designed to
handle more than one entry per plane, so due to these internal failures it
is not able to respond to messages coming from the IKEv2 stack. This causes
the stack's worker threads to get stuck, as the stack expects a response to
every message it sends to the host. Thereafter, any requests coming from
the peer device (here Juniper) are no longer answered by the stack and the
site goes into a non-recoverable state. During this time we also observed
that Juniper keeps sending IKE_SA_INIT messages towards the BTS. Because
the stack cannot process these messages completely, its table of half-open
connections keeps growing, which leaks memory and eventually consumes the
entire RAM.
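(As far as I can tell, strongSwan releases newer than our 4.2.8 can cap the
number of half-open IKE_SAs, so a flood of IKE_SA_INIT messages cannot grow
the table without bound. A strongswan.conf sketch, assuming these charon
options are available in the release used; the values are illustrative:)

```conf
# strongswan.conf -- sketch only; these options are not available in
# strongSwan 4.2.8 and were added in later releases.
charon {
    # Reject new IKE_SA_INIT requests once this many half-open
    # IKE_SAs exist, instead of letting the table grow unbounded.
    init_limit_half_open = 5

    # Require anti-DoS cookies once this many half-open IKE_SAs exist.
    cookie_threshold = 5
}
```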

What is the possible fix for this issue? We have temporarily increased the
table size to accommodate more SAD entries. How do we make sure clean-up is
initiated by the peer when asymmetric rekeying is used? Should we use
reauthentication instead? We are using only three tunnels.
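(One option we are considering, sketched in ipsec.conf terms with
illustrative lifetimes: force break-before-make reauthentication instead of
in-place rekeying, so the old IKE_SA and its CHILD_SAs are torn down and
replaced, which should also remove the stale SAD entries:)

```conf
conn juniper-tunnel
    keyexchange=ikev2
    # Reauthenticate (break-before-make) rather than rekey in place.
    reauth=yes
    # Illustrative lifetimes; rekeymargin controls how long before
    # expiry the replacement is started.
    ikelifetime=8h
    lifetime=1h
    rekeymargin=9m
    auto=start
```

Note that break-before-make reauthentication implies a short traffic gap
while the replacement SA comes up, which may or may not be acceptable here.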


Regards,

    Maria

