[strongSwan] Rekey Collisions

Steve Lee steve.lee at zynstra.com
Tue Jul 29 18:06:51 CEST 2014


Thanks Martin

I checked the configuration settings - in fact they were 

        keylife=20m
        rekeymargin=3m
        rekeyfuzz=100%

As far as I can see this means that rekey event will occur between 14 and 17 minutes. I guess over the course of a day this increases the likelihood of a collision. 
Here are the log entries for the 2 sites for the single second where all the action occurred. The outcome is that we cannot ping through the tunnel.

If you cant see any problem in the log, I'll gather some more data next time it happens. Is there a log level which I should set to get better information?

Regards
Steve



Site 1
Jul 28 10:31:48 zsitegate-01 charon: 02[KNL] creating rekey job for ESP CHILD_SA with SPI cfb451b5 and reqid {1}
Jul 28 10:31:48 zsitegate-01 charon: 02[IKE] establishing CHILD_SA 0000-0129-private-to-0000-0121-private{1}
Jul 28 10:31:48 zsitegate-01 charon: 02[ENC] generating CREATE_CHILD_SA request 16 [ N(REKEY_SA) SA No TSi TSr ]
Jul 28 10:31:48 zsitegate-01 charon: 02[NET] sending packet: from 10.1.41.177[4500] to 212.126.133.206[4500] (348 bytes)
Jul 28 10:31:48 zsitegate-01 charon: 16[NET] received packet: from 212.126.133.206[4500] to 10.1.41.177[4500] (348 bytes)
Jul 28 10:31:48 zsitegate-01 charon: 16[ENC] parsed CREATE_CHILD_SA request 23 [ N(REKEY_SA) SA No TSi TSr ]
Jul 28 10:31:48 zsitegate-01 charon: 16[IKE] CHILD_SA 0000-0129-private-to-0000-0121-private{1} established with SPIs c7ba6228_i c70845f3_o and TS 10.1.41.128/25 === 10.1.33.128/25 
Jul 28 10:31:48 zsitegate-01 charon: 16[IKE] detected CHILD_REKEY collision with CHILD_REKEY
Jul 28 10:31:48 zsitegate-01 charon: 16[ENC] generating CREATE_CHILD_SA response 23 [ SA No TSi TSr ]
Jul 28 10:31:48 zsitegate-01 charon: 16[NET] sending packet: from 10.1.41.177[4500] to 212.126.133.206[4500] (204 bytes)
Jul 28 10:31:48 zsitegate-01 charon: 15[NET] received packet: from 212.126.133.206[4500] to 10.1.41.177[4500] (204 bytes)
Jul 28 10:31:48 zsitegate-01 charon: 15[ENC] parsed CREATE_CHILD_SA response 16 [ SA No TSi TSr ]
Jul 28 10:31:48 zsitegate-01 charon: 15[IKE] CHILD_SA 0000-0129-private-to-0000-0121-private{1} established with SPIs c738e8d4_i cea4e8aa_o and TS 10.1.41.128/25 === 10.1.33.128/25 
Jul 28 10:31:48 zsitegate-01 charon: 15[IKE] CHILD_SA rekey collision lost, deleting rekeyed child
Jul 28 10:31:48 zsitegate-01 charon: 15[IKE] closing CHILD_SA 0000-0129-private-to-0000-0121-private{1} with SPIs c738e8d4_i (0 bytes) cea4e8aa_o (0 bytes) and TS 10.1.41.128/25 === 10.1.33.128/25 c7ba6228
Jul 28 10:31:48 zsitegate-01 charon: 15[IKE] sending DELETE for ESP CHILD_SA with SPI c738e8d4
Jul 28 10:31:48 zsitegate-01 charon: 15[ENC] generating INFORMATIONAL request 17 [ D ]
Jul 28 10:31:48 zsitegate-01 charon: 15[NET] sending packet: from 10.1.41.177[4500] to 212.126.133.206[4500] (76 bytes)
Jul 28 10:31:48 zsitegate-01 charon: 13[NET] received packet: from 212.126.133.206[4500] to 10.1.41.177[4500] (76 bytes)
Jul 28 10:31:48 zsitegate-01 charon: 13[ENC] parsed INFORMATIONAL request 24 [ D ]
Jul 28 10:31:48 zsitegate-01 charon: 13[IKE] received DELETE for ESP CHILD_SA with SPI cc55cd29
Jul 28 10:31:48 zsitegate-01 charon: 13[IKE] closing CHILD_SA 0000-0129-private-to-0000-0121-private{1} with SPIs cfb451b5_i (191384 bytes) cc55cd29_o (215176 bytes) and TS 10.1.41.128/25 === 10.1.33.128/25 
Jul 28 10:31:48 zsitegate-01 charon: 13[IKE] sending DELETE for ESP CHILD_SA with SPI cfb451b5
Jul 28 10:31:48 zsitegate-01 charon: 13[IKE] CHILD_SA closed
Jul 28 10:31:48 zsitegate-01 charon: 13[IKE] detected CHILD_REKEY collision with CHILD_DELETE
Jul 28 10:31:48 zsitegate-01 charon: 13[ENC] generating INFORMATIONAL response 24 [ D ]
Jul 28 10:31:48 zsitegate-01 charon: 13[NET] sending packet: from 10.1.41.177[4500] to 212.126.133.206[4500] (76 bytes)
Jul 28 10:31:48 zsitegate-01 charon: 01[NET] received packet: from 212.126.133.206[4500] to 10.1.41.177[4500] (76 bytes)
Jul 28 10:31:48 zsitegate-01 charon: 01[ENC] parsed INFORMATIONAL response c7ba622817 [ D ]
Jul 28 10:31:48 zsitegate-01 charon: 01[IKE] received DELETE for ESP CHILD_SA with SPI cea4e8aa
Jul 28 10:31:48 zsitegate-01 charon: 01[IKE] CHILD_SA closed
Jul 28 10:31:48 zsitegate-01 vpn: - CN=intersite-customer-0000-0121, OU=HAP Customers 10.1.33.128/25 == 212.126.133.206 -- 10.1.41.177 == 10.1.41.128/25

Site2
Jul 28 10:31:48 zsitegate-00 charon: 06[KNL] creating rekey job for ESP CHILD_SA with SPI cfb451b5 and reqid {1}
Jul 28 10:31:48 zsitegate-00 charon: 06[IKE] establishing CHILD_SA 0000-0121-private-to-0000-0129-private{1}
Jul 28 10:31:48 zsitegate-00 charon: 06[ENC] generating CREATE_CHILD_SA request 23 [ N(REKEY_SA) SA No TSi TSr ]
Jul 28 10:31:48 zsitegate-00 charon: 06[NET] sending packet: from 10.1.33.177[4500] to 85.13.227.242[4500] (348 bytes)
Jul 28 10:31:48 zsitegate-00 charon: 14[NET] received packet: from 85.13.227.242[4500] to 10.1.33.177[4500] (348 bytes)
Jul 28 10:31:48 zsitegate-00 charon: 14[ENC] parsed CREATE_CHILD_SA request 16 [ N(REKEY_SA) SA No TSi TSr ]
Jul 28 10:31:48 zsitegate-00 charon: 14[IKE] CHILD_SA 0000-0121-private-to-0000-0129-private{1} established with SPIs cea4e8aa_i c738e8d4_o and TS 10.1.33.128/25 === 10.1.41.128/25 
Jul 28 10:31:48 zsitegate-00 charon: 14[IKE] detected CHILD_REKEY collision with CHILD_REKEY
Jul 28 10:31:48 zsitegate-00 charon: 14[ENC] generating CREATE_CHILD_SA response 16 [ SA No TSi TSr ]
Jul 28 10:31:48 zsitegate-00 charon: 14[NET] sending packet: from 10.1.33.177[4500] to 85.13.227.242[4500] (204 bytes)
Jul 28 10:31:48 zsitegate-00 charon: 13[NET] received packet: from 85.13.227.242[4500] to 10.1.33.177[4500] (204 bytes)
Jul 28 10:31:48 zsitegate-00 charon: 13[ENC] parsed CREATE_CHILD_SA response 23 [ SA No TSi TSr ]
Jul 28 10:31:48 zsitegate-00 charon: 13[IKE] CHILD_SA 0000-0121-private-to-0000-0129-private{1} established with SPIs c70845f3_i c7ba6228_o and TS 10.1.33.128/25 === 10.1.41.128/25 
Jul 28 10:31:48 zsitegate-00 charon: 13[IKE] CHILD_SA rekey collision won, deleting old child
Jul 28 10:31:48 zsitegate-00 charon: 13[IKE] closing CHILD_SA 0000-0121-private-to-0000-0129-private{1} with SPIs cc55cd29_i (213705 bytes) cfb451b5_o (193578 bytes) and TS 10.1.33.128/25 === 10.1.41.128/25 
Jul 28 10:31:48 zsitegate-00 charon: 13[IKE] sending DELETE for ESP CHILD_SA with SPI cc55cd29
Jul 28 10:31:48 zsitegate-00 charon: 13[ENC] generating INFORMATIONAL request 24 [ D ]
Jul 28 10:31:48 zsitegate-00 charon: 13[NET] sending packet: from 10.1.33.177[4500] to 85.13.227.242[4500] (76 bytes)
Jul 28 10:31:48 zsitegate-00 charon: 01[NET] received packet: from 85.13.227.242[4500] to 10.1.33.177[4500] (76 bytes)
Jul 28 10:31:48 zsitegate-00 charon: 01[ENC] parsed INFORMATIONAL request 17 [ D ]
Jul 28 10:31:48 zsitegate-00 charon: 01[IKE] received DELETE for ESP CHILD_SA with SPI c738e8d4
Jul 28 10:31:48 zsitegate-00 charon: 01[IKE] closing CHILD_SA 0000-0121-private-to-0000-0129-private{1} with SPIs cea4e8aa_i (0 bytes) c738e8d4_o (0 bytes) and TS 10.1.33.128/25 === 10.1.41.128/25 
Jul 28 10:31:48 zsitegate-00 charon: 01[IKE] sending DELETE for ESP CHILD_SA with SPI cea4e8aa
Jul 28 10:31:48 zsitegate-00 charon: 01[IKE] CHILD_SA closed
Jul 28 10:31:48 zsitegate-00 vpn: - CN=intersite-customer-0000-0121, OU=HAP Customers 10.1.41.128/25 == 85.13.227.242 -- 10.1.33.177 == 10.1.33.128/25
Jul 28 10:31:48 zsitegate-00 charon: 01[ENC] generating INFORMATIONAL response 17 [ D ]
Jul 28 10:31:48 zsitegate-00 charon: 01[NET] sending packet: from 10.1.33.177[4500] to 85.13.227.242[4500] (76 bytes)
Jul 28 10:31:48 zsitegate-00 charon: 02[NET] received packet: from 85.13.227.242[4500] to 10.1.33.177[4500] (76 bytes)
Jul 28 10:31:48 zsitegate-00 charon: 02[ENC] parsed INFORMATIONAL response 24 [ D ]
Jul 28 10:31:48 zsitegate-00 charon: 02[IKE] received DELETE for ESP CHILD_SA with SPI cfb451b5
Jul 28 10:31:48 zsitegate-00 charon: 02[IKE] CHILD_SA closed




________________________________________
From: Martin Willi <martin at strongswan.org>
Sent: 29 July 2014 13:33
To: Steve Lee
Cc: users at lists.strongswan.org
Subject: Re: [strongSwan] Rekey Collisions

Hi Steve,

> one or more tunnels seem to stop working

What does that exactly mean? What IKE version are you using?

> After some investigation, it seemed that these coincide with a rekey
> collision where both sides create a rekey jobs at (to the nearest
> second) the same time. When this happens I dont see any specific errors
> in the logs.

If charon detects rekey collisions, it should log that fact. But there
are many different collision scenarios; an excerpt from your log could
certainly help to analyze the issue.

> 1. Why do we keep seeing the collisions, surely the rekeyfuzz would
> make this pretty unlikely or does the way the host were built and/or
> time sync affect the randomness of rekeyfuzz?

Your system time should not have any effect; on most systems charon does
not use the system time anyway to schedule such events.
        keylife=20m
        rekeymargin=3m
        keyingtries=%forever
        rekeyfuzz=100%

With your rekeymargin of 9m and 100% fuzz, collisions should be in fact
be very (very) rare. If this is reproducible, something is seriously
wrong.

For these non-cryptographic operations, charon relies on a getpid() +
time() initialized random() calls. Not sure how your Hypervisor handles
that?

> 2. When we get a collision why dont we see an error and why doesnt it
> retry given the keyingtries parameter?

keyingtries has no effect when handling rekey collisions. I think with
5.1.1 these collisions should be handled properly.

> 3. Is it recommended that only one side should do rekeying (i.e. set
> rekey=no on the other)?

Usually it is not required, as with a sane configuration collisions are
unlikely, and even if they happen should be handled gracefully, at least
between two strongSwan hosts.

Regards
Martin


More information about the Users mailing list