[strongSwan] Dynamic client hangs up randomly.

Gary Smith gary.smith at holdstead.com
Fri Mar 11 17:49:45 CET 2011


> I use the word randomly, but I think it was around midnight this time.
> It happened other times during the day last week.
> 
> This is my home office that had connections to all 3 segments of the 3
> segment ipsec cloud. The cloud itself is working flawlessly now (after
> migrating the final openswan to strongSwan). On the home office when I
> do an ipsec start, the tunnels come up just fine. At some point they
> can no longer talk to the 3 segment vpn servers and it just stops. I'm
> not sure why. Last night it appeared to happen at around midnight, so I
> thought I'd look at that as a possible trigger. The home office is on a
> dynamic IP which hasn't changed in several months (since I logged it
> last -- maybe a year+).
> 
> Here is the dump from the log file where it actually dies:
> 
> Mar  2 00:03:20 charon: 03[KNL] creating rekey job for ESP CHILD_SA
> with SPI ca7282eb and reqid {5}
> Mar  2 00:03:20 charon: 06[IKE] establishing CHILD_SA fre-ben{5}
> Mar  2 00:03:20 charon: 06[IKE] establishing CHILD_SA fre-ben{5}
> Mar  2 00:03:20 charon: 06[ENC] generating CREATE_CHILD_SA request 4 [
> N(REKEY_SA) SA No TSi TSr ]
> ... First sending/retrans happens right after rekey 00:03:20
> Mar  2 00:04:49 hsbenfiw01 charon: 13[IKE] retransmit 5 of request with
> message ID 4
> Mar  2 00:04:49 hsbenfiw01 charon: 13[NET] sending packet: from
> HOMEOFFICE[500] to REMOTENETWORK[500]
> Mar  2 00:06:05 charon: 03[KNL] creating delete job for ESP CHILD_SA
> with SPI ccdb20b0 and reqid {5}
> Mar  2 00:06:05 charon: 12[IKE] giving up after 5 retransmits
> Mar  2 00:06:05 vpn: - ...
> Mar  2 00:06:05 charon: 12[KNL] received netlink error: No such process
> (3)
> Mar  2 00:06:05 charon: 12[KNL] unable to delete SAD entry with SPI
> ccdb20b0
> 
> What's my best course at this time?

I added rekey=no to the connection entries on the static side, and the tunnels now seem to stay up. It looks as though when the static side initiates a rekey, the dynamic side hangs up and does not re-establish the connection. Anyway, it's mostly working now. One of the nodes has still gone down twice since I added rekey=no (by "gone down" I mean it lost the connection and did not re-establish it). I set up a cron job that checks the status of each individual node and issues an "ipsec up node" on the dynamic side, but that's a hack at best.
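
For reference, the cron check is roughly the following. This is only a minimal sketch, not the exact script: it assumes that "ipsec status" prints one line per installed CHILD_SA in the form "name{reqid}:  INSTALLED, ...", and the function names (connections_down, check_and_restart) are made up for illustration. The connection names passed in are whatever is defined in ipsec.conf.

```python
import subprocess


def connections_down(status_text, expected):
    """Return the names in `expected` that have no INSTALLED CHILD_SA.

    Assumes CHILD_SA lines in `ipsec status` output look like
    "fre-ben{5}:  INSTALLED, TUNNEL, ESP SPIs: ..." (an assumption
    about the output format, not a guaranteed interface).
    """
    installed = set()
    for line in status_text.splitlines():
        # Split "name{reqid}: rest" at the first "{"; IKE_SA lines
        # ("name[1]: ESTABLISHED ...") contain no "{" and are skipped.
        name, sep, rest = line.strip().partition("{")
        if sep and "INSTALLED" in rest:
            installed.add(name)
    return [name for name in expected if name not in installed]


def check_and_restart(expected):
    """Run `ipsec status` and issue `ipsec up <name>` for each missing tunnel."""
    result = subprocess.run(["ipsec", "status"], capture_output=True, text=True)
    for name in connections_down(result.stdout, expected):
        # check=False: a failed attempt is simply retried on the next cron run.
        subprocess.run(["ipsec", "up", name], check=False)
```

A script run from cron would end with a call such as check_and_restart(["fre-ben"]), listing every connection that should be up. Parsing the status text is kept separate from running the commands so the parsing can be checked against a saved copy of the output.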

I understand how a failed rekey could kill the connection, but I still don't understand why the dynamic side doesn't try to renegotiate it. Is there a way to make that happen?



