[strongSwan-dev] [PATCH 0/2] Add reference counting to child_sa
Martin Willi
martin at strongswan.org
Tue May 18 10:28:55 CEST 2010
Hi Thomas,
Thanks for your detailed report.
> Now I've seen it happening (only twice so far and not reliably
> reproducable) that charon crashes in child_rekey.c:303.
I tried to reproduce the your crash, but it is indeed a little
difficult. The usual rekey collision message flow chart looks like:
box xob
crq < > crq
\ /
\/
/\
/ \
crs < > crs
\ /
\/
/\
/ \
detect < > detect
where:
<>: Packet send/receive/processing
crq: CHILD_SA_CREATE request
crs: CHILD_SA_CREATE response
drq: DELETE request
detect: Collision detect
This simple collision is handled properly.
According to your log, you're seeing the following:
box xob
crq < > crq
\ /
\ /
\/
/\
/ \
/ \
crs < > crs
\ /
\ /
\/
/\
| \
| \
| > drq
| /
| /
|/
/
/|
/ |
< |
/
/
XXX <
where one of the CHILD_SA_CREATE responses is delayed for a longer time.
When it finally arrives, the CHILD_SA is gone, resulting in the crash
you have seen (XXX).
Our send/receive_delay testing hooks are insufficient to simulate this,
I'll have to add some conditional packet delays to reproduce it.
> I've prepared a quickly hacked patch that extends child_sa with a
> get_ref function and increases this refcount in ike_sa->get_child_sa.
> I'm not sure about any unwanted side effects, or if this is a very good
> solution. Please take a look and let me know
Refcounting is an option, but it is a concept not yet used in the
handling of CHILD_SAs. I'll try to reproduce this condition, maybe we'll
find a simpler fix. I think it should be possible to catch this in the
collide() check.
Best regards
Martin
More information about the Dev
mailing list