[strongSwan-dev] [PATCH 0/2] Add reference counting to child_sa

Martin Willi martin at strongswan.org
Tue May 18 10:28:55 CEST 2010


Hi Thomas,

Thanks for your detailed report.

> Now I've seen it happening (only twice so far and not reliably
> reproducable) that charon crashes in child_rekey.c:303.

I tried to reproduce the your crash, but it is indeed a little
difficult. The usual rekey collision message flow chart looks like:

        box    xob
   crq    <    >    crq
           \  / 
            \/
            /\
           /  \
   crs    <    >    crs
           \  / 
            \/
            /\
           /  \ 
detect    <    >    detect

where:
<>: Packet send/receive/processing
crq: CHILD_SA_CREATE request
crs: CHILD_SA_CREATE response
drq: DELETE request
detect: Collision detect

This simple collision is handled properly.

According to your log, you're seeing the following:


        box    xob
   crq   <      >    crq
          \    /
           \  /
            \/
            /\
           /  \
          /    \
   crs   <      >    crs
          \    /
           \  /     
            \/ 
            /\ 
            | \
            |  \
            |   >    drq
            |  /
            | /     
            |/
            / 
           /|  
          / |    
         <  |    
           /
          /
    XXX  <

where one of the CHILD_SA_CREATE responses is delayed for a longer time.
When it finally arrives, the CHILD_SA is gone, resulting in the crash
you have seen (XXX).

Our send/receive_delay testing hooks are insufficient to simulate this,
I'll have to add some conditional packet delays to reproduce it.

> I've prepared a quickly hacked patch that extends child_sa with a
> get_ref function and increases this refcount in ike_sa->get_child_sa.
> I'm not sure about any unwanted side effects, or if this is a very good
> solution. Please take a look and let me know

Refcounting is an option, but it is a concept not yet used in the
handling of CHILD_SAs. I'll try to reproduce this condition, maybe we'll
find a simpler fix. I think it should be possible to catch this in the
collide() check.

Best regards
Martin





More information about the Dev mailing list