[strongSwan-dev] [PATCH 0/5] Recover IKE_SA reset after successful IKE_SA_INIT

Thomas Egerer thomas.egerer at secunet.com
Sun Jun 6 22:47:56 CEST 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hello Martin, *,



After one of our test scenarios failed repeatedly, we took the time to
investigate the problem.
It was quite a tough nut to crack, and I think this is something
that is best fixed upstream (again I'm using box and xob, where
box is I and xob is R):

         box    xob
   iirq    >
            \
             \
              \
               \
                >    iirs
               /
              /
             /
            /
           <
    iarq 0 >
            \
             \
              \
    iarq 1 >
            \
             \
              \
    iarq 2 >
            \
             \
              \
    iarq 3 >
            \
             \
              \
    iarq 4 >
            \
             \
              \
    iarq 5 >
            \
             \
              \
(reset of IKE_SA on box)
If, after this has happened, traffic destined for the aforementioned
IKE_SA is received, its task_manager gets stuck.
This is because the IKE_SA_INIT and NAT_D tasks were already
successfully exchanged before the IKE_SA was reset to IKE_CREATED.
When the task_manager's initiate function is then called, none of the
tasks is activated, even though there are queued tasks.
If, however, the connection is initiated manually, everything works
again, since stroke causes charon to add a new IKE_SA_INIT task.
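
To make the stuck state easier to picture, here is a minimal toy model in
Python. It is not strongSwan code: the TaskManager class, the STARTERS
table, and the rule that only an IKE_SA_INIT task may start an exchange in
state IKE_CREATED are simplifying assumptions made for illustration.

```python
from collections import deque

# Toy model only; class, method, and table names are hypothetical,
# not strongSwan's actual C API.
IKE_CREATED = "IKE_CREATED"
IKE_ESTABLISHED = "IKE_ESTABLISHED"

# Assumption: in each state only certain queued tasks may start an exchange.
STARTERS = {
    IKE_CREATED: {"IKE_SA_INIT"},
    IKE_ESTABLISHED: {"CHILD_CREATE"},
}


class TaskManager:
    def __init__(self, state=IKE_CREATED):
        self.state = state
        self.queued = deque()
        self.active = []

    def queue(self, task):
        self.queued.append(task)

    def initiate(self):
        """Activate queued tasks if one of them may start an exchange.

        Returns True if tasks were activated, False if nothing happened.
        """
        if not any(task in STARTERS[self.state] for task in self.queued):
            return False  # tasks stay queued: the stuck case
        self.active.extend(self.queued)
        self.queued.clear()
        return True


# IKE_SA after the reset: state is IKE_CREATED, IKE_SA_INIT already consumed.
tm = TaskManager()
tm.queue("CHILD_CREATE")          # acquire triggered by traffic
stuck = not tm.initiate()         # nothing is activated
tm.queue("IKE_SA_INIT")           # what a manual initiation adds
recovered = tm.initiate()         # now the exchange can start
```

In this sketch the queued CHILD_CREATE task only becomes active once an
IKE_SA_INIT task is queued again, which mirrors why a manual initiation
gets things moving.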
As I've mentioned, it took us a while to figure this out, mainly for
two reasons:
a) the checkout/checkin information in ike_sa_manager lacks the detail
   of which IKE_SA (including the unique ID) is checked out,
b) there's no way to see the pending/queued or passive tasks in an
   IKE_SA (actually there is, but not upstream yet).
a) became a problem because we did not realize that we actually had two
instances of the same IKE_SA (the uniqueness policy didn't kick in, since
the IKE_SA that got stuck was still in state IKE_CREATED). So what we
saw was basically an acquire that seemed to vanish when it was queued
into an IKE_SA that we didn't know was there in the first place (reading
stroke output on an 80x25 terminal can be a challenge).
And b) became a problem when we wanted to see where our 'lost'
CHILD_CREATE job was queued and where it was not.
So the patches to come include:
* the helpful debug output for a),
* an additional printf_hook addressing b),
* some optimization for the use of the newly introduced hook in
  stroke_list,
* the actual use of the hook in stroke_list, and last but not least
* my proposal for a solution to the problem mentioned above.

Hope to hear from you

Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkwMCXwACgkQDXd94wpQmdwHfQCfSII5t64yrJ2P23JudKbZM0Bg
DF4AoInwoaaibDlwU2amIgS5Up4Vv3RN
=UWxo
-----END PGP SIGNATURE-----



