<div dir="ltr"><div>I am having an issue where multiple child SA's are created for one IKE SA. I have seen this case exist, and both the server and the end point are using the same SA's and everything is ok. Then there are times when my tunnel goes down, and the server and endpoint are both sending with different SA's, and no traffic passed between the servers.</div>
I ran "ipsec statusall" while it was down, and this is what it displayed:

  server1[16]: ESTABLISHED 56 minutes ago, 1.1.1.1[1.1.1.1]...2.2.2.2[2.2.2.2]
  server1[16]: IKEv2 SPIs: 6879194497057724_i* e2db0981a5da0077_r, pre-shared key reauthentication in 100 minutes
  server1[16]: IKE proposal: AES_CBC_256/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_3072
  server1{13}: INSTALLED, TUNNEL, ESP SPIs: c1933ff3_i c28aff87_o
  server1{13}: AES_GCM_16_256, 4894 bytes_i (33 pkts, 740s ago), 66728 bytes_o (676 pkts, 1s ago), rekeying in 31 minutes
  server1{13}: 10.220.0.53/32 === 10.220.0.54/32
  server1{12}: INSTALLED, TUNNEL, ESP SPIs: c2096503_i c5883dc9_o
  server1{12}: AES_GCM_16_256, 19492 bytes_i (126 pkts, 3s ago), 76 bytes_o (1 pkt, 349s ago), rekeying in 34 minutes
  server1{12}: 10.220.0.53/32 === 10.220.0.54/32
Here you can see that the inbound SPI c1933ff3_i was last used 740s ago, while the outbound SPI c28aff87_o was used 1 second ago. So the server (where this status was taken) hasn't seen any inbound packets on the {13} CHILD_SA, but has been sending out of it. Conversely, the {12} SA shows inbound packets 3s ago, while its last outbound packet was 349s ago. In other words, I have two CHILD_SAs under one IKE_SA, and the server is sending on one and receiving on the other. As I understand it, this condition shouldn't exist, and it never occurs in my test environment. The only difference I can pin down between test and production is latency: the test environment has very low latency between server and client, while production can have 100+ ms to the far end.
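Is manually closing the stale CHILD_SA a sane stopgap while I chase this down? Something like the following, using the unique ID in braces from the status output above ({13} here):

  # list CHILD_SAs with their unique IDs (the numbers in braces)
  ipsec statusall server1

  # close only the suspect CHILD_SA; with auto=route the trap
  # policy should renegotiate a replacement on new traffic
  ipsec down server1{13}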
Am I hitting some race condition here? Can anyone shed some light on this for me?

If it helps, these are my settings for the tunnel:
conn server1
    keyexchange=ikev2
    authby=secret
    left=%defaultroute
    leftsubnet=10.220.0.53/32
    right=2.2.2.2
    rightsubnet=10.220.0.54/32
    auto=route
    dpdaction=clear
    ike=aes256-sha256-modp3072!
    esp=aes256gcm16-modp3072!

conn core
    keyexchange=ikev2
    authby=secret
    left=%defaultroute
    leftsubnet=10.220.0.54/32
    right=1.1.1.1
    rightsubnet=10.220.0.53/32
    auto=add
    dpdaction=clear
    rekey=no
    ike=aes256-sha256-modp3072!
    esp=aes256gcm16-modp3072!

As you can see, I have gone as far as telling the remote machine not to initiate rekeying at all (it will still respond to a rekey request) in an attempt to stop this issue from taking down my network.
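If this turns out to be a simultaneous-rekey collision, another thing I may try is spreading the rekey times further apart with rekeymargin/rekeyfuzz, so the two ends are unlikely to start a rekey in the same window. Just a sketch, with values picked arbitrarily:

conn server1
    # (other settings as above)
    # begin rekeying up to 10 minutes before the SA expires
    rekeymargin=10m
    # randomly increase that margin by up to 200%, so the two
    # ends almost never initiate a rekey at the same moment
    rekeyfuzz=200%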