[strongSwan] Clarifying behaviour around NAT-T and remapping

Rich Lafferty rich at lafferty.ca
Wed Mar 28 18:06:26 CEST 2018


Hi Tobias and list,

I was able to duplicate my issues around NAT remapping in Azure with StrongSwan at both ends, so now Racoon is completely out of the picture.

I captured some more detailed logs of the situation. In the logs below, stg-vault-zk04 is in AWS and has static NAT in front of it (i.e. 500->500, 4500->4500); stg-vault-zk06 (second log) is in Azure, with a less-static NAT service in front of it (it passes 500 and 4500 in to 500 and 4500 on the host, but outgoing traffic on 500/4500 will sometimes be remapped to ports around 1024).

The problem is that sometimes ESP traffic is sent to the IKE port (and possibly vice versa). I’ve ^marked a few places in the logs where things happen that I think are relevant.

One challenge is that we can’t force the remapping to happen, so we’re at the whim of Azure for trying to reproduce this.


Config on stg-vault-zk06: 

connections {
stg-vault-zk04_0 {
  include swanctl-ike.conf
  local_addrs = 10.0.0.59
  remote_addrs = 54.186.244.242
  children {
    stg-vault-zk04_0 {
      include swanctl-ipsec.conf
    }
  }
}
}

On stg-vault-zk04:

connections {
stg-vault-zk06_0 {
  include swanctl-ike.conf
  local_addrs = 172.17.128.86
  remote_addrs = 13.88.23.150
  children {
    stg-vault-zk06_0 {
      include swanctl-ipsec.conf
    }
  }
}
}

swanctl-ike.conf contains this on both:

version = 1
local { auth = psk }
remote { auth = psk }
proposals = aes128-sha256-modp3072
encap = yes
dpd_delay = 20
# This avoids interop issue between strongswan and racoon involving
# fragmented IKE messages immediately after racoon restart. Once our whole
# fleet is StrongSwan we may wish to return this to the default of "yes"
# (or not, since our IKE messages are typically small enough to not require
# fragmentation). See SRE-1203.
fragmentation = no
reauth_time = 23h
over_time = 1h
rand_time = 1h

swanctl-ipsec.conf contains this on both:

mode = transport
start_action = trap
esp_proposals = aes128-sha1-modp3072
life_time = 8h
rekey_time = 7h
rand_time = 1h

As for PSKs: I’m confident that the PSKs are configured correctly because these hosts communicate successfully when port remapping doesn’t get in the way.

After running “swanctl -i --child stg-vault-zk06” on stg-vault-zk04: 

Mar 27 15:47:35 stg-vault-zk04 charon: 15[CFG] vici initiate 'stg-vault-zk06_0'
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> queueing ISAKMP_VENDOR task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> queueing ISAKMP_CERT_PRE task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> queueing MAIN_MODE task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> queueing ISAKMP_CERT_POST task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> queueing ISAKMP_NATD task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> queueing QUICK_MODE task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> activating new tasks
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318>   activating ISAKMP_VENDOR task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318>   activating ISAKMP_CERT_PRE task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318>   activating MAIN_MODE task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318>   activating ISAKMP_CERT_POST task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318>   activating ISAKMP_NATD task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> sending XAuth vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> sending DPD vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> sending NAT-T (RFC 3947) vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> sending draft-ietf-ipsec-nat-t-ike-02\n vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> initiating Main Mode IKE_SA stg-vault-zk06_0[318] to 13.88.23.150
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> IKE_SA stg-vault-zk06_0[318] state change: CREATED => CONNECTING
Mar 27 15:47:35 stg-vault-zk04 charon: 14[CFG] <stg-vault-zk06_0|318> configured proposals: IKE:AES_CBC_128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_3072
Mar 27 15:47:35 stg-vault-zk04 charon: 14[ENC] <stg-vault-zk06_0|318> generating ID_PROT request 0 [ SA V V V V ]
Mar 27 15:47:35 stg-vault-zk04 charon: 14[NET] <stg-vault-zk06_0|318> sending packet: from 172.17.128.86[500] to 13.88.23.150[500] (160 bytes)
Mar 27 15:47:35 stg-vault-zk04 charon: 07[NET] <stg-vault-zk06_0|318> received packet: from 13.88.23.150[1031] to 172.17.128.86[500] (140 bytes)

^ first non-port-500 packet from zk04's perspective. this says to me that nat port 1031 is mapped to port 500 on stg-vault-zk06

Mar 27 15:47:35 stg-vault-zk04 charon: 07[ENC] <stg-vault-zk06_0|318> parsed ID_PROT response 0 [ SA V V V ]
Mar 27 15:47:35 stg-vault-zk04 charon: 07[IKE] <stg-vault-zk06_0|318> received XAuth vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 07[IKE] <stg-vault-zk06_0|318> received DPD vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 07[IKE] <stg-vault-zk06_0|318> received NAT-T (RFC 3947) vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 07[CFG] <stg-vault-zk06_0|318> selecting proposal:
Mar 27 15:47:35 stg-vault-zk04 charon: 07[CFG] <stg-vault-zk06_0|318>   proposal matches
Mar 27 15:47:35 stg-vault-zk04 charon: 07[CFG] <stg-vault-zk06_0|318> received proposals: IKE:AES_CBC_128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_3072
Mar 27 15:47:35 stg-vault-zk04 charon: 07[CFG] <stg-vault-zk06_0|318> configured proposals: IKE:AES_CBC_128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_3072
Mar 27 15:47:35 stg-vault-zk04 charon: 07[CFG] <stg-vault-zk06_0|318> selected proposal: IKE:AES_CBC_128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_3072
Mar 27 15:47:35 stg-vault-zk04 charon: 07[IKE] <stg-vault-zk06_0|318> reinitiating already active tasks
Mar 27 15:47:35 stg-vault-zk04 charon: 07[IKE] <stg-vault-zk06_0|318>   ISAKMP_VENDOR task
Mar 27 15:47:35 stg-vault-zk04 charon: 07[IKE] <stg-vault-zk06_0|318>   MAIN_MODE task
Mar 27 15:47:35 stg-vault-zk04 charon: 07[LIB] <stg-vault-zk06_0|318> size of DH secret exponent: 3071 bits
Mar 27 15:47:35 stg-vault-zk04 charon: 07[ENC] <stg-vault-zk06_0|318> generating ID_PROT request 0 [ KE No NAT-D NAT-D ]
Mar 27 15:47:35 stg-vault-zk04 charon: 07[NET] <stg-vault-zk06_0|318> sending packet: from 172.17.128.86[500] to 13.88.23.150[1031] (524 bytes)
Mar 27 15:47:35 stg-vault-zk04 charon: 16[NET] <stg-vault-zk06_0|318> received packet: from 13.88.23.150[1031] to 172.17.128.86[500] (524 bytes)
Mar 27 15:47:35 stg-vault-zk04 charon: 16[ENC] <stg-vault-zk06_0|318> parsed ID_PROT response 0 [ KE No NAT-D NAT-D ]
Mar 27 15:47:35 stg-vault-zk04 charon: 16[IKE] <stg-vault-zk06_0|318> local host is behind NAT, sending keep alives
Mar 27 15:47:35 stg-vault-zk04 charon: 16[IKE] <stg-vault-zk06_0|318> remote host is behind NAT
Mar 27 15:47:35 stg-vault-zk04 charon: 16[IKE] <stg-vault-zk06_0|318> reinitiating already active tasks
Mar 27 15:47:35 stg-vault-zk04 charon: 16[IKE] <stg-vault-zk06_0|318>   ISAKMP_VENDOR task
Mar 27 15:47:35 stg-vault-zk04 charon: 16[IKE] <stg-vault-zk06_0|318>   MAIN_MODE task
Mar 27 15:47:35 stg-vault-zk04 charon: 16[ENC] <stg-vault-zk06_0|318> generating ID_PROT request 0 [ ID HASH ]
Mar 27 15:47:35 stg-vault-zk04 charon: 16[NET] <stg-vault-zk06_0|318> sending packet: from 172.17.128.86[4500] to 13.88.23.150[1031] (92 bytes)

^ we shouldn't be seeing traffic from 4500 to (1030->500)??? 




logs from stg-vault-zk06:


Mar 27 15:47:35 stg-vault-zk06 charon: 05[NET] <80> received packet: from 54.186.244.242[500] to 10.0.0.59[500] (160 bytes)
Mar 27 15:47:35 stg-vault-zk06 charon: 05[CFG] <80> looking for an ike config for 10.0.0.59...54.186.244.242
Mar 27 15:47:35 stg-vault-zk06 charon: 05[CFG] <80>   candidate: 10.0.0.59...54.186.244.242, prio 3100
Mar 27 15:47:35 stg-vault-zk06 charon: 05[CFG] <80> found matching ike config: 10.0.0.59...54.186.244.242 with prio 3100
Mar 27 15:47:35 stg-vault-zk06 charon: 05[IKE] <80> 54.186.244.242 is initiating a Main Mode IKE_SA
Mar 27 15:47:35 stg-vault-zk06 charon: 05[NET] <80> sending packet: from 10.0.0.59[500] to 54.186.244.242[500] (140 bytes)
Mar 27 15:47:35 stg-vault-zk06 charon: 10[NET] <80> received packet: from 54.186.244.242[500] to 10.0.0.59[500] (524 bytes)
Mar 27 15:47:35 stg-vault-zk06 charon: 10[CFG] <80>   candidate "stg-vault-zk04_0", match: 1/1/3100 (me/other/ike)
Mar 27 15:47:35 stg-vault-zk06 charon: 10[IKE] <80> no shared key found for '10.0.0.59'[10.0.0.59] - '(null)'[54.186.244.242]

^ I don't understand why this is '(null)'

Mar 27 15:47:35 stg-vault-zk06 charon: 10[NET] <80> sending packet: from 10.0.0.59[500] to 54.186.244.242[500] (524 bytes)
Mar 27 15:47:35 stg-vault-zk06 charon: 03[NET] sending packet: from 10.0.0.59[500] to 54.186.244.242[4500] (36 bytes) 

^ I don't know why this packet went from 500 to 4500

Mar 27 15:47:35 stg-vault-zk06 charon: 03[NET] received unsupported IKE version 8.14 from 54.186.244.242, sending INVALID_MAJOR_VERSION
Mar 27 15:47:35 stg-vault-zk06 charon: 15[NET] received unencrypted informational: from 54.186.244.242[4500] to 10.0.0.59[500]

^ at this point I believe we're getting ESP-in-UDP packets to 1031->500.

Mar 27 15:47:39 stg-vault-zk06 charon: 03[NET] sending packet: from 10.0.0.59[500] to 54.186.244.242[4500] (36 bytes)
Mar 27 15:47:39 stg-vault-zk06 charon: 03[NET] received unsupported IKE version 8.14 from 54.186.244.242, sending INVALID_MAJOR_VERSION
Mar 27 15:47:39 stg-vault-zk06 charon: 10[NET] received unencrypted informational: from 54.186.244.242[4500] to 10.0.0.59[500]
Mar 27 15:47:46 stg-vault-zk06 charon: 03[NET] sending packet: from 10.0.0.59[500] to 54.186.244.242[4500] (36 bytes)
Mar 27 15:47:46 stg-vault-zk06 charon: 03[NET] received unsupported IKE version 8.14 from 54.186.244.242, sending INVALID_MAJOR_VERSION
Mar 27 15:47:46 stg-vault-zk06 charon: 13[NET] received unencrypted informational: from 54.186.244.242[4500] to 10.0.0.59[500]
Mar 27 15:47:55 stg-vault-zk06 charon: 16[IKE] <80> sending keep alive to 54.186.244.242[500]
Mar 27 15:47:59 stg-vault-zk06 charon: 03[NET] sending packet: from 10.0.0.59[500] to 54.186.244.242[4500] (36 bytes)
Mar 27 15:47:59 stg-vault-zk06 charon: 03[NET] received unsupported IKE version 8.14 from 54.186.244.242, sending INVALID_MAJOR_VERSION
Mar 27 15:47:59 stg-vault-zk06 charon: 07[NET] received unencrypted informational: from 54.186.244.242[4500] to 10.0.0.59[500]
Mar 27 15:48:05 stg-vault-zk06 charon: 14[JOB] <80> deleting half open IKE_SA with 54.186.244.242 after timeout
Mar 27 15:48:23 stg-vault-zk06 charon: 03[NET] sending packet: from 10.0.0.59[500] to 54.186.244.242[4500] (36 bytes)
Mar 27 15:48:23 stg-vault-zk06 charon: 03[NET] received unsupported IKE version 8.14 from 54.186.244.242, sending INVALID_MAJOR_VERSION
Mar 27 15:48:23 stg-vault-zk06 charon: 08[NET] received unencrypted informational: from 54.186.244.242[4500] to 10.0.0.59[500]
Mar 27 15:49:05 stg-vault-zk06 charon: 03[NET] sending packet: from 10.0.0.59[500] to 54.186.244.242[4500] (36 bytes)
Mar 27 15:49:05 stg-vault-zk06 charon: 03[NET] received unsupported IKE version 8.14 from 54.186.244.242, sending INVALID_MAJOR_VERSION
Mar 27 15:49:05 stg-vault-zk06 charon: 12[NET] received unencrypted informational: from 54.186.244.242[4500] to 10.0.0.59[500]




> On Mar 23, 2018, at 1:20 PM, Tobias Brunner <tobias at strongswan.org> wrote:
> 
> Hi Rich,
> 
>> 1. IKE and ESP SAs are established normally with NAT-T, i.e. 500:4500.
>> 2. NAT remapping occurs within Azure, at which point StrongSwan sees IKE packets come from port 1027 instead of 500. (i.e. instead of 500:500 it’s 500:1027).
> 
> And what happens to port 4500?  Why would there even be further messages
> from port 500 for existing IKE_SAs?
> 
> If a NAT is detected when the SA is established (even if the port is the
> same, the IP changes so a NAT should be detected) racoon should switch
> to port 4500 with the third Main Mode message and then keep sending
> packets from/to that port (there is no reason to send messages from port
> 500 later, unless a new SA is initiated).
> 
>> 3. StrongSwan observes that the source port for IKE has changed to 1027, and starts sending both IKE and ESP-in-UDP packets to azure:1027.
> 
> Are these DPD messages?  Or what IKE messages trigger this?  Or is this
> actually triggered by ESP-in-UDP messages via a kernel event?  (There is
> one if the kernel notices different UDP ports than configured on the
> installed IPsec SA.)
> 
>> First, am I correctly understanding the behaviour of StrongSwan when NAT remapping occurs? Is this expected StrongSwan behaviour?
> 
> Check the log for what's actually going on.
> 
>> And: If the Azure end was StrongSwan, and ESP-in-UDP packets started arriving at port 500, would StrongSwan process them as ESP?
> 
> No, it's not possible to process ESP-in-UDP packets on port 500 as IKE
> messages to that port are not sent with non-ESP marker and thus would
> get processed as ESP messages.
> 
> Regards,
> Tobias



More information about the Users mailing list