[strongSwan] Clarifying behaviour around NAT-T and remapping
Rich Lafferty
rich at lafferty.ca
Wed Mar 28 18:06:26 CEST 2018
Hi Tobias and list,
I was able to duplicate my issues around NAT remapping in Azure with StrongSwan at both ends, so now Racoon is completely out of the picture.
I captured some more detailed logs of the situation. In the logs below, stg-vault-zk04 is in AWS and has static NAT in front of it (i.e. 500->500, 4500->4500); stg-vault-zk06 (second log) is in Azure, with a less-static NAT service in front of it (it passes 500 and 4500 in to 500 and 4500 on the host, but outgoing traffic on 500/4500 will sometimes be remapped to ports around 1024).
The problem is that sometimes ESP traffic is sent to the IKE port (and possibly vice versa). I’ve ^marked a few places in the logs where things happen that I think are relevant.
One challenge is that we can’t force the remapping to happen, so we’re at the whim of Azure for trying to reproduce this.
Config on stg-vault-zk06:
connections {
stg-vault-zk04_0 {
include swanctl-ike.conf
local_addrs = 10.0.0.59
remote_addrs = 54.186.244.242
children {
stg-vault-zk04_0 {
include swanctl-ipsec.conf
}
}
}
}
On stg-vault-zk04:
connections {
stg-vault-zk06_0 {
include swanctl-ike.conf
local_addrs = 172.17.128.86
remote_addrs = 13.88.23.150
children {
stg-vault-zk06_0 {
include swanctl-ipsec.conf
}
}
}
}
swanctl-ike.conf contains this on both:
version = 1
local { auth = psk }
remote { auth = psk }
proposals = aes128-sha256-modp3072
encap = yes
dpd_delay = 20
# This avoids interop issue between strongswan and racoon involving
# fragmented IKE messages immediately after racoon restart. Once our whole
# fleet is StrongSwan we may wish to return this to the default of "yes"
# (or not, since our IKE messages are typically small enough to not require
# fragmentation). See SRE-1203.
fragmentation = no
reauth_time = 23h
over_time = 1h
rand_time = 1h
swanctl-ipsec.conf contains this on both:
mode = transport
start_action = trap
esp_proposals = aes128-sha1-modp3072
life_time = 8h
rekey_time = 7h
rand_time = 1h
As for PSKs: I’m confident that the PSKs are configured correctly because these hosts communicate successfully when port remapping doesn’t get in the way.
After running “swanctl -i --child stg-vault-zk06” on stg-vault-zk04:
Mar 27 15:47:35 stg-vault-zk04 charon: 15[CFG] vici initiate 'stg-vault-zk06_0'
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> queueing ISAKMP_VENDOR task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> queueing ISAKMP_CERT_PRE task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> queueing MAIN_MODE task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> queueing ISAKMP_CERT_POST task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> queueing ISAKMP_NATD task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> queueing QUICK_MODE task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> activating new tasks
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> activating ISAKMP_VENDOR task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> activating ISAKMP_CERT_PRE task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> activating MAIN_MODE task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> activating ISAKMP_CERT_POST task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> activating ISAKMP_NATD task
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> sending XAuth vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> sending DPD vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> sending NAT-T (RFC 3947) vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> sending draft-ietf-ipsec-nat-t-ike-02\n vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> initiating Main Mode IKE_SA stg-vault-zk06_0[318] to 13.88.23.150
Mar 27 15:47:35 stg-vault-zk04 charon: 14[IKE] <stg-vault-zk06_0|318> IKE_SA stg-vault-zk06_0[318] state change: CREATED => CONNECTING
Mar 27 15:47:35 stg-vault-zk04 charon: 14[CFG] <stg-vault-zk06_0|318> configured proposals: IKE:AES_CBC_128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_3072
Mar 27 15:47:35 stg-vault-zk04 charon: 14[ENC] <stg-vault-zk06_0|318> generating ID_PROT request 0 [ SA V V V V ]
Mar 27 15:47:35 stg-vault-zk04 charon: 14[NET] <stg-vault-zk06_0|318> sending packet: from 172.17.128.86[500] to 13.88.23.150[500] (160 bytes)
Mar 27 15:47:35 stg-vault-zk04 charon: 07[NET] <stg-vault-zk06_0|318> received packet: from 13.88.23.150[1031] to 172.17.128.86[500] (140 bytes)
^ first non-port-500 packet from zk04's perspective. this says to me that nat port 1031 is mapped to port 500 on stg-vault-zk06
Mar 27 15:47:35 stg-vault-zk04 charon: 07[ENC] <stg-vault-zk06_0|318> parsed ID_PROT response 0 [ SA V V V ]
Mar 27 15:47:35 stg-vault-zk04 charon: 07[IKE] <stg-vault-zk06_0|318> received XAuth vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 07[IKE] <stg-vault-zk06_0|318> received DPD vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 07[IKE] <stg-vault-zk06_0|318> received NAT-T (RFC 3947) vendor ID
Mar 27 15:47:35 stg-vault-zk04 charon: 07[CFG] <stg-vault-zk06_0|318> selecting proposal:
Mar 27 15:47:35 stg-vault-zk04 charon: 07[CFG] <stg-vault-zk06_0|318> proposal matches
Mar 27 15:47:35 stg-vault-zk04 charon: 07[CFG] <stg-vault-zk06_0|318> received proposals: IKE:AES_CBC_128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_3072
Mar 27 15:47:35 stg-vault-zk04 charon: 07[CFG] <stg-vault-zk06_0|318> configured proposals: IKE:AES_CBC_128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_3072
Mar 27 15:47:35 stg-vault-zk04 charon: 07[CFG] <stg-vault-zk06_0|318> selected proposal: IKE:AES_CBC_128/HMAC_SHA2_256_128/PRF_HMAC_SHA2_256/MODP_3072
Mar 27 15:47:35 stg-vault-zk04 charon: 07[IKE] <stg-vault-zk06_0|318> reinitiating already active tasks
Mar 27 15:47:35 stg-vault-zk04 charon: 07[IKE] <stg-vault-zk06_0|318> ISAKMP_VENDOR task
Mar 27 15:47:35 stg-vault-zk04 charon: 07[IKE] <stg-vault-zk06_0|318> MAIN_MODE task
Mar 27 15:47:35 stg-vault-zk04 charon: 07[LIB] <stg-vault-zk06_0|318> size of DH secret exponent: 3071 bits
Mar 27 15:47:35 stg-vault-zk04 charon: 07[ENC] <stg-vault-zk06_0|318> generating ID_PROT request 0 [ KE No NAT-D NAT-D ]
Mar 27 15:47:35 stg-vault-zk04 charon: 07[NET] <stg-vault-zk06_0|318> sending packet: from 172.17.128.86[500] to 13.88.23.150[1031] (524 bytes)
Mar 27 15:47:35 stg-vault-zk04 charon: 16[NET] <stg-vault-zk06_0|318> received packet: from 13.88.23.150[1031] to 172.17.128.86[500] (524 bytes)
Mar 27 15:47:35 stg-vault-zk04 charon: 16[ENC] <stg-vault-zk06_0|318> parsed ID_PROT response 0 [ KE No NAT-D NAT-D ]
Mar 27 15:47:35 stg-vault-zk04 charon: 16[IKE] <stg-vault-zk06_0|318> local host is behind NAT, sending keep alives
Mar 27 15:47:35 stg-vault-zk04 charon: 16[IKE] <stg-vault-zk06_0|318> remote host is behind NAT
Mar 27 15:47:35 stg-vault-zk04 charon: 16[IKE] <stg-vault-zk06_0|318> reinitiating already active tasks
Mar 27 15:47:35 stg-vault-zk04 charon: 16[IKE] <stg-vault-zk06_0|318> ISAKMP_VENDOR task
Mar 27 15:47:35 stg-vault-zk04 charon: 16[IKE] <stg-vault-zk06_0|318> MAIN_MODE task
Mar 27 15:47:35 stg-vault-zk04 charon: 16[ENC] <stg-vault-zk06_0|318> generating ID_PROT request 0 [ ID HASH ]
Mar 27 15:47:35 stg-vault-zk04 charon: 16[NET] <stg-vault-zk06_0|318> sending packet: from 172.17.128.86[4500] to 13.88.23.150[1031] (92 bytes)
^ we shouldn't be seeing traffic from 4500 to (1030->500)???
logs from stg-vault-zk06:
Mar 27 15:47:35 stg-vault-zk06 charon: 05[NET] <80> received packet: from 54.186.244.242[500] to 10.0.0.59[500] (160 bytes)
Mar 27 15:47:35 stg-vault-zk06 charon: 05[CFG] <80> looking for an ike config for 10.0.0.59...54.186.244.242
Mar 27 15:47:35 stg-vault-zk06 charon: 05[CFG] <80> candidate: 10.0.0.59...54.186.244.242, prio 3100
Mar 27 15:47:35 stg-vault-zk06 charon: 05[CFG] <80> found matching ike config: 10.0.0.59...54.186.244.242 with prio 3100
Mar 27 15:47:35 stg-vault-zk06 charon: 05[IKE] <80> 54.186.244.242 is initiating a Main Mode IKE_SA
Mar 27 15:47:35 stg-vault-zk06 charon: 05[NET] <80> sending packet: from 10.0.0.59[500] to 54.186.244.242[500] (140 bytes)
Mar 27 15:47:35 stg-vault-zk06 charon: 10[NET] <80> received packet: from 54.186.244.242[500] to 10.0.0.59[500] (524 bytes)
Mar 27 15:47:35 stg-vault-zk06 charon: 10[CFG] <80> candidate "stg-vault-zk04_0", match: 1/1/3100 (me/other/ike)
Mar 27 15:47:35 stg-vault-zk06 charon: 10[IKE] <80> no shared key found for '10.0.0.59'[10.0.0.59] - '(null)'[54.186.244.242]
^ I don't understand why this is '(null)'
Mar 27 15:47:35 stg-vault-zk06 charon: 10[NET] <80> sending packet: from 10.0.0.59[500] to 54.186.244.242[500] (524 bytes)
Mar 27 15:47:35 stg-vault-zk06 charon: 03[NET] sending packet: from 10.0.0.59[500] to 54.186.244.242[4500] (36 bytes)
^ I don't know why this packet went from 500 to 4500
Mar 27 15:47:35 stg-vault-zk06 charon: 03[NET] received unsupported IKE version 8.14 from 54.186.244.242, sending INVALID_MAJOR_VERSION
Mar 27 15:47:35 stg-vault-zk06 charon: 15[NET] received unencrypted informational: from 54.186.244.242[4500] to 10.0.0.59[500]
^ at this point I believe we're getting ESP-in-UDP packets to 1031->500.
Mar 27 15:47:39 stg-vault-zk06 charon: 03[NET] sending packet: from 10.0.0.59[500] to 54.186.244.242[4500] (36 bytes)
Mar 27 15:47:39 stg-vault-zk06 charon: 03[NET] received unsupported IKE version 8.14 from 54.186.244.242, sending INVALID_MAJOR_VERSION
Mar 27 15:47:39 stg-vault-zk06 charon: 10[NET] received unencrypted informational: from 54.186.244.242[4500] to 10.0.0.59[500]
Mar 27 15:47:46 stg-vault-zk06 charon: 03[NET] sending packet: from 10.0.0.59[500] to 54.186.244.242[4500] (36 bytes)
Mar 27 15:47:46 stg-vault-zk06 charon: 03[NET] received unsupported IKE version 8.14 from 54.186.244.242, sending INVALID_MAJOR_VERSION
Mar 27 15:47:46 stg-vault-zk06 charon: 13[NET] received unencrypted informational: from 54.186.244.242[4500] to 10.0.0.59[500]
Mar 27 15:47:55 stg-vault-zk06 charon: 16[IKE] <80> sending keep alive to 54.186.244.242[500]
Mar 27 15:47:59 stg-vault-zk06 charon: 03[NET] sending packet: from 10.0.0.59[500] to 54.186.244.242[4500] (36 bytes)
Mar 27 15:47:59 stg-vault-zk06 charon: 03[NET] received unsupported IKE version 8.14 from 54.186.244.242, sending INVALID_MAJOR_VERSION
Mar 27 15:47:59 stg-vault-zk06 charon: 07[NET] received unencrypted informational: from 54.186.244.242[4500] to 10.0.0.59[500]
Mar 27 15:48:05 stg-vault-zk06 charon: 14[JOB] <80> deleting half open IKE_SA with 54.186.244.242 after timeout
Mar 27 15:48:23 stg-vault-zk06 charon: 03[NET] sending packet: from 10.0.0.59[500] to 54.186.244.242[4500] (36 bytes)
Mar 27 15:48:23 stg-vault-zk06 charon: 03[NET] received unsupported IKE version 8.14 from 54.186.244.242, sending INVALID_MAJOR_VERSION
Mar 27 15:48:23 stg-vault-zk06 charon: 08[NET] received unencrypted informational: from 54.186.244.242[4500] to 10.0.0.59[500]
Mar 27 15:49:05 stg-vault-zk06 charon: 03[NET] sending packet: from 10.0.0.59[500] to 54.186.244.242[4500] (36 bytes)
Mar 27 15:49:05 stg-vault-zk06 charon: 03[NET] received unsupported IKE version 8.14 from 54.186.244.242, sending INVALID_MAJOR_VERSION
Mar 27 15:49:05 stg-vault-zk06 charon: 12[NET] received unencrypted informational: from 54.186.244.242[4500] to 10.0.0.59[500]
> On Mar 23, 2018, at 1:20 PM, Tobias Brunner <tobias at strongswan.org> wrote:
>
> Hi Rich,
>
>> 1. IKE and ESP SAs are established normally with NAT-T, i.e. 500:4500.
>> 2. NAT remapping occurs within Azure, at which point StrongSwan sees IKE packets come from port 1027 instead of 500. (i.e. instead of 500:500 it’s 500:1027).
>
> And what happens to port 4500? Why would there even be further messages
> from port 500 for existing IKE_SAs?
>
> If a NAT is detected when the SA is established (even if the port is the
> same, the IP changes so a NAT should be detected) racoon should switch
> to port 4500 with the third Main Mode message and then keep sending
> packets from/to that port (there is no reason to send messages from port
> 500 later, unless a new SA is initiated).
>
>> 3. StrongSwan observes that the source port for IKE has changed to 1027, and starts sending both IKE and ESP-in-UDP packets to azure:1027.
>
> Are these DPD messages? Or what IKE messages trigger this? Or is this
> actually triggered by ESP-in-UDP messages via a kernel event? (There is
> one if the kernel notices different UDP ports than configured on the
> installed IPsec SA.)
>
>> First, am I correctly understanding the behaviour of StrongSwan when NAT remapping occurs? Is this expected StrongSwan behaviour?
>
> Check the log for what's actually going on.
>
>> And: If the Azure end was StrongSwan, and ESP-in-UDP packets started arriving at port 500, would StrongSwan process them as ESP?
>
> No, it's not possible to process ESP-in-UDP packets on port 500 as IKE
> messages to that port are not sent with non-ESP marker and thus would
> get processed as ESP messages.
>
> Regards,
> Tobias
More information about the Users
mailing list