[strongSwan] Overriding DF on XFRM interfaces

John Marrett johnf at zioncluster.ca
Fri Dec 3 15:35:34 CET 2021


I am working on a VPN solution connecting some appliances on two
different networks. I'm using an x86 OpenWrt router with strongSwan
5.9.2 and kernel 5.4.154. The systems I am connecting exhibit
non-compliant TCP MSS behaviour: for unknown reasons, they ignore the
MSS advertised by their peers and send oversized packets. They also
ignore ICMP unreachable messages indicating the path MTU. I have
confirmed that these ICMP messages are not blocked; they have been
captured directly on the system sending the problematic traffic. I do
not have control over the appliances and need to solve the issue at
the network level.

I'm using a modern IKEv2 / XFRM based configuration for this VPN. I
would like to ignore the DF bit and fragment traffic passing through
the VPN tunnel. Whether this fragmentation occurs before or after
encapsulation does not matter to me.

If I were using a GRE tunnel I could use the ignore-df option [1],
but there doesn't appear to be an equivalent for an xfrm interface.

I have managed to "solve" my problem, though I do not understand the
solution or how it works. If I create the following iptables rules to
adjust the MSS on traffic traversing the xfrm interface:

iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -o xfrm0 \
    -j TCPMSS --set-mss 1240
iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -i xfrm0 \
    -j TCPMSS --set-mss 1240

Then, in addition to the expected modification of the mss field, my
TCP traffic will be fragmented, ignoring the DF bit.
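
As a rough check on the value used in the rules above, an MSS of 1240
corresponds to 1280-byte IPv4 packets if the IPv4 and TCP headers are
each 20 bytes with no options. That header size is an assumption, not
something the captures confirm, and the helper name
packet_size_for_mss is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: IPv4 packet size for a given TCP MSS.
# Assumes a 20-byte IPv4 header and a 20-byte TCP header (no options).
packet_size_for_mss() {
    echo $(( $1 + 20 + 20 ))
}

# MSS value taken from the iptables rules above.
packet_size_for_mss 1240
```

If the connection carries TCP options such as timestamps, the packets
on the wire would be correspondingly larger than this figure.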

Here's an excerpt of traffic on ingress to the router:

09:23:56.103022 IP 10.1.34.10.5060 > 10.1.61.20.25578: Flags [P.], seq 883:1906, ack 1760, win 260, length 1023
09:23:56.119864 IP 10.1.61.20.25578 > 10.1.34.10.5060: Flags [.], ack 1906, win 501, length 0
09:24:01.448960 IP 10.1.34.10.5060 > 10.1.61.20.25578: Flags [P.], seq 1906:3271, ack 1760, win 260, length 1365
09:24:01.467771 IP 10.1.61.20.25578 > 10.1.34.10.5060: Flags [.], ack 3148, win 501, length 0
09:24:01.467810 IP 10.1.61.20.25578 > 10.1.34.10.5060: Flags [.], ack 3271, win 501, length 0

And on egress on the xfrm interface (in addition to being sent over
the VPN connection, the traffic is also being NATed by the VPN router):

09:23:56.103150 IP 10.2.30.1.5060 > 10.2.2.6.25578: Flags [P.], seq 881:1902, ack 1750, win 260, length 1021
09:23:56.119828 IP 10.2.2.6.25578 > 10.2.30.1.5060: Flags [.], ack 1902, win 501, length 0
09:24:01.449067 IP 10.2.30.1.5060 > 10.2.2.6.25578: Flags [.], seq 1902:3142, ack 1750, win 260, length 1240
09:24:01.449135 IP 10.2.30.1.5060 > 10.2.2.6.25578: Flags [P.], seq 3142:3265, ack 1750, win 260, length 123
09:24:01.467724 IP 10.2.2.6.25578 > 10.2.30.1.5060: Flags [.], ack 3142, win 501, length 0
09:24:01.467725 IP 10.2.2.6.25578 > 10.2.30.1.5060: Flags [.], ack 3265, win 501, length 0

The packet with length 1365 has been split into a packet of 1240 bytes
and a second of 123 bytes.
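
The split can be reproduced arithmetically from the egress capture,
where the data covers seq 1902:3265 (1363 bytes) and each resulting
packet carries its own sequence range. The following is only a sketch
of that arithmetic, not of what the kernel does internally, and the
function name split_segment is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: split the sequence range START:END into pieces
# of at most MSS bytes, printing one "from:to length" line per piece.
split_segment() {
    start=$1
    end=$2
    mss=$3
    while [ "$start" -lt "$end" ]; do
        next=$(( start + mss ))
        # The last piece is shorter when the range is not a multiple of MSS.
        if [ "$next" -gt "$end" ]; then
            next=$end
        fi
        echo "$start:$next $(( next - start ))"
        start=$next
    done
}

# Egress values from the capture above, with the MSS from the iptables rules.
split_segment 1902 3265 1240
```

The two lines printed match the sequence ranges and lengths of the two
data packets seen on the xfrm interface.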

Without these rules I see the expected behaviour: the packets are
dropped and ICMP unreachable messages are sent indicating the path
MTU.

Is anyone able to explain why, in addition to adjusting the MSS, this
mangle configuration is allowing fragmentation ignoring the DF bit?
While the solution is working as I need it to, I'm concerned that it
may be extremely fragile.

Is there a better way to solve this problem?

Thanks in advance for any help you can offer,

-JohnF

[1] https://man7.org/linux/man-pages/man8/ip-tunnel.8.html

