[strongSwan] Tunnels with dynamic IP and another route issue

Fri May 5 08:55:06 CEST 2017

Okay, so I have tried different things now, and I think Charon's source 
routing seems to work better (at least it looks that way after a couple 
of reboots of the gateway). I'm not sure what did it, because I have 
both added an iptables mangle rule for UDP port 500 (IKE) that marks 
traffic for the WAN interface I want to use for IPsec, and changed the 
settings so that Charon ignores all but one routing table and only 
listens on this interface (and the LAN), not the other WAN interfaces. 
I have tried to force it in every possible way to do its route 
lookup/next hop in ONE table only, the one for the right WAN interface.
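
A minimal sketch of the kind of marking rule described above, for anyone 
following along. It assumes the IPsec WAN is the one behind routing table 
WAN3, so that the mark 0x300/0xf00 is picked up by ip rule 123 in the 
listing further down; that mapping, the run helper and the DRY_RUN switch 
are illustrative assumptions, not a copy of the actual rule on the 
gateway. UDP 4500 is included because IKE moves there when NAT traversal 
is in use.

```shell
#!/bin/sh
# Sketch: steer locally generated IKE packets into one WAN's routing table.
# ASSUMPTION: mark 0x300/0xf00 selects table WAN3 (see "ip rule" 123).
# ASSUMPTION: DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"
MARK="${MARK:-0x300/0xf00}"

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

mark_ike() {
    # IKE uses UDP source port 500; with NAT traversal it uses 4500.
    for port in 500 4500; do
        run iptables -t mangle -A OUTPUT -p udp --sport "$port" \
            -j MARK --set-mark "$MARK"
    done
}

mark_ike
```

By default this only prints the two iptables commands; running it with 
DRY_RUN=0 as root would apply them. As far as I understand, the mark only 
influences the route lookup and not the source address Charon chooses, so 
it may not be sufficient on its own.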

However, the more I try, the more it seems to me that strongSwan is not 
good at handling more advanced routing setups. My new issue is that my 
other LAN (I have two local subnets/VLANs) is now also routed out the 
0.0.0.0 IPsec connection, even though it isn't part of the traffic 
selectors (TS). My main LAN is on br0 and my second LAN on br1. I have 
even tried creating a separate passthrough for the second LAN, but 
Charon oddly creates the passthrough route on the WAN IPsec interface 
(instead of br1).

It looks like the biggest problem now is the 0.0.0.0 tunnel: by default 
a default route is added to table 220. If I change Charon's routing 
table to the main table, Charon adds two new routes (which I think 
together cover the whole Internet) and leaves the main default route 
(the multipath nexthop route) in place. However, because the second LAN 
isn't part of the 0.0.0.0 connection's traffic selectors, it won't be 
allowed out this default route and tunnel (and that's how I want it). I 
would like the second LAN to use the ordinary default routes to reach 
the internet, and I'm really lost as to how to handle strongSwan with 
multiple default routes and/or routing tables with policy routing.
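
One way the second LAN could be kept away from the 0.0.0.0 tunnel's 
routes is a pair of policy rules that send its traffic to the main table 
before strongSwan's table is consulted. A minimal sketch, assuming 
strongSwan installs its rule for table 220 at its default priority of 220 
and that priorities 200 and 201 are unused on the gateway (both are 
assumptions, as are the run helper and the DRY_RUN switch). It would not 
help when Charon is configured to install its routes into the main table.

```shell
#!/bin/sh
# Sketch: let the second LAN bypass strongSwan's routing table 220.
# ASSUMPTION: the rule for table 220 has priority 220 (strongSwan default).
# ASSUMPTION: priorities 200 and 201 are free on this gateway.
DRY_RUN="${DRY_RUN:-1}"
LAN2="${LAN2:-10.1.2.0/26}"   # second LAN on br1, from the route listing

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

bypass_table_220() {
    # Traffic coming from LAN2 is resolved in the main table.
    run ip rule add from "$LAN2" lookup main priority 200
    # Traffic going to LAN2 is resolved in the main table as well, so
    # replies do not follow the default route in table 220.
    run ip rule add to "$LAN2" lookup main priority 201
}

bypass_table_220
```

Because 200 sorts after the existing fwmark rules (121 to 124), those 
would still be evaluated first and the per-WAN marking keeps working.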

I know you said that strongSwan isn't adapted to work with multipath 
routes and that sort of routing (load balancing); however, it seems to 
have issues with systems with several routing tables too. Do you have 
any recommendations for how I should go on from here, or is it simply 
that strongSwan isn't going to work reliably in an advanced routing 
setup?

I'm pasting all of my gateway's routing and policy routing setup below, 
in case that is of any help.

# ip route show
90.225.194.1 dev vlan845  scope link
85.24.240.1 dev vlan847  scope link
85.24.244.1 dev vlan846  scope link
10.248.0.x dev ppp1  scope link
10.1.1.0/26 dev br0  proto kernel  scope link  src 10.1.1.1
10.1.2.0/26 dev br1  proto kernel  scope link  src 10.1.2.1
90.225.194.0/24 dev vlan845  proto kernel  scope link  src 90.225.194.x
85.24.244.0/24 dev vlan846  proto kernel  scope link  src 85.24.244.x
85.24.240.0/24 dev vlan847  proto kernel  scope link  src 85.24.240.x
127.0.0.0/8 dev lo  scope link
default
         nexthop via 90.225.194.1  dev vlan845 weight 1
         nexthop via 10.248.0.21  dev ppp1 weight 256
         nexthop via 85.24.240.1  dev vlan847 weight 1
default via 85.24.244.1 dev vlan846  metric 1

# ip rule
0:      from all lookup local
101:    from 90.225.194.x lookup WAN1
102:    from 10.248.0.x lookup WAN2
103:    from 85.24.240.x lookup WAN3
121:    from all fwmark 0x100/0xf00 lookup WAN1
122:    from all fwmark 0x200/0xf00 lookup WAN2
123:    from all fwmark 0x300/0xf00 lookup WAN3
124:    from all fwmark 0x400/0xf00 lookup WAN4
32766:  from all lookup main
32767:  from all lookup default

strongSwan isn't running at the moment; that's why routing table 220 is 
missing above.
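
To compare against the listing above once strongSwan is running, the same 
state plus table 220 and the IPsec policies can be captured. A sketch 
under the assumptions that table 220 is in use and that 10.1.2.10 is some 
host on the second LAN (a made-up address); 192.0.2.1 is a documentation 
address that simply stands in for "somewhere on the internet". The run 
helper and DRY_RUN switch are illustrative as well.

```shell
#!/bin/sh
# Sketch: collect routing state while the tunnel is up.
# ASSUMPTION: 10.1.2.10 is a hypothetical host on the second LAN (br1).
# ASSUMPTION: DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN="${DRY_RUN:-1}"
LAN2_HOST="${LAN2_HOST:-10.1.2.10}"
PROBE="${PROBE:-192.0.2.1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        echo "+ $*"   # show which command produced the output
        "$@"
    fi
}

collect_state() {
    run ip rule show
    run ip route show table 220
    run ip xfrm policy list
    # Which route would a forwarded packet from the second LAN take?
    run ip route get "$PROBE" from "$LAN2_HOST" iif br1
}

collect_state
```

With DRY_RUN=0 on the gateway, the last command shows which device and 
next hop the kernel picks for the second LAN.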

On 2017-05-04 at 14:52, Dusan Ilic wrote:
> Okay, I will try some things out and see if it gets better. If not I 
> will return with some logs :)
> I'm just thinking out loud here regarding Charon's source route 
> selection: because you proposed leaving out the "left" parameter 
> (defaulting to %any, I think) and my router is multihomed, what if I 
> mangle the output packets on UDP port 500 so they go through the right 
> WAN interface's routing table? Will that force Charon's traffic out the 
> right interface too?
> Or maybe if I exclude all routing tables present on the gateway (except 
> the one I want) in strongswan.conf, that should force Charon to do 
> source route lookups in this table only?
>
> I have also made some unrelated observations: I tried running 
> strongSwan with libipsec instead of the kernel modules and noticed two 
> things.
>
> 1. The shunt policy doesn't work anymore, the route for local LAN gets 
> created with dev ipsec0 (instead of br0). Is this a known bug? I had 
> to add a manual route to table 220.
>
> 2. It's easier to route, maintain and so on because all traffic goes 
> in/out on a dedicated interface (ipsec0), so there is no need for 
> iptables policy matching. However, it's noticeably slower (throughput), 
> and when transferring traffic my router almost hits 100% CPU load. Is 
> this normal?
>
> With kernel modules I can reach more than double the throughput (50 vs 
> 20 Mbps), and then the CPU is only at around 50%. What do you think is 
> the bottleneck here for achieving higher throughput? The remote 
> endpoint? With the Android strongSwan client it's even slower than 
> that (tested on WiFi).
> Both sides of the WAN connection in this case have 100 Mbps, so that's 
> ruled out.
>
>
> On 2017-05-03 at 16:23, Noel Kuntze wrote:
>>
>> On 03.05.2017 13:51, Dusan Ilic wrote:
>>> By the way, it seems the order of shunt connections does matter.
>> They don't. XFRM doesn't care about what order any policies are 
>> inserted, only the TS and the priority.
>>
>>> If I put it at the end after all other connections the network gets 
>>> completely cut off... looks like I have to put it directly after the 
>>> 0.0.0.0 connection.
>> Sounds like you have a race condition between charon and the software 
>> that gets your network connection(s) up. Make charon start after that 
>> software is done.
>> I can't tell for certain though, because you don't share the logs.
>>
>>> ---- Noel Kuntze wrote ----
>>>
>>>
>>>
>>> On 02.05.2017 17:41, Dusan Ilic wrote:
>>>> I see, thank you.
>>>>
>>>> Well, I seem to have random issues now with my new configuration.
>>>>
>>>> After restarting strongSwan, sometimes it works and sometimes it 
>>>> doesn't. Very unreliable.
>>>> Sometimes it connects with the right source interface, sometimes 
>>>> I get "sending packet: from 0.0.0.0[500] to 94.x.x.x[500] (1316 
>>>> bytes)", which obviously won't work. Why 0.0.0.0?
>>>> When it connects from the right public WAN IP, sometimes it 
>>>> connects, sometimes it just retransmits a bunch of packets. I never 
>>>> had these problems before, and I'm confused about what has started 
>>>> causing them now.
>>>>
>>> Read your logs and compare them.
>>>>
>>>> *Regarding shunt connections, does it matter in which order they 
>>>> are put in ipsec.conf? Like at the top, or the bottom and so on?*
>>>>
>>> No.
>>>>
>>>> On 2017-05-02 at 09:41, Noel Kuntze wrote:
>>>>> Yes, that's the reason why that happens. No, you need to start 
>>>>> using another subnet.
>>>>>
>>>>> On 02.05.2017 02:02, Dusan Ilic wrote:
>>>>>> I seem to have found the problem; it was on my local endpoint. 
>>>>>> The gateway has default iptables rules in the PREROUTING chain 
>>>>>> dropping traffic that enters any WAN interface destined for a 
>>>>>> LAN subnet, which I understand is normal as long as there isn't 
>>>>>> any IPsec involved :) The exclude rule below solves it.
>>>>>>
>>>>>> iptables -t mangle -I PREROUTING -d 10.1.1.0/26 -i $(nvram get 
>>>>>> wan3_ifname) -m policy --dir in --pol ipsec --proto esp -j ACCEPT
>>>>>>
>>>>>>
>>>>>> Now routing everything over the IPsec tunnel works great, but 
>>>>>> a new issue has arisen instead. My VPN remote access users cannot 
>>>>>> reach the internet anymore (or the local subnet, for that matter) 
>>>>>> when the gateway is routing all traffic over another 
>>>>>> IPsec tunnel, and from the LAN I cannot ping the VPN client 
>>>>>> (Android strongSwan) either. I'm wildly guessing this is because 
>>>>>> my VPN clients are getting IPs from the local subnet 
>>>>>> (rightsourceip=%dhcp), the same subnet that I have to create a 
>>>>>> passthrough connection for. Is this solvable in an easy way, or 
>>>>>> am I forced to put my VPN clients on a separate subnet?
>>>>>>
>>>>>> On 2017-05-01 at 14:57, Noel Kuntze wrote:
>>>>>>> I can't easily help you further. You need to check what happens 
>>>>>>> to the packets and what actually needs to happen.
>>>>>>>
>>>>>>> On 30.04.2017 23:25, Dusan Ilic wrote:
>>>>>>>> I have added the following on the local router
>>>>>>>>
>>>>>>>> iptables -t nat -I POSTROUTING -s 10.1.1.0/26 -o vlan847 -m 
>>>>>>>> policy --dir out --pol ipsec --proto esp -j ACCEPT
>>>>>>>> (before it was iptables -t nat -I POSTROUTING -s 10.1.1.0/26 -d 
>>>>>>>> 192.168.1.0/24 -o vlan847 -m policy --dir out --pol ipsec 
>>>>>>>> --proto esp -j ACCEPT)
>>>>>>>>
>>>>>>>> And on the remote router
>>>>>>>>
>>>>>>>> iptables -I FORWARD -s 10.1.1.0/26 -j ACCEPT
>>>>>>>> iptables -t nat -I POSTROUTING -s 10.1.1.0/26 -j MASQUERADE
>>>>>>>>
>>>>>>>> And now when the tunnel is up, the internet doesn't work at all 
>>>>>>>> (all pings time out); however, I can still reach the remote 
>>>>>>>> subnet 192.168.1.0. What is the best way to troubleshoot whether 
>>>>>>>> the error is on the local gateway or on the remote?
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2017-04-30 at 20:39, 
>>>>>>>> noel.kuntze+strongswan-users-ml at thermi.consulting wrote:
>>>>>>>>> Fix your NAT rules.
>>>>>>>>>
>>>>>>>>> On 30 April 2017 at 12:28:48 CEST, Dusan Ilic 
>>>>>>>>> <dusan at comhem.se> wrote:
>>>>>>>>>
>>>>>>>>>       Okay, so I found info about adding a "passthrough" 
>>>>>>>>>       connection for my local LAN. I have done this now, and 
>>>>>>>>>       when I start the connection the network connection isn't 
>>>>>>>>>       cut off; however, it seems like my internet traffic is 
>>>>>>>>>       still using my local gateway (I browsed to a "check my 
>>>>>>>>>       IP" page). I can however still ping the remote network.
>>>>>>>>>
>>>>>>>>>       Here is my table 220:
>>>>>>>>>
>>>>>>>>>       # ip route show table 220
>>>>>>>>>       10.1.1.0/26 dev br0  proto static  src 10.1.1.1 # LAN passthrough?
>>>>>>>>>       default via 85.24.x.x dev vlan847  proto static  src 10.1.1.1
>>>>>>>>>
>>>>>>>>>       So instead of a route to 192.168.1.0/24, a default route 
>>>>>>>>>       is added, but it looks like it doesn't go through the 
>>>>>>>>>       tunnel... traffic to 192.168.1.0/24 does still get 
>>>>>>>>>       tunneled, though.
>>>>>>>>>
>>>>>>>>>       On 2017-04-30 at 11:59, Dusan Ilic wrote:
>>>>>>>>>
>>>>>>>>>           Hello again,
>>>>>>>>>
>>>>>>>>>           It worked with the hack! Thank you!
>>>>>>>>>
>>>>>>>>>           Last question (hopefully! :P): if I would like to 
>>>>>>>>>           use the remote endpoint to route *all* traffic over 
>>>>>>>>>           the VPN, is the below the correct way? I have 
>>>>>>>>>           changed rightsubnet locally to 0.0.0.0/0 and 
>>>>>>>>>           leftsubnet remotely to 0.0.0.0/0, I have also added 
>>>>>>>>>           NAT on the remote router for the local subnet on the 
>>>>>>>>>           local endpoint, and finally I have added the local 
>>>>>>>>>           subnet to table 220 on the local router. I have also 
>>>>>>>>>           replaced the iptables forward rule on the local 
>>>>>>>>>           endpoint with 0.0.0.0/0 instead of only the remote 
>>>>>>>>>           subnet.
>>>>>>>>>
>>>>>>>>>           However, when I up the connection on the local 
>>>>>>>>>           router, within a couple of seconds my SSH connection 
>>>>>>>>>           stops responding, and I cannot reach the local 
>>>>>>>>>           gateway or the internet any longer. I have to reboot 
>>>>>>>>>           the local router to get access again. Is this 
>>>>>>>>>           familiar to you? What could be happening here?
>>>>>>>>>
>>>>>>>>>           On 2017-04-29 at 18:44, Noel Kuntze wrote:
>>>>>>>>>
>>>>>>>>>               Hello Dusan,
>>>>>>>>>
>>>>>>>>>               On 29.04.2017 18:34, Dusan Ilic wrote:
>>>>>>>>>
>>>>>>>>>                   It works! I found a hidden setting under 
>>>>>>>>>                   Phase 1 in Fortigate where I could add the 
>>>>>>>>>                   local ID. Added its dynamic DNS hostname and 
>>>>>>>>>                   now it connects.
>>>>>>>>>
>>>>>>>>>               Great!
>>>>>>>>>
>>>>>>>>>                   However, I still have issues with another 
>>>>>>>>>                   endpoint I'm testing. My local endpoint has 
>>>>>>>>>                   strongSwan 5.5.1 and the remote endpoint has 
>>>>>>>>>                   4.5.2. Would that present any issues or 
>>>>>>>>>                   incompatibilities? Unfortunately it's not 
>>>>>>>>>                   possible to upgrade the remote endpoint 
>>>>>>>>>                   (strongSwan).
>>>>>>>>>
>>>>>>>>>               Pluto resolves IDs that are FQDNs. I think there 
>>>>>>>>>               was a hack where you add the at-character in 
>>>>>>>>>               front of the FQDN in the ID settings, and that 
>>>>>>>>>               stops it from doing that. Might apply to charon 
>>>>>>>>>               too in such a low version number. Try the hack.
>>>>>>>>>
>>>>>>>>>                   I tried the below, per your suggestion:
>>>>>>>>>
>>>>>>>>>                   left=%local.example
>>>>>>>>>                   leftid=local.example
>>>>>>>>>                   right=%remote.example
>>>>>>>>>                   rightid=remote.example
>>>>>>>>>
>>>>>>>>>                   remote.example : PSK "PSKGOESHERE"
>>>>>>>>>
>>>>>>>>>                   Log when the local side initiates the 
>>>>>>>>>                   connection:
>>>>>>>>>
>>>>>>>>>                   parsed IKE_AUTH response 1 [ N(AUTH_FAILED) ]
>>>>>>>>>                   received AUTHENTICATION_FAILED notify error
>>>>>>>>>
>>>>>>>>>               You need to read the remote logs when the remote 
>>>>>>>>>               side sends you an error message.
>>>>>>>>>
>>>>>>>>>                   Log when the remote side initiates the 
>>>>>>>>>                   connection:
>>>>>>>>>
>>>>>>>>>                   Apr 29 16:32:20 R6250 daemon.info charon: 10[CFG] looking for peer configs matching 85.24.x.x[85.24.x.x]...94.254.x.x[94.254.x.x]
>>>>>>>>>                   Apr 29 16:32:20 R6250 daemon.info charon: 10[CFG] no matching peer config found
>>>>>>>>>
>>>>>>>>>                   It looks like the same issue, the remote 
>>>>>>>>>                   endpoint doesn't send the configured ID?
>>>>>>>>>
>>>>>>>>>               Yes.
>>>>>>>>>
>>>>>>>>>                   And another question: when using dynamic 
>>>>>>>>>                   hostnames instead of IPs as "right", how 
>>>>>>>>>                   often does strongSwan make a new DNS lookup? 
>>>>>>>>>                   How does strongSwan handle the situation 
>>>>>>>>>                   where, let's say, the remote endpoint 
>>>>>>>>>                   suddenly receives a new IP? Or if the local 
>>>>>>>>>                   side receives a new IP during an established 
>>>>>>>>>                   connection?
>>>>>>>>>
>>>>>>>>>               strongSwan does a DNS lookup whenever it tries 
>>>>>>>>>               to select a configuration.
>>>>>>>>>               Well, it depends on whether mobike is used or 
>>>>>>>>>               not and whether the peer whose IP changed can 
>>>>>>>>>               still send any traffic.
>>>>>>>>>
>>>>>>>>>               Mobike and connectivity: IKE_SA and CHILD_SAs 
>>>>>>>>>               are migrated.
>>>>>>>>>               No mobike and connectivity: Don't know. Maybe a 
>>>>>>>>>               new IKE_SA is negotiated, because the one peer 
>>>>>>>>>               knows the local address has vanished (and the 
>>>>>>>>>               CHILD_SAs migrated?).
>>>>>>>>>               No mobike and no connectivity: Timeout, if DPD 
>>>>>>>>>               is used. Otherwise the IKE_SA and CHILD_SAs 
>>>>>>>>>               remain until the remote peer connects again.
>>>>>>>>>               Mobike and no connectivity: Timeout, if DPD is 
>>>>>>>>>               used. Otherwise the IKE_SA and CHILD_SAs remain 
>>>>>>>>>               until the remote peer connects again.
>>>>>>>>>
>>>>>>>>>               Kind regards,
>>>>>>>>>               Noel
>>>>>>>>>
>>>>>>>>>
>>> ------------------------------------------------------------------------
>>>
>>>>>>>>>           Users mailing list Users at lists.strongswan.org 
>>>>>>>>> https://lists.strongswan.org/mailman/listinfo/users
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sent from mobile
>>>
>>
>
>