I am trying to set up Openswan (2.6.32) on CentOS 6.5 (Final) to connect to a remote VPC gateway on Amazon's cloud. I got the tunnel up. However, only traffic to and from the last IP range defined in leftsubnets is routed. The first one works for a brief second (maybe until the second tunnel comes up), then stops routing. Below is my configuration.

conn aws-vpc
    leftsubnets={10.43.4.0/24 10.43.6.0/24}
    rightsubnet=10.43.7.0/24
    auto=start
    left=206.191.2.xxx
    right=72.21.209.xxx
    rightid=72.21.209.xxx
    leftid=206.191.2.xxx
    leftsourceip=10.43.6.128
    authby=secret
    ike=aes128-sha1;modp1024
    phase2=esp
    phase2alg=aes128-sha1;modp1024
    aggrmode=no
    ikelifetime=8h
    salifetime=1h
    dpddelay=10
    dpdtimeout=40
    dpdaction=restart
    type=tunnel
    forceencaps=yes

After starting the IPsec service:

# service ipsec status
IPsec running - pluto pid: 8601
pluto pid 8601
2 tunnels up
some eroutes exist

# ip xfrm policy
src 10.43.6.0/24 dst 10.43.7.0/24
    dir out priority 2344 ptype main
    tmpl src 206.191.2.xxx dst 72.21.209.xxx
        proto esp reqid 16389 mode tunnel
src 10.43.7.0/24 dst 10.43.6.0/24
    dir fwd priority 2344 ptype main
    tmpl src 72.21.209.xxx dst 206.191.2.xxx
        proto esp reqid 16389 mode tunnel
src 10.43.7.0/24 dst 10.43.6.0/24
    dir in priority 2344 ptype main
    tmpl src 72.21.209.xxx dst 206.191.2.xxx
        proto esp reqid 16389 mode tunnel
src 10.43.4.0/24 dst 10.43.7.0/24
    dir out priority 2344 ptype main
    tmpl src 206.191.2.xxx dst 72.21.209.xxx
        proto esp reqid 16385 mode tunnel
src 10.43.7.0/24 dst 10.43.4.0/24
    dir fwd priority 2344 ptype main
    tmpl src 72.21.209.xxx dst 206.191.2.xxx
        proto esp reqid 16385 mode tunnel
src 10.43.7.0/24 dst 10.43.4.0/24
    dir in priority 2344 ptype main
    tmpl src 72.21.209.xxx dst 206.191.2.xxx
        proto esp reqid 16385 mode tunnel

I don't think the firewall plays any role here, as I turned it off entirely to test the connections. Routes are working as expected too. If I define a single network on the left side, each in its own separate test connection, I can reach either subnet. Only when I define leftsubnets does the problem appear: whichever range comes last gets routed in the end, while whichever comes first works for a brief second before it stops routing.

I could not find anyone on the internet who has had a similar problem. Can someone please enlighten me?

cheers,

bo

3 Answers

5

When you use leftsubnets, you have to use rightsubnets, not rightsubnet. As stated on http://linux.die.net/man/5/ipsec.conf:

If both a leftsubnets= and rightsubnets= is defined, all combinations of subnet tunnels will be instantiated.
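
Applied to the configuration in the question, that change might look like the fragment below. This is only a sketch: it assumes the single remote range 10.43.7.0/24 is simply moved into rightsubnets= and that every other option in the conn stays as posted.

```conf
conn aws-vpc
    # both sides use the plural form; each left/right combination
    # is instantiated as its own subnet tunnel
    leftsubnets={10.43.4.0/24 10.43.6.0/24}
    rightsubnets={10.43.7.0/24}
    # remaining options (left=, right=, authby=, ...) unchanged from the question
```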

4

This is due to a fault in the way AWS's implementation of IPsec handles SPIs (Security Parameters Indices). You can read about it in detail on libreswan's web site, but the upshot is that libreswan deals with the two ranges by establishing two tunnels (in your case, likely aws-vpc/1x1 and aws-vpc/1x2). Openswan and strongSwan do likewise.

Each of these tunnels has its own SA (security association), each identified by a pair of SPIs, one for traffic you send (your SPI), and one for traffic Amazon sends (their SPI). Amazon, despite having established their SPI #1 for whichever tunnel comes up first, replaces it with SPI #2 when the second tunnel comes up (instead of keeping SPI #1 for tunnel one, and using SPI #2 just for tunnel two, as it should). Traffic is sent to AWS down tunnel one using your SPI #1, but Amazon encrypts the replies with their SPI #2, which naturally causes the traffic to fail to decrypt at your end.

That is why the first tunnel works only for a very brief period, until tunnel two comes up. If at some later time you force at your end the regeneration of SPIs for tunnel one, it will start working, but Amazon's new SPI #1 will replace their old SPI for tunnel two, and tunnel two will stop working just as tunnel one resumes service.

I've run into this on two separate occasions some years apart, most recently yesterday, so I don't think AWS are likely to fix it. It doesn't seem to affect commercial IPSec implementations (or AWS would have fixed it by now), I'm guessing because they don't really have the concept of tunnels between subnets but just aggregate a bunch of host-host tunnels all sharing the same SPIs. That is, however, only a guess.

Edit: weirdly, thanks to spending the intervening week working on this for a client who had a good AWS support contract, I have now confirmed what libreswan had to say about the latest SPI incorrectly replacing any earlier-established ones. Amazon also confirmed that they're doing this, and that one vpn- entity can only, to their mind, support one pair of SPIs. Their advice is to configure S/WAN so that only one tunnel is created, then to route traffic to particular destinations over it.

Fortunately, libreswan now supports this, in version 3.18 or later, provided you have a reasonably recent Linux kernel. I can confirm that CentOS 7 satisfies both requirements.

Their detailed writeup is on their wiki, but the upshot is that you establish a tunnel with very wide source and destination ranges (0.0.0.0/0) using the Linux Virtual Tunnel Interface (vti) device, then tell libreswan not to set up routing across it (vti-routing=no). You can then choose which destinations to reach over this tunnel with simple route statements (ip route add 10.0.0.0/8 dev vti01).
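
A minimal sketch of that setup, adapted to the addresses in the question, might look like the following. The connection name aws-vpc-vti, the mark value 5/0xffffffff, and the interface name vti01 are illustrative assumptions; mark=, vti-interface= and vti-routing= are the libreswan options this answer refers to, and the remaining settings are carried over from the question.

```conf
conn aws-vpc-vti
    left=206.191.2.xxx
    right=72.21.209.xxx
    authby=secret
    auto=start
    # one very wide tunnel, so only a single pair of SPIs is negotiated
    leftsubnet=0.0.0.0/0
    rightsubnet=0.0.0.0/0
    # the mark ties this connection to its VTI device (values are illustrative)
    mark=5/0xffffffff
    vti-interface=vti01
    # do not let libreswan route 0.0.0.0/0 into the tunnel
    vti-routing=no

# Then, as root, route only the wanted destinations over the tunnel, e.g.:
#   ip route add 10.43.7.0/24 dev vti01
```

Keeping routing out of libreswan's hands means the tunnel definition never changes when a subnet is added or removed; only the route statements do.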

I have this working in production. It even supports multiple simultaneous tunnels, later ones using different mark= and vti-interface= configuration options. Amazon also now supports associating a VPN with a transit gateway (TGW), to which many VPNs in the same AWS region can in turn be associated, so you really only need one VPN per AWS region, which is scalable.
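
As a rough illustration of running several such tunnels side by side, each connection gets its own mark and its own VTI device. The connection names, mark values, and interface names below are hypothetical, and all other options are omitted for brevity.

```conf
conn aws-vpn-a
    # first tunnel: its own mark and VTI device (hypothetical values)
    mark=5/0xffffffff
    vti-interface=vti01
    vti-routing=no
    # left=, right=, authby=, leftsubnet=, rightsubnet= ... omitted

conn aws-vpn-b
    # second tunnel: a different mark and a different VTI device
    mark=6/0xffffffff
    vti-interface=vti02
    vti-routing=no
    # left=, right=, authby=, leftsubnet=, rightsubnet= ... omitted
```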

1

Try using:

leftsubnets={10.43.4.0/24,10.43.6.0/24,} 

instead of:

leftsubnets={10.43.4.0/24 10.43.6.0/24} 

Note: add two commas, one after the first subnet and one after the last.
