How does Linux route traffic to Wireguard's peers?

Question

Important note: I'm speaking about regular and expected behavior of WireGuard on any Linux system. I don't mess with its configuration, routing rules or routing tables: all changes are made by wg-quick. The default routing table has nothing related to WG and only defines the default route:

$ ip route show
default via 192.168.2.1 dev eth0 proto static metric 100
192.168.2.0/24 dev eth0 proto kernel scope link src 192.168.2.2 metric 100

Let's say I've got a WireGuard interface wg0 (configured with wg-quick) with a peer at 100.100.100.100 whose allowed-ips is 0.0.0.0/0. The routing rules created by wg-quick direct all unmarked traffic to routing table 100500:

$ ip rule show
0: from all lookup local
32765: not from all fwmark 0x1234 lookup 100500
32766: from all lookup main
32767: from all lookup default

Routing table #100500 defines the default route through the WG tunnel:

$ ip route show table 100500
default dev wg0 scope link
192.168.0.1 dev wg0 scope link

Outgoing traffic to 100.100.100.100 is not marked and should be sent through the wg0.

Moreover, if I execute

ip route get 100.100.100.100

I see something like:

100.100.100.100 dev wg0 table 100500 src 192.168.0.2

This means that traffic for the peer itself should be sent through the tunnel, which obviously cannot be true, because in that case the connection with the peer would be lost.

How does traffic for 100.100.100.100 happen to find its way while all other traffic goes through the wg0 interface?

Note: This question is tricky because, as far as I know, WireGuard can only mark packets that go through its interface (wg0). That is also what the WireGuard manuals say: only encapsulated traffic is marked. But traffic to the peers is never encapsulated. In any case, outgoing packets reach wg0 only after the routing decision has been made, so even if they were marked in the WireGuard interface, they would be sent through that same interface and never reach the peer. As far as I know, the observed behavior would require additional firewall rules in the pre-routing phase, but WireGuard (wg-quick) doesn't create any additional rules in nft/iptables.

I did, and I found nothing there related to the peers. According to the routes, all traffic except marked traffic must be sent through the WireGuard interface. But outgoing traffic to a peer is not marked, so it should also have been sent there, yet it obviously is not. As I specified in the question, if you run "ip route get <PEER_ADDR>" you'll see that it must be sent through the tunnel.
– Sap, Commented Sep 25, 2024 at 8:58
Start with `ip route show`. Please edit your question to add new information, properly formatted. Information added via comments is hard for you to format, hard for us to read, and ignored by both current and future readers (who have better answers). Help us help you.
– waltinator, Commented Sep 25, 2024 at 15:57
I am speaking about regular WireGuard behavior in general on any Linux system. That's how WG works by default, and all routes and rules are created by wg-quick. It doesn't change the default routing table in any way. There is nothing related to WG there.
– Sap, Commented Sep 25, 2024 at 20:32

Justin Ludwig · Accepted Answer · 2024-09-27 01:25:32Z

Outgoing traffic to 100.100.100.100 is not marked and should be sent through wg0.

No, if your WireGuard config file looks like this:

[Interface]
FwMark = 0x1234
...

[Peer]
Endpoint = 100.100.100.100:51820
AllowedIPs = 0.0.0.0/0

Then outgoing traffic sent by WireGuard to 100.100.100.100 will be marked with 0x1234 -- and therefore it will use the appropriate route from the main table.

Run this command to get the routing decision when the mark is applied:

ip route get 100.100.100.100 mark 0x1234

The Understanding modern Linux routing (and wg-quick) article provides a good explanation for how wg-quick overrides the default route on Linux. Also see the Routing All Your Traffic section of the Routing & Network Namespace Integration page on the WireGuard website.

Edit: "Regular" WireGuard doesn't involve messing with the default route -- a lot of special behavior in wg-quick is triggered when you configure WireGuard to route "everything" through it by adding a /0 network address to a peer's AllowedIPs setting. If you're trying to understand how WireGuard works, try starting out with a simple point-to-point connection.

If you use wg-quick to start up a point-to-point WireGuard connection configured like this:

# /etc/wireguard/wg0.conf
[Interface]
PrivateKey = ...
Address = 192.168.0.2/32

[Peer]
PublicKey = ...
Endpoint = 100.100.100.100:51820
AllowedIPs = 192.168.0.1/32

Wg-quick will not add any special routing policy rules or packet marking; it will just add one regular route to your main routing table. The main routing table will now look something like this:

$ ip route show
default via 192.168.2.1 dev eth0 proto static metric 100
192.168.0.1 dev wg0 scope link
192.168.2.0/24 dev eth0 proto kernel scope link src 192.168.2.2 metric 100

If you then ping 192.168.0.1, this is what will happen:

1. Ping will use the host's network stack to send an ICMP packet to 192.168.0.1.
2. The network stack will make a routing decision about the ICMP packet: based on the route for 192.168.0.1 above, it will add the ICMP packet to the transmission queue for the virtual wg0 interface.
3. The WireGuard driver will pull the ICMP packet out of the queue, and encapsulate it inside a brand new UDP packet.
4. WireGuard will use the host's network stack to send this new UDP packet to 100.100.100.100:51820.
5. The network stack will make a routing decision for this new UDP packet: and based on the default route above, it will add the UDP packet to the transmission queue for the physical eth0 interface.
6. The Ethernet driver will encapsulate this UDP packet inside an Ethernet frame and send it out its link.
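
The two routing decisions in the steps above (the second and the fifth) can be sketched as a longest-prefix match over the main table shown earlier. This is only an illustration of the idea, not how the kernel implements it; the names MAIN_TABLE and lookup are made up for this sketch.

```python
import ipaddress

# Main routing table from the point-to-point example: (prefix, device, gateway).
MAIN_TABLE = [
    ("0.0.0.0/0", "eth0", "192.168.2.1"),
    ("192.168.0.1/32", "wg0", None),
    ("192.168.2.0/24", "eth0", None),
]


def lookup(table, destination):
    """Return the (prefix, device, gateway) entry with the longest matching prefix."""
    addr = ipaddress.ip_address(destination)
    matches = [
        route for route in table
        if addr in ipaddress.ip_network(route[0])
    ]
    if not matches:
        return None
    return max(matches, key=lambda route: ipaddress.ip_network(route[0]).prefixlen)


# Inner ICMP packet: matches the /32 route and is queued on wg0.
inner = lookup(MAIN_TABLE, "192.168.0.1")
# Outer UDP packet to the endpoint: only the default route matches, so eth0.
outer = lookup(MAIN_TABLE, "100.100.100.100")
print(inner[1], outer[1])
```

No packet mark is needed in this scenario because the endpoint address is not covered by any route that points at wg0.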

See the WireGuard Endpoints and IP Addresses article for a full end-to-end packet trace under a similar scenario.

If the wg0 interface is configured to mark packets -- which wg-quick will set up for you automatically when it encounters a /0 network address in an AllowedIPs setting, even if you don't explicitly include a FwMark setting -- the WireGuard driver will mark the encapsulating UDP packet (not the original ICMP packet) in step 4 above.
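
As a rough sketch of the trigger described above, the check for "a /0 network address in an AllowedIPs setting" could look like the following. This approximates the behavior described here and is not wg-quick's actual code; the function name is hypothetical.

```python
import ipaddress


def triggers_default_route_handling(allowed_ips):
    """Return True if any AllowedIPs entry has a /0 prefix (e.g. 0.0.0.0/0 or ::/0)."""
    entries = [entry.strip() for entry in allowed_ips.split(",") if entry.strip()]
    return any(
        ipaddress.ip_network(entry, strict=False).prefixlen == 0
        for entry in entries
    )


print(triggers_default_route_handling("192.168.0.1/32"))
print(triggers_default_route_handling("0.0.0.0/0"))
```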

Now if you change this simple point-to-point scenario to instead route "everything" through WireGuard, using a config like this:

# /etc/wireguard/wg0.conf
[Interface]
PrivateKey = ...
Address = 192.168.0.2/32

[Peer]
PublicKey = ...
Endpoint = 100.100.100.100:51820
AllowedIPs = 0.0.0.0/0

Wg-quick will add two special policy routing rules (32764 and 32765):

$ ip rule show
0: from all lookup local
32764: from all lookup main suppress_prefixlength 0
32765: not from all fwmark 0xca6c lookup 51820
32766: from all lookup main
32767: from all lookup default

And set up a custom routing table (51820):

$ ip route show table 51820
default dev wg0 scope link

If you try to ping 100.100.100.100 under this configuration, the same six steps as above will happen:

1. Ping will use the host's network stack to send an ICMP packet to 100.100.100.100.
2. The network stack will make a routing decision about the ICMP packet -- which because it is not marked (and does not match a route with a prefix length greater than /0 in the main table), will use the 51820 table: so based on the default route for the 51820 table, the network stack will add the ICMP packet to the transmission queue for the virtual wg0 interface.
3. The WireGuard driver will pull the ICMP packet out of the queue, and encapsulate it inside a brand new UDP packet.
4. WireGuard will use the host's network stack to send this new UDP packet to 100.100.100.100:51820 -- sending it with a mark of 0xca6c.
5. The network stack will make a routing decision for this new UDP packet -- which because it is marked, will use the main table: so based on the default route for the main table, the network stack will add the UDP packet to the transmission queue for the physical eth0 interface.
6. The Ethernet driver will encapsulate this UDP packet inside an Ethernet frame and send it out its link.
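
The rule evaluation behind the two routing decisions above can be modeled in a few lines. This is a simplified sketch under stated assumptions: it uses the rules and tables shown above, leaves out the local table (rule 0) and the default table (rule 32767), and treats suppress_prefixlength N as "ignore a match whose prefix length is N or less". The names TABLES, RULES, best_route and route are invented for the illustration.

```python
import ipaddress

FWMARK = 0xCA6C  # mark from the `ip rule show` output above (51820 in decimal)

TABLES = {
    "main": [("0.0.0.0/0", "eth0"), ("192.168.2.0/24", "eth0")],
    "51820": [("0.0.0.0/0", "wg0")],
}

# (priority, predicate on the packet mark, table, suppress_prefixlength or None)
RULES = [
    (32764, lambda mark: True, "main", 0),
    (32765, lambda mark: mark != FWMARK, "51820", None),
    (32766, lambda mark: True, "main", None),
]


def best_route(table, destination):
    """Longest-prefix match within one table; returns (network, device) or None."""
    addr = ipaddress.ip_address(destination)
    nets = [(ipaddress.ip_network(prefix), dev) for prefix, dev in TABLES[table]]
    hits = [(net, dev) for net, dev in nets if addr in net]
    return max(hits, key=lambda hit: hit[0].prefixlen) if hits else None


def route(destination, mark=0):
    """Walk the rules in priority order; return (priority, table, device)."""
    for priority, matches, table, suppress in sorted(RULES, key=lambda r: r[0]):
        if not matches(mark):
            continue  # rule selector does not apply to this packet
        hit = best_route(table, destination)
        if hit is None:
            continue  # no route in this table, try the next rule
        if suppress is not None and hit[0].prefixlen <= suppress:
            continue  # e.g. the main table's default route is suppressed
        return priority, table, hit[1]
    return None


print(route("100.100.100.100"))               # unmarked ICMP packet from ping
print(route("100.100.100.100", mark=FWMARK))  # marked UDP packet from WireGuard
print(route("192.168.2.7"))                   # LAN traffic still uses main
```

In this model the same destination address resolves to wg0 without the mark and to eth0 with it, which is the whole trick: the mark is a property of the packet, not of the destination.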

Edit 2: Regarding how the packet marking is done, this is how the WireGuard kernel driver does it (directly updating the packet sk_buff struct before sending a new packet off for processing by the rest of the net stack):

https://git.zx2c4.com/wireguard-linux/tree/drivers/net/wireguard/socket.c#n36

And this is how the wireguard-go driver does it (using a helper for the libc setsockopt function when it sets up the socket for a peer connection):

https://git.zx2c4.com/wireguard-go/tree/conn/mark_unix.go#n40

The libc setsockopt function (setting the SO_MARK option) is generally how user-space programs can set the packet mark on packets they generate. This does require the program to be granted either the CAP_NET_ADMIN or the CAP_NET_RAW capability, however.
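
For comparison, here is a minimal sketch of the same SO_MARK call from Python. It assumes a Linux host; the function name is hypothetical, and the fallback value 36 is the Linux constant quoted in the wireguard-go source. Without the required capability the kernel rejects the call, which the sketch reports instead of raising.

```python
import socket

# SO_MARK is 36 on Linux; fall back to that value if the constant is not exported.
SO_MARK = getattr(socket, "SO_MARK", 36)


def open_marked_udp_socket(mark):
    """Create a UDP socket and try to set its packet mark.

    Returns (socket, True) if the mark was applied, or (socket, False) if the
    kernel refused (for example, missing capability or a non-Linux system).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, SO_MARK, mark)
        return sock, True
    except OSError:
        return sock, False


sock, marked = open_marked_udp_socket(0x1234)
print("mark applied" if marked else "mark not applied (insufficient privileges?)")
sock.close()
```

Every datagram sent from a socket marked this way carries the mark into the routing decision, so no firewall rule is involved.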

"Then outgoing traffic sent by WireGuard to 100.100.100.100 will be marked with 0x1234" How?! WG can only mark traffic that is CREATED by its interface. Outgoing traffic to 100.100.100.100 cannot be marked in any way other than with very dirty kernel hacks, which are incredibly bad practice and break the kernel's whole internal packet routing scheme. That would mean that WireGuard doesn't use regular sockets internally and its packets do not follow the regular path inside the kernel.
– Sap, Commented Sep 25, 2024 at 20:26
And the article you linked says: "0xca6c is just a numerical label (“firewall mark”) that wg-quick asked wg to mark all of the packets that it emits. These are packets that already encapsulate other packets and are targeted to your VPN peer/server." This means that marked traffic is encapsulated in the tunnel (and created by WG's interface), which is normal and expected behavior. But traffic to the peers is not encapsulated and cannot be marked!
– Sap, Commented Sep 25, 2024 at 20:28
"But traffic to the peers is not encapsulated" Are you talking about traffic such as that caused by ping 100.100.100.100 and curl 100.100.100.100? Or are you talking about WG-encapsulated traffic caused by attempted communication with e.g. Google (i.e. the "inside" destination is Google while the "outside" destination is 100.100.100.100)?
– Tom Yan, Commented Sep 26, 2024 at 8:04
If WireGuard uses any kind of standard Linux sockets, then there is no difference between traffic sent with ping or by WG. There is no way to mark outgoing traffic. The traffic can be marked only with firewall rules or when it passes through (or is created by) an interface with a custom driver. Actually, that is the mystery: how does WireGuard mark the traffic? I suppose that traffic to peers is not marked initially, so it is routed to the WG interface. Then, when the traffic comes to its interface, WG marks it and somehow redirects it for rerouting. But this approach has numerous side effects.
– Sap, Commented Sep 26, 2024 at 19:31
I am looking for someone who really knows how WG works because I need to implement a pretty difficult routing scheme. And since there is "wireguard-go", which cannot use the dirtiest kernel hacks, I am sure that WG has no easy way to mark traffic being sent to peers. There is just no such API in Linux!
– Sap, Commented Sep 26, 2024 at 19:35

grawity · Accepted Answer · 2024-09-27 05:44:04Z

To start with:

If WireGuard uses any kind of any standard Linux sockets, then there is no difference between traffic sent with ping or by WG. There is no way to mark outgoing traffic. The traffic can be marked only with firewall rules or when it passes (or it created by) an interface with a custom driver

This is not true. Programs can request a specific mark for outgoing traffic from a socket they have created, by calling setsockopt(SO_MARK) as documented in socket(7). This is available to any sufficiently privileged userspace software (including wireguard-go and indeed also ping which has the -m <mark> option), which implies that it is also available to the WireGuard kernel driver.

If we search the source code of wireguard-go for the term 'mark' we'll find (2, 3):

// SetMark sets the mark for each packet sent through this Bind.
// This mark is passed to the kernel as the socket option SO_MARK.
SetMark(mark uint32) error

switch runtime.GOOS {
case "linux", "android":
    fwmarkIoctl = 36 /* unix.SO_MARK */
case "freebsd":
    fwmarkIoctl = 0x1015 /* unix.SO_USER_COOKIE */
case "openbsd":
    fwmarkIoctl = 0x1021 /* unix.SO_RTABLE */
}
[...]
operr = unix.SetsockoptInt(int(fd), unix.SOL_SOCKET, fwmarkIoctl, int(mark))

Similar usage of SO_MARK can be found in iputils-ping (5), which exposes it via -m <mark>.

void sock_setmark(struct ping_rts *rts, int fd)
{
#ifdef SO_MARK
    [...]
    enable_capability_admin();
    ret = setsockopt(fd, SOL_SOCKET, SO_MARK, &(rts->mark), sizeof(rts->mark));
    [...]
}

And if we check drivers/net/wireguard/socket.c within the Linux source, for the equivalent functionality in the kernel-mode WireGuard interface, we'll find (4):

static int send4(struct wg_device *wg, struct sk_buff *skb,
                 struct endpoint *endpoint, u8 ds, struct dst_cache *cache)
{
    struct flowi4 fl = {
        .saddr = endpoint->src4.s_addr,
        .daddr = endpoint->addr4.sin_addr.s_addr,
        .fl4_dport = endpoint->addr4.sin_port,
        .flowi4_mark = wg->fwmark,
        .flowi4_proto = IPPROTO_UDP
    };
    struct rtable *rt = NULL;
    struct sock *sock;
    int ret = 0;

    skb_mark_not_on_list(skb);
    skb->dev = wg->dev;
    skb->mark = wg->fwmark;

…which indicates that 1) each WireGuard instance has a fwmark property, and that 2) the route lookup for packets sent by that WireGuard instance is done with that fwmark as one of the criteria, and that 3) the same fwmark is then assigned to the actual socket buffer.

Therefore, WireGuard can set a mark on the "outer" UDP traffic that it sends towards its peers, and in order to accurately simulate its route lookup, you must get the mark from wg show wg0 (wg-quick assigns one automatically when AllowedIPs contains a /0 network) and must specify it as mark XXX in your "ip route get" command.
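
A small helper can assemble that lookup command from the mark that WireGuard reports. This is a sketch, assuming the mark text comes from `wg show wg0 fwmark`, which prints a hexadecimal value such as 0xca6c, or "off" when no mark is set. The `mark` keyword is the one used in the `ip route get ... mark 0x1234` command in the other answer; the function name is hypothetical.

```python
import shlex


def route_get_command(destination, fwmark_output):
    """Build an `ip route get` command that includes WireGuard's fwmark.

    `fwmark_output` is the text printed by `wg show <interface> fwmark`:
    a hexadecimal mark such as "0xca6c", or "off" when no mark is set.
    """
    args = ["ip", "route", "get", destination]
    value = fwmark_output.strip()
    if value and value != "off":
        mark = int(value, 0)  # accepts "0xca6c" as well as decimal input
        args += ["mark", hex(mark)]
    return shlex.join(args)


print(route_get_command("100.100.100.100", "0x1234\n"))
print(route_get_command("100.100.100.100", "off\n"))
```

Running the first command shows the route the encapsulated UDP packets take, while the second shows the route for ordinary unmarked traffic.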

This means that:

Your regular unmarked packets match the not ... fwmark 0x1234 policy-routing rule and select table 100500, which causes them to enter wg0.
WireGuard-encapsulated packets are marked and don't match the not ... fwmark 0x1234 rule, which prevents them from entering wg0 and causes them to be routed through 192.168.2.1 per the main table instead.

Thank you a lot!

– Sap, Commented Sep 27, 2024 at 21:56
