VPN kill switch: How to do it on Linux
2023-12-29
A kill switch is a mechanism that blocks all outgoing traffic unless a VPN is active. In this article we discuss how to implement such a mechanism with Linux policy-based routing in a way that works for a wide range of VPNs.
Linux IP packet routing tour
Before diving into how to implement a kill switch, we need to get familiar with how Linux IP packet routing works in general. The best way to do that is to run the ip route command, which shows the routing table. On my computer this command outputs the following.
# "ip route" output
default via 10.65.0.1 dev wlan0 proto dhcp src 10.65.0.11 metric 305
10.33.0.0/16 dev vpn1 scope link
10.65.0.0/20 dev wlan0 proto dhcp scope link metric 305
CIDR notation, gateways and broadcast addresses
Each row in the table is a rule that matches a range of packet destinations written in CIDR notation. For example, 10.33.0.0/16 matches any packet with a destination 10.33.X.X, where X is an arbitrary number from 0 to 255. Usually the first address, 10.33.0.1, is the gateway, that is, the default next hop for packets whose destination matches no other rule, and the last address, 10.33.255.255, is the broadcast address: if you send a packet to this address, all nodes in the network will receive it. The reality is more complicated though: you can use any address as the gateway and set any address as the broadcast address. In addition, broadcast packets are usually delivered only to the nodes connected to the same network switch, and routers usually block them to prevent accidental flooding.
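To make the CIDR arithmetic concrete, here is a minimal sketch in POSIX shell that computes the network and broadcast addresses of a block. The helper names (ip_to_int, int_to_ip, cidr_range) are made up for this illustration, and the code assumes the shell uses 64-bit arithmetic, which is the usual case on Linux.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Convert a 32-bit integer back to dotted-quad notation.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print the network and broadcast addresses of the CIDR block $1.
cidr_range() {
    base=$(ip_to_int "${1%/*}")
    len=${1#*/}
    # The mask has "len" leading one bits, e.g. /16 is 255.255.0.0.
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    network=$(( base & mask ))
    broadcast=$(( network | (~mask & 0xFFFFFFFF) ))
    echo "$(int_to_ip "$network") $(int_to_ip "$broadcast")"
}

cidr_range 10.33.0.0/16    # prints "10.33.0.0 10.33.255.255"
cidr_range 10.65.0.11/20   # prints "10.65.0.0 10.65.15.255"
```

The second call uses the wlan0 address from the routing table above together with its /20 prefix.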
How to read the routing table
Now we can go back to the table to study the rules. In the output default is another way of spelling 0.0.0.0/0, and this rule matches any packet destination.
- The first rule says «forward the packet to a gateway with address 10.65.0.1 if none of the other rules match».
- The second rule says «forward the packet to network device vpn1 if the destination matches 10.33.0.0/16». In this case the device driver or a program that is attached to this device will handle the packet.
- The third rule says «forward the packet to network device wlan0 if the destination matches 10.65.0.0/20». There is no gateway in this rule because the packet's destination is in the same network as the gateway, so Linux sends the packet directly to the destination.
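The way these three rules compete can be sketched as a longest-prefix match: among all rules whose prefix contains the destination, the most specific one wins. The script below is only a model of that selection over the sample table, written in POSIX shell with hypothetical helper names. It ignores metrics and everything else the kernel takes into account, and it assumes 64-bit shell arithmetic.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 lies inside the CIDR block $2.
in_cidr() {
    plen=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "${2%/*}") & mask )) ]
}

# Print the action of the sample rule that wins for destination $1.
lookup() {
    best_len=-1
    best_action=
    while read -r prefix action; do
        in_cidr "$1" "$prefix" || continue
        cur_len=${prefix#*/}
        if [ "$cur_len" -gt "$best_len" ]; then
            best_len=$cur_len
            best_action=$action
        fi
    done <<EOF
0.0.0.0/0 via 10.65.0.1 dev wlan0
10.33.0.0/16 dev vpn1
10.65.0.0/20 dev wlan0
EOF
    echo "$best_action"
}

lookup 10.33.4.5   # prints "dev vpn1"
lookup 10.65.3.3   # prints "dev wlan0"
lookup 8.8.8.8     # prints "via 10.65.0.1 dev wlan0"
```

On a real system the same question can be asked with ip route get followed by the destination address.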
Is there only one routing table?
As you may have guessed, there are several routing tables in the system: default, local and main. Each table has an id and a name. The mapping between them is stored in the /etc/iproute2/rt_tables file. Counterintuitively, the table that is used by default is main, not default. To see the contents of the other tables use the following commands.
$ ip route show table main
...
$ ip route show table local
...
$ ip route show table default
...
On my computer the default table does not exist. The local table lists local and broadcast addresses associated with network devices.
$ ip route show table local
local 10.33.0.41 dev vpn1 proto kernel scope host src 10.33.0.41
local 10.65.0.11 dev wlan0 proto kernel scope host src 10.65.0.11
broadcast 10.65.15.255 dev wlan0 proto kernel scope link src 10.65.0.11
local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1
local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1
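The mapping between table names and ids mentioned above can also be queried from a script. The following sketch assumes the usual file format of one "id name" pair per line with # comments. The function name table_id is hypothetical, and some distributions may keep the file in a different location.

```shell
# Print the numeric id of routing table $1.
# $2 is the mapping file (defaults to /etc/iproute2/rt_tables).
table_id() {
    awk -v name="$1" '$1 !~ /^#/ && $2 == name { print $1; exit }' \
        "${2:-/etc/iproute2/rt_tables}"
}
```

For example, table_id main typically prints 254, and the function prints nothing if the table name is not listed in the file.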
Policy-based routing
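Linux has another set of rules that define which table is selected. Each rule has a priority, and the rules are tried in order of increasing priority number, so if a packet matches multiple rules, the rule with the lowest number is tried first. If the table selected by a rule has no route for the packet, the next rule is tried. To see all the rules use the ip rule command.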
$ ip rule
0: from all lookup local
32766: from all lookup main
32767: from all lookup default
On my computer each rule matches any packet (the from all clause in the output), and the local table has a lower priority number than main, so it is consulted first. This ordering is what we will leverage to create a VPN kill switch.
VPN kill switch with policy-based routing
A VPN kill switch requires routing all outgoing traffic through a VPN except for the local traffic and the VPN's own internal traffic. This means that if a VPN uses port 1234, then the traffic from this port should go through the default gateway or directly to the node in the local network. To implement that we will create a separate table and a rule that uses this table for all non-VPN and non-local packets.
Custom routing table
First edit the /etc/iproute2/rt_tables file and add the following line that defines our new table.
83 vpn1
Now create routes in the new table. The table itself is created automatically.
# remove existing rules if any
$ ip route flush table vpn1
# add default route via gateway node from VPN network
$ ip route add default dev vpn1 via 10.83.0.1 table vpn1 metric 100
# add blackhole route (this is the actual kill switch)
$ ip route add blackhole default table vpn1 metric 200
# check that the routes have been added
$ ip route show table vpn1
default via 10.83.0.1 dev vpn1 metric 100
blackhole default metric 200
To summarize, we added a new routing table called vpn1, a default route via the gateway node from the VPN, and a so-called black hole route. The default route is preferred over the black hole route because of its lower metric. The black hole route is used only when the default route is not present in the table. The vpn1 device is automatically deleted whenever the VPN is stopped, and the routes that use it are deleted as well; the black hole route, however, stays intact.
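Because the default route disappears together with the device, these commands have to be repeated every time the VPN starts. One way to package them is sketched below. The function names and the APPLY variable are made up for this illustration: by default the script only prints the commands, and it executes them (as root) only when APPLY=1 is set.

```shell
# Print a command, or execute it when APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Fill table $1 with a default route via gateway $3 on device $2
# and a black hole route with a higher metric as the fallback.
setup_killswitch_routes() {
    run ip route flush table "$1"
    run ip route add default dev "$2" via "$3" table "$1" metric 100
    run ip route add blackhole default table "$1" metric 200
}

# Same values as in the commands above: table vpn1, device vpn1.
setup_killswitch_routes vpn1 vpn1 10.83.0.1
```

Printing first makes it easy to review the exact commands before running them with APPLY=1.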
Custom routing rules
Now we will add the routing rules that direct traffic to table vpn1.
# route all packets except the ones from source port 1234 using the rules from table vpn1
$ ip rule add not sport 1234 table vpn1
# prefer specific rules in table "main" over the rules in other tables
$ ip rule add table main suppress_prefixlength 0
# check the rules
$ ip rule
0: from all lookup local
32764: from all lookup main suppress_prefixlength 0
32765: not from all sport 1234 lookup vpn1
32766: from all lookup main
32767: from all lookup default
The first rule is self-explanatory; all possible alternatives to not sport are listed in the documentation. According to the documentation, the suppress_prefixlength N option means «reject routing decisions that have a prefix length of N or less». A prefix length of zero means the default route, hence this rule means «reject routing decisions that match the default route in table main». So, suppress_prefixlength 0 is a fancy way of saying «ignore the default route from the main routing table». Since the next rule in the list looks up table vpn1, all traffic except traffic to local networks and the VPN's own packets from port 1234 will go through the vpn1 network interface.
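The resulting decision order can be modelled in a few lines of shell. This is a simplified model, not the kernel's algorithm: it skips the local table, assumes that table main contains a default route, and reduces table main to the prefix length of its best match for the packet. The function name select_table is hypothetical.

```shell
# Print the table that ends up routing a packet.
# $1: prefix length of the best match in table main (0 = default route only)
# $2: source port of the packet
select_table() {
    main_len=$1
    sport=$2
    # 32764: lookup main suppress_prefixlength 0
    # Matches with a prefix length of 0 are rejected here.
    if [ "$main_len" -gt 0 ]; then
        echo main
        return
    fi
    # 32765: not sport 1234 lookup vpn1
    # Table vpn1 always answers: default via the VPN, else black hole.
    if [ "$sport" -ne 1234 ]; then
        echo vpn1
        return
    fi
    # 32766: lookup main, the default route is acceptable now
    echo main
}

select_table 20 40000   # local network destination: prints "main"
select_table 0 40000    # Internet destination: prints "vpn1"
select_table 0 1234     # the VPN's own packets: prints "main"
```

The third case is what keeps the VPN itself connected: its packets skip table vpn1 and leave through the ordinary default gateway.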
Any alternatives?
We tested the policy-based VPN kill switch with WireGuard (do not forget to specify the port in the configuration) and Staex. Both VPNs use only one port for their internal traffic. It should be possible to match the traffic of a centralized VPN by source/destination in CIDR notation (the from and to options of the ip rule command). In general, the exact packets can be marked using iptables and then matched by the same mark in the routing rules (see the mark iptables module). This article discusses various approaches within the context of WireGuard.
Multiple VPNs
The nature of a kill switch does not play well with multiple VPNs. Probably the only way to exclude multiple ports from the default route is to use firewall marks. We have not evaluated this approach yet.
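An untested sketch of what the firewall mark approach could look like is shown below. It assumes that every VPN sends its internal traffic over UDP from a known source port, that the mark value 1 is not used for anything else, and that marking in the mangle OUTPUT chain is sufficient for the setup at hand. The function names and the APPLY variable are made up for this illustration: by default the script only prints the commands, and it executes them (as root) only when APPLY=1 is set.

```shell
# Print a command, or execute it when APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Mark UDP packets sent from the given source ports with firewall mark $1,
# then route every packet that does not carry the mark using table $2.
# Usage: setup_mark_rules MARK TABLE PORT...
setup_mark_rules() {
    mark=$1
    table=$2
    shift 2
    for port in "$@"; do
        run iptables -t mangle -A OUTPUT -p udp --sport "$port" \
            -j MARK --set-mark "$mark"
    done
    run ip rule add not fwmark "$mark" table "$table"
}

# Two hypothetical VPNs that use ports 1234 and 5678.
setup_mark_rules 1 vpn1 1234 5678
```

The single not fwmark rule replaces the not sport rule used earlier, so any number of ports can be excluded by adding more iptables rules with the same mark.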
Conclusion
We conceived of the kill switch as a simple VPN feature, but we underestimated the complexity of Linux networking. Linux has multiple layers of IP packet routing rules, a built-in firewall and network namespaces. VPNs do not make the task simpler either: they might use several ports for their internal communication, or you might want to run multiple VPNs on a single node.
Staex is a secure public network for IoT devices that cannot run a VPN, such as smart meters, IP cameras, and EV chargers. Staex encrypts legacy protocols, reduces mobile data usage, and simplifies building networks with complex topologies through its unique multi-hop architecture. Staex is fully zero-trust, meaning that no traffic is allowed unless specified by the device owner, which makes it more secure than even some private networks. With this, Staex creates an additional separation layer to provide more security for IoT devices on the Internet, also protecting other Internet services from DDoS attacks that are usually executed from millions of IoT machines.