1. Overview
Understanding how networking works is crucial for managing the Linux operating system. It’s a core component that almost every admin needs to interact with.
In this tutorial, we’ll cover an important concept in Linux networking: routing. We’ll explain how routing works and which parts of the routing table influence the routing decision. We’ll also discuss how we can manually steer traffic for a specific destination over a specific network interface.
2. Linux Routing Table
The routing table is a component of the Linux operating system that stores information about how to reach specific subnets on the network. We can think of it as a database that Linux checks to know how to forward network traffic to a destination.
The routing table contains entries called routes, and each route describes how to reach a subnet. A route has two main parts: a destination subnet and an exit interface.
The destination subnet is the target we want to reach on the network. It’s represented in the routing table in CIDR notation. For example, a destination subnet in the routing table appears as 10.1.1.0/24.
The exit interface is the network interface on the Linux machine that the machine uses to send the traffic to the destination subnet.
So let’s check this with an example:
$ ip route
default via 172.25.0.1 dev eth1
172.25.0.0/24 dev eth1 proto kernel scope link src 172.25.0.47
192.26.6.0/24 dev eth0 proto kernel scope link src 192.26.6.6
Here we used the ip route command to view the routing table. We can see three route entries, one for subnet 172.25.0.0/24, another for subnet 192.26.6.0/24, and another for default.
The default entry is what we call the default route of the machine. It acts like a catch-all rule for destinations that don’t match any of the other subnets in the routing table. The via 172.25.0.1 part names the gateway (next hop) that receives this traffic.
Each entry also has a dev part in the output. This is the exit interface (device) that the machine uses to send traffic to the corresponding subnet. For example, to send traffic to subnet 172.25.0.0/24, the machine forwards it through dev eth1.
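To make the two parts of a route concrete, here is a minimal Python sketch that splits the sample output above into destination and exit interface pairs. The names SAMPLE and parse_routes are illustrative, and the parser assumes only the line format shown in this tutorial; real ip route output can contain other route types and fields that it does not handle.

```python
import ipaddress

# Sample text copied from the ip route output shown above.
SAMPLE = """\
default via 172.25.0.1 dev eth1
172.25.0.0/24 dev eth1 proto kernel scope link src 172.25.0.47
192.26.6.0/24 dev eth0 proto kernel scope link src 192.26.6.6
"""

def parse_routes(text):
    """Return a list of (destination network, exit interface) tuples."""
    routes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or "dev" not in fields:
            continue  # skip blank lines and lines without an exit interface
        # "default" is shorthand for the catch-all IPv4 network 0.0.0.0/0.
        dest = "0.0.0.0/0" if fields[0] == "default" else fields[0]
        dev = fields[fields.index("dev") + 1]
        routes.append((ipaddress.ip_network(dest), dev))
    return routes

for network, dev in parse_routes(SAMPLE):
    print(network, "->", dev)
```

Each printed line pairs a destination subnet with its exit interface, with the default route written out as 0.0.0.0/0.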
3. How the Routing Table Gets Populated
Each route is added to the routing table in one of three ways: dynamically, statically, or as a directly connected network.
The dynamic method is uncommon on end-user machines running Linux or Windows. It’s more common on networking devices like routers that interconnect many different subnets. It requires software that implements a routing protocol, which enables the automatic exchange of subnet information between devices. So on our Linux machines, we mainly rely on static and directly connected routes.
Directly connected routes are automatically inserted by the kernel based on the IP address information assigned to the network interfaces.
Let’s check this with an example:
$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 192.29.37.8 netmask 255.255.255.0 broadcast 192.29.37.255
ether 02:42:c0:1d:25:08 txqueuelen 0 (Ethernet)
RX packets 525 bytes 64969 (64.9 KB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 434 bytes 166903 (166.9 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.25.0.53 netmask 255.255.255.0 broadcast 172.25.0.255
ether 02:42:ac:19:00:35 txqueuelen 0 (Ethernet)
RX packets 2071 bytes 32211177 (32.2 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1675 bytes 120767 (120.7 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Here we have two interfaces on our machine with IP addresses 192.29.37.8 and 172.25.0.53, and both have a subnet mask of 255.255.255.0. The kernel understands from this information which subnets these interfaces belong to. In other words, it calculates from the IP and netmask the CIDR of the subnets:
$ ip route
default via 172.25.0.1 dev eth1
172.25.0.0/24 dev eth1 proto kernel scope link src 172.25.0.53
192.29.37.0/24 dev eth0 proto kernel scope link src 192.29.37.8
We can see here two subnets, 172.25.0.0/24 and 192.29.37.0/24, that correspond to the IP address of each interface. The output also shows a proto kernel part in each line. This indicates that the kernel itself installed the route because the subnet is directly connected to one of the machine’s interfaces.
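The calculation of a subnet’s CIDR from an IP address and a netmask can be reproduced with Python’s standard ipaddress module. This is only a sketch of the arithmetic, not of what the kernel does internally, and connected_subnet is a hypothetical helper name.

```python
import ipaddress

def connected_subnet(ip, netmask):
    """Derive the subnet in CIDR form from an interface IP and its netmask."""
    iface = ipaddress.ip_interface(f"{ip}/{netmask}")
    return iface.network

# Addresses and netmasks taken from the ifconfig output above.
print(connected_subnet("192.29.37.8", "255.255.255.0"))   # 192.29.37.0/24
print(connected_subnet("172.25.0.53", "255.255.255.0"))   # 172.25.0.0/24
```

The two results are the same subnets that appear in the proto kernel routes.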
Static routes, on the other hand, are entries that we insert manually into the routing table. We use the ip route add command, which requires root privileges, and provide the destination subnet and the exit interface:
$ ip route add 10.10.10.0/24 dev eth1
Now if we check our routing table:
$ ip route
default via 172.25.0.1 dev eth1
10.10.10.0/24 dev eth1 scope link
172.25.0.0/24 dev eth1 proto kernel scope link src 172.25.0.53
192.29.37.0/24 dev eth0 proto kernel scope link src 192.29.37.8
So here, we’ve added a static route for subnet 10.10.10.0/24 through the exit interface dev eth1, and we can see this information in our routing table.
4. How Routing Decisions Work
When the Linux machine sends traffic over the network, the kernel looks up the packet’s destination IP address in the routing table and compares it with the route entries. When it finds a matching subnet, it sends the traffic out of that subnet’s exit interface.
Now let’s understand how the kernel matches the IP address with a specific subnet.
For each route, the kernel takes the destination IP address and applies the route’s netmask to it. Applying the netmask to the IP address results in a network address. The kernel then compares this network address with the CIDR of the subnet. If they match, the IP address belongs to this subnet, and the kernel forwards the packet out of the exit interface.
Let’s check this with an example:
$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 192.34.213.9 netmask 255.255.255.0 broadcast 192.34.213.255
ether 02:42:c0:22:d5:09 txqueuelen 0 (Ethernet)
RX packets 108683 bytes 7278791 (7.2 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 223903 bytes 71126728 (71.1 MB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 172.25.0.77 netmask 255.255.255.0 broadcast 172.25.0.255
ether 02:42:ac:19:00:4d txqueuelen 0 (Ethernet)
RX packets 5919 bytes 60360883 (60.3 MB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 5546 bytes 385276 (385.2 KB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Here we have two interfaces with IP addresses 192.34.213.9 and 172.25.0.77. Let’s now check our routing table:
$ ip route
172.25.0.0/24 dev eth1 proto kernel scope link src 172.25.0.77
192.34.213.0/24 dev eth0 proto kernel scope link src 192.34.213.9
We can see our routing table with two entries corresponding to our interfaces. Let’s try to send some traffic with the ping command:
$ ping 192.34.213.11
PING 192.34.213.11 (192.34.213.11) 56(84) bytes of data.
64 bytes from 192.34.213.11: icmp_seq=1 ttl=64 time=0.292 ms
64 bytes from 192.34.213.11: icmp_seq=2 ttl=64 time=0.219 ms
64 bytes from 192.34.213.11: icmp_seq=3 ttl=64 time=0.243 ms
64 bytes from 192.34.213.11: icmp_seq=4 ttl=64 time=0.235 ms
--- 192.34.213.11 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3053ms
rtt min/avg/max/mdev = 0.219/0.247/0.292/0.027 ms
Now we’ve sent some traffic to destination IP 192.34.213.11, and we received a successful reply. If we apply the netmask rule to the two routes in our routing table, we find that 192.34.213.11 masked with 255.255.255.0 gives 192.34.213.0, which matches only the route 192.34.213.0/24. This means that the traffic was sent out of the eth0 interface.
Let’s verify this by using the tcpdump command:
$ tcpdump -n -i eth0 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:09:24.958525 IP 192.34.213.9 > 192.34.213.11: ICMP echo request, id 46696, seq 28, length 64
15:09:24.958805 IP 192.34.213.11 > 192.34.213.9: ICMP echo reply, id 46696, seq 28, length 64
15:09:25.982514 IP 192.34.213.9 > 192.34.213.11: ICMP echo request, id 46696, seq 29, length 64
15:09:25.982729 IP 192.34.213.11 > 192.34.213.9: ICMP echo reply, id 46696, seq 29, length 64
15:09:27.006515 IP 192.34.213.9 > 192.34.213.11: ICMP echo request, id 46696, seq 30, length 64
15:09:27.006754 IP 192.34.213.11 > 192.34.213.9: ICMP echo reply, id 46696, seq 30, length 64
The tcpdump command can monitor traffic on a specific network interface. So here, we used the command to monitor traffic on the eth0 interface. We can see that our ping traffic is going through this interface.
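The netmask comparison described in this section can also be worked through in a few lines of Python. The sketch below is a simplified model under the assumption that only the destination address and the route’s netmask matter; matches is a hypothetical helper, not a kernel or iproute2 function.

```python
import ipaddress

def matches(dest_ip, route_cidr):
    """Apply the route's netmask to dest_ip and compare with the route's network address."""
    network = ipaddress.ip_network(route_cidr)
    # Bitwise AND of the destination address and the netmask gives a network address.
    masked = int(ipaddress.ip_address(dest_ip)) & int(network.netmask)
    return masked == int(network.network_address)

dest = "192.34.213.11"
# The two routes from the routing table in this section.
for cidr, dev in [("172.25.0.0/24", "eth1"), ("192.34.213.0/24", "eth0")]:
    print(cidr, dev, matches(dest, cidr))
```

Only the 192.34.213.0/24 route reports a match, which agrees with the ping traffic appearing on eth0.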
5. Longest Matching Prefix Rule
Sometimes we might find a situation where the destination IP of the traffic matches multiple routes. In other words, the prefixes of the subnets overlap with each other.
For example, we can have subnet 192.168.1.0/24 and subnet 192.168.1.128/26 in our routing table. If we have some traffic with a destination IP 192.168.1.132, it matches both subnets. In this situation, the routing decision chooses the subnet with the longest prefix, and this is known as the longest matching prefix rule. So in this scenario, the machine forwards the traffic using the 192.168.1.128/26 route because its /26 prefix is longer than /24.
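The rule can be sketched in Python as follows. The interface names eth0 and eth1 are assumptions added for illustration, since the example above doesn’t name them, and longest_prefix_match is a hypothetical helper. The sketch considers prefix length only and ignores other factors a real lookup may take into account.

```python
import ipaddress

def longest_prefix_match(dest_ip, routes):
    """Return the (network, dev) pair with the longest prefix containing dest_ip, or None."""
    addr = ipaddress.ip_address(dest_ip)
    candidates = [(ipaddress.ip_network(cidr), dev) for cidr, dev in routes]
    matching = [(net, dev) for net, dev in candidates if addr in net]
    if not matching:
        return None
    # Among all matching routes, the one with the largest prefix length wins.
    return max(matching, key=lambda item: item[0].prefixlen)

# Interface names here are assumed for the sake of the example.
routes = [("192.168.1.0/24", "eth0"), ("192.168.1.128/26", "eth1")]

print(longest_prefix_match("192.168.1.132", routes))  # matches both, /26 wins
print(longest_prefix_match("192.168.1.10", routes))   # matches only the /24
```

An address such as 192.168.1.10 lies outside the /26 range, so only the /24 route applies to it.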
6. Route Traffic for a Specific IP Over a Specific Network Interface
We can use the longest matching prefix rule to force traffic for a specific IP to go over a specific network interface. We can do this by adding a static route to our IP with a prefix of /32 using the exit interface we want. The /32 prefix refers to a specific single IP address, and because it is the longest prefix, it will take precedence in the routing decision.
Let’s check this with an example:
$ ip route
192.41.134.0/24 dev eth0 proto kernel scope link src 192.41.134.11
192.41.134.8/29 dev eth1 proto kernel scope link src 192.41.134.10
So here we have two routes for subnets 192.41.134.0/24 and 192.41.134.8/29. The first subnet uses the eth0 interface, and the second subnet uses the eth1 interface.
There’s a destination IP address 192.41.134.13 on the network that is reachable over interface eth0. Let’s try to ping this destination IP:
$ ping 192.41.134.13
PING 192.41.134.13 (192.41.134.13) 56(84) bytes of data.
From 192.41.134.10 icmp_seq=1 Destination Host Unreachable
From 192.41.134.10 icmp_seq=2 Destination Host Unreachable
From 192.41.134.10 icmp_seq=3 Destination Host Unreachable
From 192.41.134.10 icmp_seq=4 Destination Host Unreachable
From 192.41.134.10 icmp_seq=5 Destination Host Unreachable
From 192.41.134.10 icmp_seq=6 Destination Host Unreachable
We can see here that our ping fails continuously. The problem is that eth1 has a subnet that overlaps with the one on eth0, and 192.41.134.13 falls inside both. Because the subnet on eth1 has the longer prefix, our traffic was sent out of eth1 instead of eth0.
Let’s add a specific route to our destination IP over eth0:
$ ip route add 192.41.134.13/32 dev eth0
So here, we’ve added a specific route to 192.41.134.13 but with a longer prefix of /32.
Let’s check our routing table now:
$ ip route
192.41.134.0/24 dev eth0 proto kernel scope link src 192.41.134.11
192.41.134.8/29 dev eth1 proto kernel scope link src 192.41.134.10
192.41.134.13 dev eth0 scope link
Now we can see there’s a separate specific route to our destination IP through eth0. The ip command prints a /32 route as a bare IP address without the prefix. Let’s try to ping it again:
$ ping 192.41.134.13
PING 192.41.134.13 (192.41.134.13) 56(84) bytes of data.
64 bytes from 192.41.134.13: icmp_seq=1 ttl=64 time=0.267 ms
64 bytes from 192.41.134.13: icmp_seq=2 ttl=64 time=0.215 ms
64 bytes from 192.41.134.13: icmp_seq=3 ttl=64 time=0.230 ms
64 bytes from 192.41.134.13: icmp_seq=4 ttl=64 time=0.246 ms
64 bytes from 192.41.134.13: icmp_seq=5 ttl=64 time=0.258 ms
So now our ping is working, and we can reach the destination IP. Let’s also check the traffic on the interface:
$ tcpdump -n -i eth0 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
15:55:44.410450 IP 192.41.134.11 > 192.41.134.13: ICMP echo request, id 9051, seq 523, length 64
15:55:44.410471 IP 192.41.134.13 > 192.41.134.11: ICMP echo reply, id 9051, seq 523, length 64
15:55:45.434489 IP 192.41.134.11 > 192.41.134.13: ICMP echo request, id 9051, seq 524, length 64
15:55:45.434518 IP 192.41.134.13 > 192.41.134.11: ICMP echo reply, id 9051, seq 524, length 64
15:55:46.458490 IP 192.41.134.11 > 192.41.134.13: ICMP echo request, id 9051, seq 525, length 64
15:55:46.458520 IP 192.41.134.13 > 192.41.134.11: ICMP echo reply, id 9051, seq 525, length 64
We can see here our ping requests and responses are going through the eth0 interface after we’ve set the specific route.
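The effect of the /32 route on the lookup can be modeled with the routing tables from this section. This is a simplified Python sketch that compares prefix lengths only; select_route, before, and after are hypothetical names introduced for illustration.

```python
import ipaddress

def select_route(dest_ip, routes):
    """Pick the matching (cidr, dev) route with the longest prefix, or None."""
    addr = ipaddress.ip_address(dest_ip)
    matching = [r for r in routes if addr in ipaddress.ip_network(r[0])]
    return max(matching, key=lambda r: ipaddress.ip_network(r[0]).prefixlen, default=None)

# Routing table before and after adding the host route.
before = [("192.41.134.0/24", "eth0"), ("192.41.134.8/29", "eth1")]
after = before + [("192.41.134.13/32", "eth0")]

print(select_route("192.41.134.13", before))  # ('192.41.134.8/29', 'eth1')
print(select_route("192.41.134.13", after))   # ('192.41.134.13/32', 'eth0')
print(select_route("192.41.134.12", after))   # ('192.41.134.8/29', 'eth1')
```

The last line shows that the host route affects only 192.41.134.13. Neighboring addresses in the /29 range still leave through eth1.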
7. Conclusion
In this article, we covered some core concepts in the IP routing process. We discussed the Linux routing table, how routing decisions work, and how to manipulate the routing table and forward traffic to a specific IP over a specific network interface.
The Linux routing table stores information about destination subnets on the network and how to reach them. It controls how the machine forwards network traffic to its destination. Routes are added to the routing table dynamically by a routing protocol, through static configuration, or automatically for the directly connected networks of the interfaces.
The kernel checks the destination IP of the traffic and searches the routing table for a matching route. If it finds one, it forwards the traffic over the corresponding network interface. If multiple routes match the destination IP, the kernel follows the longest matching prefix rule.
We can use this longest matching prefix rule to route the traffic over a specific network interface by adding a route to the destination with a prefix of /32, which takes precedence over any less specific route.