1. Overview
Network monitoring and troubleshooting are crucial skills for users and administrators. They help identify and resolve bottlenecks that can impact overall system performance and availability.
However, when the network scales and becomes more complex, troubleshooting bottlenecks and understanding performance behavior can become difficult. Luckily, we have some user-friendly tools that make it easier to detect and troubleshoot network issues in our environment.
In this tutorial, we’re going to discuss one such tool, tracert on Windows, and its Linux equivalent, traceroute. We’ll cover the basic concepts and usage, then we’ll dive into how they work under the hood.
2. What Is tracert?
tracert is a Windows command-line utility that detects the path a packet takes over the network to a specific destination. By the path of a packet, we mean the layer 3 devices, or hops, that the packet traverses on the way to its destination. This is useful for understanding the route that traffic takes to reach that destination.
tracert lists each individual hop that the packet crosses, and it also reports the round-trip time measured to each of these hops.
The tracert command is installed by default on most modern Windows systems. We can simply type tracert followed by the destination IP address or hostname:
> tracert google.com
Tracing route to google.com [172.217.171.206]
over a maximum of 30 hops:
1 4 ms 8 ms 4 ms 192.168.246.111
2 * * * Request timed out.
3 * * * Request timed out.
4 * * * Request timed out.
5 71 ms 28 ms 39 ms 41.206.148.146
6 * * * Request timed out.
7 232 ms 43 ms 40 ms host-81.10.87.42.tedata.net [81.10.87.42]
8 86 ms 66 ms 69 ms et-1-0-9-0.cr6-mrs1.ip4.gtt.net [154.14.74.77]
-------- OUTPUT TRIMMED --------
Trace complete
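Here we’ve used the command to check the path to the google.com hostname. Each row shows one hop that the packet traversed, along with three round-trip times, one for each probe sent to that hop. Some hops are counted but show only asterisks and "Request timed out". This means the device at that hop didn’t reply within the timeout, for example because it’s configured not to send ICMP replies, it rate-limits them, or a firewall filters them. It doesn’t necessarily indicate a fault, since later hops still respond.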
3. tracert Equivalent in Linux
Linux has a similar command, traceroute, which also tracks the path that a packet takes to its destination.
traceroute is almost identical in functionality to tracert. We can use it in the same way, and it displays the same kind of information. Besides the underlying operating system, the main difference is the default probe type: tracert sends ICMP echo requests, while traceroute on Linux sends UDP datagrams by default and switches to ICMP when we pass the -I option.
Unlike tracert, the traceroute command might not be installed by default on a lot of Linux distributions. So we need to verify first if it exists on our machine:
$ traceroute
-bash: traceroute: command not found
We can see here that the command is missing, so we can simply use our package manager to install it:
$ sudo apt install traceroute
Here we used the apt package manager, which is available on Debian-based distributions, to install traceroute. Let’s verify the installation again:
$ traceroute
Usage:
traceroute [ -46dFITnreAUDV ] [ -f first_ttl ] [ -g gate,... ] [ -i device ] [ -m max_ttl ] [ -N squeries ] [ -p port ] [ -t tos ] [ -l flow_label ] [ -w MAX,HERE,NEAR ] [ -q nqueries ] [ -s src_addr ] [ -z sendwait ] [ --fwmark=num ] host [ packetlen ]
Options:
-------- OUTPUT TRIMMED --------
Now we can see the command is working and showing the available options.
For the rest of this tutorial, we’ll be working with Linux traceroute, but the same concepts also apply to tracert, apart from the default probe type (ICMP for tracert, UDP for traceroute).
4. Using the traceroute Command
The traceroute command is very simple to use. Similarly to how we used tracert, we can type traceroute followed by the destination IP or hostname:
$ traceroute google.com
traceroute to google.com (172.217.212.139), 30 hops max, 60 byte packets
1 172.25.0.1 (172.25.0.1) 0.049 ms 0.009 ms 0.009 ms
2 * * *
3 72.14.236.212 (72.14.236.212) 1.542 ms 142.250.232.66 (142.250.232.66) 1.066 ms 172.253.75.242 (172.253.75.242) 1.568 ms
4 * 142.251.230.48 (142.251.230.48) 0.996 ms 192.178.44.147 (192.178.44.147) 1.077 ms
-------- OUTPUT TRIMMED --------
Here again, we used the command to check the path to the google.com hostname. The path is identified starting from our local gateway and going through each hop or router. For each hop, we can see the round-trip time of each probe, just like we saw in tracert. An asterisk marks a probe that got no reply within the timeout.
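Since the two tools take slightly different options, a script that has to run on both platforms needs to pick the right one. The following is a minimal sketch, assuming a hypothetical helper named build_trace_command. It only builds the argument list and doesn't run anything. It relies on the -h (maximum hops) and -d (no name resolution) options of tracert and the -m and -n options of traceroute:

```python
import platform


def build_trace_command(host, system=None, max_hops=30, numeric=False):
    """Return the argument list for the platform's route-tracing tool.

    `build_trace_command` is a hypothetical helper for illustration only.
    """
    # Fall back to the current OS name ("Windows", "Linux", "Darwin", ...)
    system = system or platform.system()
    if system == "Windows":
        cmd = ["tracert", "-h", str(max_hops)]
        if numeric:
            cmd.append("-d")  # don't resolve addresses to hostnames
    else:
        cmd = ["traceroute", "-m", str(max_hops)]
        if numeric:
            cmd.append("-n")  # don't resolve addresses to hostnames
    cmd.append(host)
    return cmd


print(build_trace_command("google.com", system="Linux", numeric=True))
print(build_trace_command("google.com", system="Windows"))
```

We could pass the resulting list to subprocess.run to execute the trace and capture its output.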
5. Understanding ICMP and TTL
Before we get into the details of how traceroute works, we need to understand two networking concepts that traceroute relies on, which are ICMP and TTL.
ICMP (Internet Control Message Protocol) is a network layer protocol used for diagnostics and error reporting, including testing reachability to a destination on the network. It is the underlying protocol that the ping command uses to verify connectivity. In its simplest form, a host sends an ICMP echo request to a destination IP and waits for an echo reply. If the reply arrives, the destination is reachable. If it doesn’t, there’s a problem somewhere along the path, or the destination isn’t answering. Routers also use ICMP to report errors back to the sender, such as a packet whose TTL has expired.
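To make the request message more concrete, here’s a minimal sketch that builds the bytes of an ICMP echo request (type 8, code 0) and computes the Internet checksum over it. The function names, identifier, and payload are illustrative assumptions. The sketch only constructs the message and doesn’t send it, since sending raw ICMP usually requires elevated privileges:

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request: type, code, checksum, identifier, sequence, payload."""
    # The checksum is computed with the checksum field set to zero
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


packet = build_echo_request(identifier=0x1234, sequence=1, payload=b"ping")
print(packet.hex())
print(internet_checksum(packet))  # 0 for a well-formed message
```

A receiver validates the message the same way: running the checksum over the whole message, including the checksum field, yields zero when nothing was corrupted.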
TTL, or Time-To-Live, on the other hand, is a field in the IP header that prevents packets from circulating forever in a routing loop. Sometimes a network misconfiguration or a design error causes packets to keep moving between the same routers. To avoid this type of infinite loop, the TTL field sets the maximum number of hops or routers the packet can traverse.
The sender sets the TTL to an initial value, and each router that forwards the packet decrements it by one. A router that decrements the TTL to zero drops the packet instead of forwarding it, and it typically notifies the sender with an ICMP Time Exceeded message. This way, packets can’t route through the network indefinitely.
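The decrement-and-drop behavior can be sketched with a small simulation. This is an illustrative model, not real packet forwarding: the forward function and the router names are assumptions, and the routing loop is modeled as an endless cycle between two routers:

```python
from itertools import cycle


def forward(ttl, routers):
    """Pass a packet through `routers` in order, decrementing the TTL at each one."""
    hops = 0
    for router in routers:
        ttl -= 1
        hops += 1
        if ttl <= 0:
            # This router brought the TTL to zero, so it drops the packet
            return {"delivered": False, "dropped_by": router, "hops": hops}
    return {"delivered": True, "dropped_by": None, "hops": hops}


# Enough TTL to get past all three routers
print(forward(4, ["R1", "R2", "R3"]))
# TTL too small: the second router drops the packet
print(forward(2, ["R1", "R2", "R3"]))
# A routing loop between R1 and R2 ends once the TTL runs out
print(forward(64, cycle(["R1", "R2"])))
```

Even though the last path never ends, the simulated packet is dropped after 64 hops, which is exactly the protection that the TTL provides.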
6. How traceroute Works Under the Hood
traceroute uses ICMP and TTL to detect the path of the packets. It sends a series of probes, starting with a TTL of 1 and incrementing the TTL each time. tracert sends ICMP echo requests as probes, while Linux traceroute sends UDP datagrams by default. Routers along the path that receive a probe decrement the TTL as normal. The router that brings the TTL to zero drops the probe and replies with an ICMP Time Exceeded message (the "TTL exceeded" message), which reveals that router’s IP address.
Let’s understand this with an example. Assume we have a destination server that we want to reach from our machine. The path between our machine and the server contains three routers, R1, R2, and R3. Now we want to execute a traceroute from our machine to the server to understand the path that the packet will traverse.
When we type traceroute and the server name, our machine will send a probe with the TTL set to 1. The probe is an ICMP echo request in tracert and, by default, a UDP datagram in Linux traceroute. The idea here is that our machine wants to know what the first hop in the path is. So, when this packet reaches R1, it will decrease the TTL by 1, and it will become zero. Now R1 will drop the probe and reply to our machine with a TTL exceeded message, and our machine will record the IP of R1.
Our machine will then send another probe, this time with a TTL of 2. When this probe reaches R1, it will decrement the TTL to 1, then it will send the packet to R2. When R2 receives the packet, it will decrement the TTL by 1, so it will become zero. Again, R2 will reply to our machine with a TTL exceeded message, and our machine will record the IP of R2.
This process will continue, discovering R3 with a TTL of 3, until our machine receives a reply from the server itself. With a TTL of 4, the probe reaches the server, which answers with an ICMP echo reply to an ICMP probe, or with an ICMP Port Unreachable message to a UDP probe. At that point, our machine will have figured out all the routers along the path to the server.
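The walkthrough above can be sketched as a short simulation. It doesn’t send real packets. The probe and simulate_traceroute functions, the message labels, and the path with R1, R2, R3, and the server are assumptions that mirror the example:

```python
def probe(ttl, path):
    """Send one simulated probe along `path`; return (responder, message)."""
    routers, destination = path[:-1], path[-1]
    for router in routers:
        ttl -= 1
        if ttl <= 0:
            # TTL expired here: the router drops the probe and reports back
            return router, "time-exceeded"
    # The probe survived every router and reached the destination
    return destination, "destination-reply"


def simulate_traceroute(path, max_hops=30):
    """Raise the TTL one step at a time until the destination answers."""
    discovered = []
    for ttl in range(1, max_hops + 1):
        responder, message = probe(ttl, path)
        discovered.append((ttl, responder, message))
        if message == "destination-reply":
            break
    return discovered


for hop in simulate_traceroute(["R1", "R2", "R3", "server"]):
    print(hop)
```

Each iteration reveals exactly one more device, and the loop stops as soon as the destination itself replies or the maximum hop count is reached.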
7. Understanding the Output of traceroute
Now that we’ve covered the inner workings of traceroute, let’s inspect its output. We’ll again use the google.com hostname for the example:
$ traceroute google.com
traceroute to google.com (74.125.132.113), 30 hops max, 60 byte packets
1 172.25.0.1 (172.25.0.1) 0.044 ms 0.013 ms 0.007 ms
2 142.250.232.56 (142.250.232.56) 0.932 ms 142.250.232.98 (142.250.232.98) 0.915 ms 142.250.232.56 (142.250.232.56) 0.885 ms
3 142.251.230.179 (142.251.230.179) 1.829 ms 142.251.230.48 (142.251.230.48) 1.017 ms 192.178.44.129 (192.178.44.129) 0.944 ms
4 172.253.77.134 (172.253.77.134) 2.982 ms 3.654 ms 209.85.241.230 (209.85.241.230) 3.815 ms
5 172.253.79.125 (172.253.79.125) 1.804 ms 209.85.242.159 (209.85.242.159) 2.293 ms 216.239.40.142 (216.239.40.142) 1.348 ms
-------- OUTPUT TRIMMED --------
Each line in the traceroute output provides three main pieces of information: the hop count, the IP address, and the latency. The hop count is a number that arranges each router in the order that the packet has traversed. Here we have a list arranged from 1 to 5 for the hop count.
Next to the hop count, we have the IP address. This identifies a device in the path that the packet crossed. For example, here, our packet has passed through 172.25.0.1 as the first device in the path. This hop usually indicates the default gateway of our machine.
We also notice that some lines are showing more than one IP address in a single hop. For example, the second hop shows 142.250.232.56 and 142.250.232.98 in the same line. This might happen if one of the hops has multiple paths to the destination, for instance, when it load-balances traffic across two different upstream routers. So each of the probes sent for that hop may have taken a different path along the way.
Finally, we have the latency. This is the round-trip time between our machine and this specific hop, measured once per probe, which is why each hop shows three values by default. This can be particularly useful when troubleshooting network performance issues. We can do a traceroute and check at which hop the latency rises sharply and stays high for the hops that follow. This helps isolate where the performance bottlenecks are introduced. A single hop with high latency followed by faster hops usually just means that the router gives low priority to generating ICMP replies.
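To work with these three pieces of information programmatically, we can parse each hop line. The following is a minimal sketch that assumes the default Linux output format shown above, where each address appears in parentheses. It wouldn’t handle output produced with the -n option or the tracert format. The parse_hop and summarize helpers are hypothetical names:

```python
import re

# Matches "(ip)", "<number> ms", or "*" for a probe that got no reply
TOKEN = re.compile(r"\(([^)]+)\)|(\d+(?:\.\d+)?) ms|(\*)")


def parse_hop(line):
    """Return (hop_number, [(ip, rtt_ms), ...]) for one traceroute output line."""
    hop_text, _, rest = line.strip().partition(" ")
    probes, current_ip = [], None
    for ip, rtt, _star in TOKEN.findall(rest):
        if ip:
            current_ip = ip  # following timings belong to this address
        elif rtt:
            probes.append((current_ip, float(rtt)))
        else:
            probes.append((None, None))  # lost probe
    return int(hop_text), probes


def summarize(line):
    """Collect the distinct addresses, best round-trip time, and lost probes."""
    hop, probes = parse_hop(line)
    rtts = [rtt for _, rtt in probes if rtt is not None]
    ips = sorted({ip for ip, _ in probes if ip is not None})
    return {
        "hop": hop,
        "ips": ips,
        "best_ms": min(rtts) if rtts else None,
        "lost": len(probes) - len(rtts),
    }


line = ("2 142.250.232.56 (142.250.232.56) 0.932 ms "
        "142.250.232.98 (142.250.232.98) 0.915 ms "
        "142.250.232.56 (142.250.232.56) 0.885 ms")
print(summarize(line))
print(summarize("2 * * *"))
```

A timing that isn’t preceded by a new address, as in the fourth hop of the output above, is attributed to the most recently seen address.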
8. Conclusion
In this article, we’ve covered how to use tracert and traceroute to track the path of a packet to a destination.
traceroute on Linux, like tracert on Windows, lists the hops or devices along the path that the packet traversed. It sends probes with increasing TTL values so that each device along the path replies with an ICMP TTL exceeded message. tracert uses ICMP echo requests as probes, while Linux traceroute uses UDP datagrams by default. The output provides information like the device IP address and round-trip latency. We can then use this information to troubleshoot network issues or understand performance behaviors.