1. Introduction
When troubleshooting high input/output utilization on Linux, we often need to identify files or directories responsible for the utilization.
In this tutorial, we’ll first explore some of the tools available on Linux for troubleshooting high input/output usage. Then, we’ll see which tools can help us pinpoint a specific high-I/O file.
2. iotop
iotop is a Linux utility that monitors input/output usage as reported by the Linux kernel and displays the information by process or thread. While it doesn’t show usage by file or directory, iotop can be useful to identify the process with the highest input/output utilization. After that, we can identify which files are opened by the process.
2.1. Installation
iotop is available in most distribution repositories. To install iotop on Debian-based distributions like Ubuntu and Linux Mint, we can use apt:
$ sudo apt-get install iotop
For RHEL, CentOS, Rocky Linux, AlmaLinux, and Fedora, we can use yum:
$ sudo yum install iotop
Next, let’s see how we can use iotop to view system input/output utilization.
2.2. Usage
At this point, we can run the interactive iotop interface:
$ sudo iotop
Total DISK READ : 2.45 M/s | Total DISK WRITE : 29.50 K/s
Actual DISK READ: 2.42 M/s | Actual DISK WRITE: 106.93 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
2600 be/6 root 2.26 M/s 0.00 B/s 0.00 % 81.11 % monarx-agent
429 be/3 root 0.00 B/s 3.69 K/s 0.00 % 1.06 % [jbd2/dm-0-8]
675 be/3 root 0.00 B/s 0.00 B/s 0.00 % 0.92 % auditd
2738 be/6 root 184.37 K/s 0.00 B/s 0.00 % 0.84 % monarx-agent
3989412 be/4 wazuh 0.00 B/s 0.00 B/s 0.00 % 0.74 % wazuh-agentd
674 be/3 root 0.00 B/s 11.06 K/s 0.00 % 0.00 % auditd
715 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % rsyslogd -n [in:imjournal]
1 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % systemd --switched-root --system --deserialize 22
2 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kthreadd]
iotop displays columns for the disk read and write bandwidth of each process or thread over the sampling period. In addition, the total I/O bandwidth read and written is displayed at the top of the interface.
By default, iotop sorts the results by the percentage of time each process has spent waiting on I/O operations (the IO column). This isn’t particularly useful to us here, since what we want is to sort the results by the DISK READ or DISK WRITE column. To change the sort column, we can use the left and right keyboard arrows. For example, to sort the results by DISK WRITE, we can hit the left arrow twice.
In addition, by default iotop shows the Thread ID (TID column). To switch to showing the process ID instead, we can hit p interactively or add the command-line option -P to achieve the same result.
Furthermore, we can ask iotop to show only processes and threads that are actively performing I/O operations by using the --only option:
$ sudo iotop -P --only
Total DISK READ : 0.00 B/s | Total DISK WRITE : 542.42 M/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 537.73 M/s
PID PRIO USER DISK READ DISK WRITE> COMMAND
29814 be/4 root 0.00 B/s 542.42 M/s vuls
25596 be/4 root 0.00 B/s 0.00 B/s [kworker/u2:1-writeback]
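As a side note, if we’d rather capture this information non-interactively, for example to save it for later analysis, iotop also offers a batch mode. Here’s a minimal sketch combining the options we’ve seen so far, assuming three samples are enough and that iotop.log is an arbitrary file name of our choosing:
$ sudo iotop -b -P --only -n 3 > iotop.log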
Once we have identified the process ID (PID) responsible for the highest input/output rate, we can use the next utility to identify files and directories opened by the process.
3. lsof
lsof is a Linux utility that lists information about files and directories opened by a process.
3.1. Installation
lsof is available in most distribution repositories. To install it on Debian-based distributions, we can use apt:
$ sudo apt-get install lsof
For RHEL, CentOS, and other RPM-based distributions, we can use yum:
$ sudo yum install lsof
Next, we’ll look into how to use lsof to identify files opened by a process.
3.2. Usage
Running lsof without any options lists all open files for all active processes. However, we can filter the list by using the -p option followed by a PID.
In our previous example, the PID for the process with the high disk write is 29814. Therefore, let’s use lsof to identify files opened by process 29814:
$ sudo lsof -p 29814
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
vuls 29814 root cwd DIR 8,1 142 16777345 /root
vuls 29814 root rtd DIR 8,1 239 128 /
vuls 29814 root txt REG 8,1 79832 50641304 /usr/bin/dd
vuls 29814 root mem REG 8,1 2586930 50332817 /usr/lib/locale/en_US.utf8/LC_COLLATE
vuls 29814 root mem REG 8,1 2093744 9902 /usr/lib64/libc-2.28.so
vuls 29814 root mem REG 8,1 1121744 9895 /usr/lib64/ld-2.28.so
vuls 29814 root mem REG 8,1 337024 33576656 /usr/lib/locale/C.utf8/LC_CTYPE
vuls 29814 root mem REG 8,1 54 50332820 /usr/lib/locale/en_US.utf8/LC_NUMERIC
vuls 29814 root mem REG 8,1 3316 9861 /usr/lib/locale/en_US.utf8/LC_TIME
vuls 29814 root mem REG 8,1 286 9883 /usr/lib/locale/en_US.utf8/LC_MONETARY
...
vuls 29814 root 0r CHR 1,5 0t0 9421 /dev/zero
vuls 29814 root 1w REG 8,1 5920464896 16777347 /root/largefile
vuls 29814 root 2u CHR 136,1 0t0 4 /dev/pts/1
To identify regular files in the above output, we can use the following hints:
- regular files have a digit (File Descriptor Number) followed by a single letter (mode) in the FD column
- regular files have REG under the TYPE column
The possible modes under which the file is open include:
- r for read access
- w for write access
- u for read and write access
Applying the above hints, we can see that the only regular file the process has open for writing (file descriptor 1w) is /root/largefile, so it’s the likely source of the high input/output.
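On a busier process with many open files, scanning this output by eye can be tedious. As a rough sketch, assuming the default lsof column layout shown above, we could filter for regular files opened for writing with awk:
$ sudo lsof -p 29814 | awk '$5 == "REG" && $4 ~ /^[0-9]+[wu]$/'
vuls 29814 root 1w REG 8,1 5920464896 16777347 /root/largefile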
While this method isn’t conclusive, since a process might have multiple files open, it at least narrows down the options to a handful of files. The next tool we’ll look at promises a more conclusive result.
4. sysdig
sysdig is an open-source universal system visibility tool. In addition, it comes with csysdig, a curses UI for sysdig.
4.1. Installation
To install sysdig automatically in one step, we can use curl:
$ curl -s https://download.sysdig.com/stable/install-sysdig | sudo bash
For other installation methods, we can consult the project Wiki.
4.2. Usage
sysdig’s chisels are small scripts that analyze the sysdig event stream and perform useful actions. For example, one of the chisels that fits our needs is topfiles_bytes, which shows the top files in terms of bytes read from and written to disk.
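If we’re unsure which chisels are available on our system, we can list them with the -cl option:
$ sudo sysdig -cl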
Let’s run sysdig specifying the chisel topfiles_bytes using the -c command option:
$ sudo sysdig -c topfiles_bytes
Bytes Filename
--------------------------------------------------------------------------------
178.13M /dev/zero
178.12M /root/largefile
233B /dev/ptmx
Notably, sysdig provides this information system-wide, i.e., we don’t need to identify a process first and then look through open files.
Chisels can be combined with filters, which usually makes them much more useful. For example, let’s say we’re not interested in I/O involving /dev. If so, we can filter it out with something like this:
$ sudo sysdig -c topfiles_bytes "not fd.name contains /dev"
Bytes Filename
--------------------------------------------------------------------------------
268.97M /root/largefile
In addition, we can use the fd.name filter to restrict the results to a certain path or directory:
$ sudo sysdig -c topfiles_bytes "fd.name contains /root"
Bytes Filename
--------------------------------------------------------------------------------
268.97M /root/largefile
Furthermore, we can use the proc.name filter to restrict the results to a specific process name:
$ sudo sysdig -c topfiles_bytes "proc.name=vuls"
Bytes Filename
--------------------------------------------------------------------------------
183.39M /dev/zero
183.39M /root/largefile
This way, we can check individual suspects and gather data on particular files or paths.
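These filters can also be combined with boolean operators. For instance, a sketch (output omitted, since it depends on the workload) that watches only the vuls process while excluding anything under /dev might look like this:
$ sudo sysdig -c topfiles_bytes "proc.name=vuls and not fd.name contains /dev"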
5. Conclusion
In this article, we explored different ways to troubleshoot high input/output utilization in Linux and identify the specific files responsible for that high input/output. The first method used a combination of iotop and lsof but was less conclusive. Meanwhile, the second method used sysdig and provided more conclusive results.