1. Introduction
Sometimes when we try to access a file, we get a message saying that the file is busy. This means that a running process is using the file and keeping it open for reading or writing. When this happens, we'll often want to find out which process that is.
In this tutorial, we’ll look at how to find the process that is using a file.
2. Methods and Commands to Find the Process
A couple of commands can help us find the processes that operate on a file, so we'll start there. These commands gather their data from the Linux kernel, which manages processes and file systems, among other things. Additionally, we'll read the kernel's process information directly to get the same data.
2.1. The fuser Command
Let’s start with the fuser command that lists processes using files or sockets. It can also be used to kill a process. We can use it with the -v parameter to get a verbose output:
$ fuser -v text.txt
USER PID ACCESS COMMAND
/home/john/text.txt:
john 22829 f.... less
As we can see, in this case, the less process is accessing the file. The fuser command returns the PID, the user who owns the process, and the type of access. Here, the f in the ACCESS column means that the process has the file open.
Running the command with the -k option kills the processes it finds, sending SIGKILL by default. Let's try it on a less process, which has the PID 24815 in this run:
$ fuser -k text.txt
/home/john/text.txt: 24815
Now let's say the same file is open in vi. When we run fuser on it, nothing is returned because vi opens the file, reads its content into memory, and closes it. Since no process holds the file open anymore, the kernel has nothing to report.
However, we can still try to find the process by inspecting the output of fuser -cv text.txt. The -c option lists all processes that are accessing files on the same file system as text.txt. In this case, the last line of the output is the process we're looking for, but that won't always be the case:
$ fuser -cv text.txt
USER PID ACCESS COMMAND
/home/john/text.txt:
root kernel mount /home
...
john 24807 F.c.. vi
Running the command with the -k option kills every process it lists. Combined with -c, that means every process using the file system, so we should use it with care.
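Before killing anything, it can help to see which commands are behind the PIDs. The following is a minimal sketch, assuming the psmisc version of fuser, which prints the PIDs on stdout and everything else on stderr. The name show_file_users is a hypothetical helper, not part of any package:

```shell
#!/usr/bin/env bash
# show_file_users is a hypothetical helper name, not part of psmisc.
show_file_users() {
    local file=$1 pid
    # fuser prints the PIDs on stdout; the file name and access letters go to stderr.
    for pid in $(fuser "$file" 2>/dev/null); do
        # Print "PID COMMAND" for each reported process.
        ps -o pid=,comm= -p "$pid"
    done
}
```

Once we're sure about the target, psmisc's fuser also accepts a signal name and the -i option, so fuser -k -TERM -i text.txt asks for confirmation and sends SIGTERM instead of SIGKILL.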
2.2. The lsof Command
The lsof command can return a list of open files. To narrow down the results and keep the heading line, we’ll use it with the head and grep commands. Assuming vi is still running, let’s give it a try:
$ lsof | { head -1 ; grep text.txt ; }
COMMAND PID TID TASKCMD USER FD TYPE DEVICE SIZE/OFF NODE NAME
vi 24807 john 4u REG 8,3 12288 3147621 /home/john/.text.txt.swp
The lsof command returns the process name, the PID, and the user who is running the process. If the process has threads, we’ll see their identification number, TID, with the task command. The FD field can have three parts: file descriptor (4 in our case) is the first, a mode character is the second (u means the file is accessible for reading and writing), and a lock character is the third.
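To make the three parts of the FD field concrete, here is a small sketch that splits a value such as 4u into its pieces. The function name parse_fd is hypothetical, and the pattern only covers the numeric form described above. Other values that lsof can print in this column, such as cwd or txt, are reported as special:

```shell
#!/usr/bin/env bash
# parse_fd is a hypothetical helper that splits an lsof FD value.
parse_fd() {
    local fd=$1
    # Digits, then an optional mode character, then an optional lock character.
    if [[ $fd =~ ^([0-9]+)([rwu-]?)(.?)$ ]]; then
        echo "descriptor=${BASH_REMATCH[1]} mode=${BASH_REMATCH[2]:-none} lock=${BASH_REMATCH[3]:-none}"
    else
        # Non-numeric values such as cwd, txt, or mem.
        echo "special=$fd"
    fi
}

parse_fd 4u
```

For the vi example, this prints descriptor=4 mode=u lock=none.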
Let’s look at the output when the less command is accessing the file, instead of vi:
$ lsof | { head -1 ; grep text.txt ; }
COMMAND PID TID TASKCMD USER FD TYPE DEVICE SIZE/OFF NODE NAME
less 28423 john 4r REG 8,3 75 3146117 /home/john/text.txt
In this case, we see that the file is opened for reading, with FD = 4r.
Just as with fuser, however, if we're editing the file with vi, lsof won't show text.txt itself as in use. The match in the earlier vi example came from the swap file, .text.txt.swp, which vi keeps open.
The command has plenty of options. For instance, the -t option prints only process identifiers without a header, which makes it helpful in scripts.
lsof is very similar to fuser, except it can’t kill the processes. However, because lsof also gives us the PID, we can join it with the kill command:
$ kill -TERM $(lsof -t text.txt)
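If no process has the file open, lsof -t prints nothing and kill is called without a PID, which fails with a usage error. A slightly more defensive sketch checks for that case first. The name term_file_users is a hypothetical helper:

```shell
#!/usr/bin/env bash
# term_file_users is a hypothetical helper name.
term_file_users() {
    local file=$1 pids
    # lsof -t prints one PID per line and nothing when the file is not open.
    pids=$(lsof -t "$file" 2>/dev/null)
    if [ -z "$pids" ]; then
        echo "no process has $file open" >&2
        return 1
    fi
    # Intentionally unquoted so that each PID becomes a separate argument.
    kill -TERM $pids
}
```

Calling term_file_users text.txt then either signals the processes or reports that there is nothing to signal.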
To discover the PIDs of all processes using files in a directory and below, we can scan it recursively with the +D option. This can be slow on large directory trees.
Additionally, lsof is frequently used to find all files opened by a process with a given PID:
$ lsof -p <PID>
2.3. Getting Information From the Kernel Directly
Another way to find the process that is using a file is to read the kernel's data directly. The kernel exposes this data under /proc. Information about each process is in the directory /proc/<PID>. Its fd subdirectory contains one entry for every file the process has open, named after the file descriptor and linked to the actual file.
Therefore, we only need to use the ls command:
$ ls -l /proc/*/fd
Let's improve these results with a script that prints each PID together with the files it has open:
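To see the link between a descriptor and a file for a single process, we can open a file in the current shell and resolve its entry. This is only an illustration. The name fd_target is a hypothetical helper and the scratch file comes from mktemp:

```shell
#!/usr/bin/env bash
# fd_target is a hypothetical helper name used only for this illustration.
fd_target() {
    local pid=$1 fd=$2
    # Each entry in /proc/<PID>/fd is a symlink named after the descriptor number.
    readlink "/proc/$pid/fd/$fd"
}

# Open a scratch file on descriptor 3 of this shell, then look it up.
tmp=$(mktemp)
exec 3<"$tmp"
fd_target "$$" 3
exec 3<&-
rm -f "$tmp"
```

The output is the path of the scratch file, which is the same information that ls -l shows after the arrow for each entry.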
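#!/usr/bin/env bash
# Print every open file of every process, then keep the lines matching $1.
for pid in /proc/[0-9]*; do
    i=$(basename "$pid")
    for file in "$pid"/fd/*; do
        # Resolve the descriptor link; the result is empty for sockets and pipes.
        link=$(readlink -e "$file")
        if [ -n "$link" ]; then
            echo "PID $i: $link"
        fi
    done
done | grep -- "$1"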
For each numeric directory under /proc, the script walks the fd subdirectory and resolves every link with readlink. If a link resolves to an existing file, we print the PID and the path. Finally, grep keeps only the lines that match the file name passed as the first argument.
If we save this script as mylsof and make it executable (chmod u+x mylsof), we can use it to solve our problem:
$ ./mylsof text.txt
PID 30069: /home/john/text.txt
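Reading /proc needs no extra tools, so it can help when neither lsof nor fuser is installed, for example on minimal BusyBox systems. The script as written does need bash and readlink, though, and without root privileges, we can only read the fd directories of our own processes.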
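Because mylsof filters with grep, it also matches any other path that merely contains the given name. One possible refinement, sketched here under the assumption that bash is available, compares each descriptor with the target using the -ef test, which is true when both paths refer to the same device and inode. The name who_has_open is a hypothetical helper:

```shell
#!/usr/bin/env bash
# who_has_open is a hypothetical helper name.
who_has_open() {
    local target=$1 fd pid
    for fd in /proc/[0-9]*/fd/*; do
        # -ef follows the descriptor link and compares device and inode numbers.
        if [ "$fd" -ef "$target" ]; then
            pid=${fd#/proc/}      # strip the leading /proc/
            echo "${pid%%/*}"     # keep only the PID component
        fi
    done 2>/dev/null | sort -un
}
```

Running who_has_open text.txt prints one PID per line. Since it compares inodes rather than names, it still finds the file when a process opened it through a different path, such as a hard link.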
3. Conclusion
In this article, we've looked at finding a process that is accessing a file. We started with the fuser command, and then we looked at using lsof. Finally, we read the kernel's data under /proc directly to get the same information.