1. Introduction
In this tutorial, we’ll explore several Linux tools to monitor disk space.
2. Filesystems and Mounts
First, it’s important to understand the difference between filesystems and mounts.
When we talk about filesystems, we usually mean a partition of a disk that has been formatted so the operating system can store files on it. Filesystems can also be network-based, held in memory, located on removable media such as USB drives, or set up as special temporary spaces.
If we are able to access a filesystem on an operating system, we say the filesystem is “mounted”, or available in a directory. The directory a filesystem is mounted onto is called its “mount point”.
The “root” directory is the base directory of the operating system and is located at /. Therefore, a Linux system always has at least one filesystem mounted at /.
In general, we talk about all directories in the filesystem relative to the root directory.
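To make the idea of mounts concrete, here is a minimal sketch that lists what is currently mounted and looks up the mount point that holds the root directory. It assumes a Linux system where /proc/mounts is readable, and the variable name root_mount is only illustrative:

```shell
# List device, mount point, and filesystem type for every mounted filesystem.
# /proc/mounts is a Linux-specific kernel interface, so we check it exists first.
if [ -r /proc/mounts ]; then
    awk '{ printf "%-20s %-25s %s\n", $1, $2, $3 }' /proc/mounts
fi

# Ask df which filesystem holds a given path (here the root directory).
# -P selects the portable one-line-per-filesystem format; the last field
# of the second line is the mount point.
root_mount=$(df -P / | awk 'NR == 2 { print $NF }')
echo "The root directory lives on the filesystem mounted at: $root_mount"
```

Passing a different path to df, such as a home directory, shows which mounted filesystem that path belongs to.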
3. df – Filesystem Usage
df is the command we’ll use to look at filesystem usage.
Let’s try running this in a shell:
user@host:~$ df
Filesystem 1K-blocks Used Available Use% Mounted on
udev 1005792 0 1005792 0% /dev
tmpfs 204824 22140 182684 11% /run
/dev/sda 24543644 15646380 7635696 68% /
tmpfs 1024108 0 1024108 0% /dev/shm
tmpfs 5120 0 5120 0% /run/lock
tmpfs 1024108 0 1024108 0% /sys/fs/cgroup
tmpfs 204824 0 204824 0% /run/user/1000
We can see from the column headers what each value means. By default, sizes are shown in 1-kilobyte blocks, and the scale of the size columns changes depending on the arguments that we pass to df.
For example, let’s try df -h. The -h will display the output in human-readable format in powers of 1024:
Filesystem Size Used Avail Use% Mounted on
udev 983M 0 983M 0% /dev
tmpfs 201M 22M 179M 11% /run
/dev/sda 24G 15G 7.4G 67% /
tmpfs 1001M 0 1001M 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 1001M 0 1001M 0% /sys/fs/cgroup
tmpfs 201M 0 201M 0% /run/user/1000
We see in the output above that the sizes now carry a unit suffix (K for kilobytes, M for megabytes, G for gigabytes, and so on), which keeps the numbers short and easy to compare.
df gives us a number of options, so we can refer to the man page to see what is available.
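Since the goal is monitoring, one possible use of df is a small check that reports filesystems above a usage threshold. The following is a sketch, not a complete monitoring solution: the function name check_usage and the 90 percent threshold are assumptions chosen for illustration, and mount points containing spaces are not handled:

```shell
#!/bin/sh
# Report every filesystem whose usage is at or above a given percentage.
# df -P prints one line per filesystem: device, blocks, used, available,
# capacity (for example "68%"), and mount point.
check_usage() {
    df -P | awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # strip the percent sign
        if (use + 0 >= limit)
            printf "%s is %s%% full (%s)\n", $6, use, $1
    }'
}

threshold=90   # hypothetical limit; adjust as needed
check_usage "$threshold"
```

If nothing is printed, no filesystem has reached the threshold. A script like this could be run periodically, for example from cron.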
4. du – Directory Usage
du is the command we’ll use to look at directory sizes.
Let’s imagine our current working directory contains three directories and a file:
$ ls -l
total 20
drwxr-xr-x 2 mike sudo 4096 Nov 3 20:15 one
-rw-r--r-- 1 mike sudo 6 Nov 4 12:37 test.txt
drwxr-xr-x 2 mike sudo 4096 Nov 3 20:16 three
drwxr-xr-x 2 mike sudo 4096 Nov 3 20:15 two
With ls, we can see each directory size is 4096 bytes, which is the space used by the directory entry itself and not the space used by the directory’s contents.
Next, we’ll use du to see the space used by each directory together with its contents. By default, sizes are reported in 1-kilobyte blocks:
$ du
1024008 ./three
122888 ./two
10244 ./one
1157144 .
Like df, du also supports -h for human-readable formatting:
$ du -h
1001M ./three
121M ./two
11M ./one
1.2G .
du also gives us a number of other options not covered here.
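As one example of those options, we can limit the depth of the report and sort the result so the largest directories come first. This is a minimal sketch: the function name largest_dirs is made up for illustration, and it assumes a du that accepts -d (as the GNU and BSD versions do):

```shell
# Show the size of a directory and its immediate subdirectories, largest first.
# -k reports sizes in 1-kilobyte units so a plain numeric sort works.
# -d 1 limits the report to one level below the starting directory.
largest_dirs() {
    du -k -d 1 "${1:-.}" | sort -rn
}

largest_dirs .
```

The first line is the total for the starting directory itself, followed by its subdirectories in descending order of size.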
5. Finding Large Files
We’ve seen how to use df and du to get summaries of how much space is used on our system.
However, when it’s time to free up some space on our machine, we need to know specifically which files are taking up the space. This is where we can make use of the ls and find commands.
5.1. Example Files
First, let’s look at the directory tree that we are using in the following examples:
Size Path and File
---- ------------------------------
6 ./test.txt
1000M ./three/output4.dat
10M ./two/output2.dat
100M ./two/two-a/more/output3.dat
10M ./two/output1.dat
10M ./one/output.dat
5.2. Using ls
As we saw above, we can use ls to view the contents of a directory, and if we use the -l option, we can see the file sizes in that directory. This is useful, but can quickly become cumbersome if there are a lot of directories to look into. Let’s look at using ls -lh to find large files in the three directories in our example:
$ ls -lh one/
total 10M
-rw-r--r-- 1 mike sudo 10M Nov 3 20:15 output.dat
We can see one 10-megabyte file in the directory one.
Now let’s look at directory two, which is using 121 megabytes according to du:
$ ls -lh two
total 21M
-rw-r--r-- 1 mike sudo 10M Nov 3 20:15 output1.dat
-rw-r--r-- 1 mike sudo 10M Nov 3 20:15 output2.dat
drwxr-xr-x 3 mike sudo 4.0K Nov 4 13:29 two-a
We can see two 10-megabyte files in two, plus a sub-directory called two-a. We are still looking for 100 megabytes of files.
Let’s look at two-a to see what is in there:
$ ls -lh two/two-a/
total 4.0K
drwxr-xr-x 2 mike sudo 4.0K Nov 4 13:30 more
There is another directory in two-a, and we still have not found the missing 100 megabytes.
It would be nice if we had a recursive tool to find files and print their sizes!
5.3. Using find
The find command lets us see all files in a directory tree along with their sizes.
To print all files and sizes under a directory, we can do something like this:
$ find . -type f -exec ls -lh {} \;
-rw-r--r-- 1 mike sudo 6 Nov 4 12:37 ./test.txt
-rw-r--r-- 1 mike sudo 1000M Nov 3 20:16 ./three/output4.dat
-rw-r--r-- 1 mike sudo 10M Nov 3 20:15 ./two/output2.dat
-rw-r--r-- 1 mike sudo 100M Nov 3 20:15 ./two/two-a/more/output3.dat
-rw-r--r-- 1 mike sudo 10M Nov 3 20:15 ./two/output1.dat
-rw-r--r-- 1 mike sudo 10M Nov 3 20:15 ./one/output.dat
This shows us each file in the directory tree and its size. Of course, there are many ways to do this with find, but this simple one prints all files and their sizes.
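One variation worth knowing: ending -exec with \; starts a separate ls process for every file, while ending it with + passes many file names to a single ls. A sketch of the batched form, wrapped in an illustrative function named list_file_sizes:

```shell
# List every regular file under a directory together with its size.
# "{} +" hands many paths to one ls invocation instead of running ls per file.
list_file_sizes() {
    find "${1:-.}" -type f -exec ls -lh {} +
}

list_file_sizes .
```

The output has the same format as before, but on large directory trees the batched form usually finishes noticeably faster.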
Let’s say we wanted to only see files that have a size greater than 20 megabytes with find:
$ find . -type f -size +20M -exec ls -lh {} \;
-rw-r--r-- 1 mike sudo 1000M Nov 3 20:16 ./three/output4.dat
-rw-r--r-- 1 mike sudo 100M Nov 3 20:15 ./two/two-a/more/output3.dat
find is a very powerful tool, so we can check the man page if we need to figure out more options.
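When we don’t know a good size cutoff in advance, a common approach is to combine find with du and sort to list the largest files first. The following is a sketch under that assumption; the function name largest_files and the default of five results are illustrative choices:

```shell
# Print the N largest regular files under a directory, biggest first.
# du -k reports each file's disk usage in 1-kilobyte units,
# sort -rn orders the lines numerically in descending order,
# and head keeps only the first N lines.
largest_files() {
    find "${1:-.}" -type f -exec du -k {} + | sort -rn | head -n "${2:-5}"
}

largest_files . 5
```

Each output line shows the size in kilobytes followed by the path, so the files most worth cleaning up appear at the top.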
6. Conclusion
In this tutorial, we have seen how to check disk space for both filesystems and files or directories. We also looked at a couple of utilities to find some details about files that are taking up space.