1. Overview

Linux uses a disk cache to speed up file access. Since reading from memory is faster than reading from the hard disk, keeping a file’s content in RAM improves performance. Although the kernel handles this task well on its own, we can also step in ourselves.

In this tutorial, we’ll learn how to check whether a file is cached. Moreover, we’ll see how to load a file into memory, evict it, or lock it there on demand.

2. The Disk Cache

The kernel usually caches a file the first time it’s used, on the assumption that more reads from the same file will follow soon. So, let’s check it in a very simple way with a test_file of around 2 GB:

$ time cat test_file > /dev/null

real    0m18,873s
user    0m0,030s
sys    0m1,395s

Now, let’s immediately repeat the command:

$ time cat test_file > /dev/null

real    0m0,333s
user    0m0,004s
sys    0m0,328s

Thanks to the kernel’s caching of the file during the first use, the second operation is significantly faster. Of course, we don’t need to repeat the same command. Any subsequent access to this file will be quicker. However, we can’t control when the kernel removes the file from memory.

Now, let’s learn the amount of RAM used to cache files from the Cached entry in the /proc/meminfo file. So, let’s check this figure before caching test_file:

$ grep ^Cached /proc/meminfo
Cached:           997396 kB

Then, after running cat, the cached amount increases by roughly the file’s size:

$ grep ^Cached /proc/meminfo
Cached:          2985868 kB
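
To measure the change instead of comparing two printouts by eye, we can read the Cached value before and after the read and subtract. The following is a minimal sketch; the cached_kb helper name and its optional file argument are our own choices for illustration:

```shell
#!/bin/sh
# Print the Cached value in kB from a meminfo-style file
# (defaults to /proc/meminfo; the argument exists only to ease testing).
cached_kb() {
    awk '/^Cached:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Difference between the two readings shown above, in kB and MiB.
before=997396
after=2985868
delta=$((after - before))
echo "delta: ${delta} kB (~$((delta / 1024)) MiB)"
```

On a live system, we’d call cached_kb once before cat test_file > /dev/null and once after. Other processes also change the cache in the meantime, so the difference is only approximate.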

Finally, we can ask the kernel to drop the page cache system-wide, without pointing to a particular file:

$ echo 1 | sudo tee /proc/sys/vm/drop_caches

3. The vmtouch Command

Let’s use vmtouch to monitor and control how files are cached. With it, we can check how much of a file is already cached, load a file into memory or evict it, and lock a file in memory.

On Ubuntu, this utility comes in a package of the same name. Let’s run it without arguments to check the version and basic syntax:

$ vmtouch
vmtouch: no files or directories specified

vmtouch v1.3.1 - the Virtual Memory Toucher by Doug Hoyte
Portable file system cache diagnostics and control

Usage: vmtouch [OPTIONS] ... FILES OR DIRECTORIES ...

# ...

4. How Much of the File Is in the Memory?

Let’s return to our big test_file and check if it’s cached:

$ vmtouch test_file
           Files: 1
     Directories: 0
  Resident Pages: 0/1970848  0/7G  0%
         Elapsed: 0.044178 seconds

The command first reports the number of files and directories it checked. Then, the Resident Pages field shows how many of the file’s pages are in memory out of the total number of pages. Next comes the same information expressed as sizes, followed by the percentage. Finally, the command reports its execution time.

Now, let’s read just the end of this file with tail:
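
Since Resident Pages counts memory pages, converting it to a size requires the page size. Here’s a small sketch of the arithmetic; the pages_to_mib name is hypothetical, and the 4096-byte fallback is an assumption for systems where getconf isn’t available:

```shell
#!/bin/sh
# Convert a page count to whole MiB (rounded down).
# $1 = number of pages, $2 = page size in bytes (defaults to the system's).
pages_to_mib() {
    pagesize=${2:-$(getconf PAGESIZE 2>/dev/null || echo 4096)}
    echo $(( $1 * pagesize / 1048576 ))
}

# Total page count reported for test_file above, assuming 4 KiB pages.
pages_to_mib 1970848 4096
```

With 4 KiB pages, the 1970848 pages in the report come to 7698 MiB, which is consistent with the 7G shown next to the page count.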

$ tail -n 100000 test_file > /dev/null

Then, let’s check how much of this file the kernel has cached:

$ vmtouch test_file
           Files: 1
     Directories: 0
  Resident Pages: 6416/1970848  25M/7G  0.326%
         Elapsed: 0.047087 seconds

Finally, let’s print the whole file with cat and check it again:

$ cat test_file > /dev/null
$ vmtouch test_file
           Files: 1
     Directories: 0
  Resident Pages: 1970848/1970848  7G/7G  100%
         Elapsed: 0.10013 seconds
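
In scripts, we may want only the percentage from this report. The following minimal sketch extracts it from the Resident Pages line, assuming the output layout shown above; the resident_percent name is our own:

```shell
#!/bin/sh
# Read a vmtouch report on stdin and print the resident percentage
# without the trailing % sign.
resident_percent() {
    awk '/Resident Pages:/ { sub(/%$/, "", $NF); print $NF }'
}

# Sample report copied from the partially cached run above.
resident_percent <<'EOF'
           Files: 1
     Directories: 0
  Resident Pages: 6416/1970848  25M/7G  0.326%
         Elapsed: 0.047087 seconds
EOF
```

With vmtouch installed, piping vmtouch test_file into resident_percent should give the same figure for a real file.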

5. Caching and Uncaching Files Manually

Next, instead of tricking the kernel with tail or cat, let’s cache the file directly with the t option. In addition, we’re going to turn on the verbose mode with v:

$ vmtouch -vt test_file
test_file
[o                                                           ] 7937/492712
[OOOOOOOOOOOOOOOo                                            ] 129281/492712
[OOOOOOOOOOOOOOOOOOOOOOOOOo                                  ] 210049/492712
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo                      ] 311137/492712
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo            ] 388929/492712
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo     ] 449601/492712
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 492712/492712

           Files: 1
     Directories: 0
   Touched Pages: 492712 (1G)
         Elapsed: 17.382 seconds

We captured these snapshots of the progress bar by repeatedly pressing Enter while the command was running.
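
Touching a file that’s larger than the available memory can push other cached data out. As a rough guard, we might compare the file size with the MemAvailable entry of /proc/meminfo first. This is only a sketch: the fits_in_memory name and its optional second argument are our own, and MemAvailable is assumed to be present, which isn’t the case on older kernels:

```shell
#!/bin/sh
# Succeed if the file's size is below MemAvailable.
# $1 = file to check, $2 = meminfo-style file (defaults to /proc/meminfo).
# Fails if the MemAvailable entry is missing.
fits_in_memory() {
    size_kb=$(( $(wc -c < "$1") / 1024 ))
    avail_kb=$(awk '/^MemAvailable:/ { print $2 }' "${2:-/proc/meminfo}")
    [ "$size_kb" -lt "$avail_kb" ] 2>/dev/null
}
```

For instance, fits_in_memory test_file && vmtouch -t test_file touches the file only when the check passes.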

Finally, let’s evict the file from the cache with the e option:

$ vmtouch -ev test_file
Evicting test_file

           Files: 1
     Directories: 0
   Evicted Pages: 1970848 (7G)
         Elapsed: 0.53841 seconds

Note that we can evict any cached file this way, not only one that vmtouch itself loaded.

6. Locking a File in Memory

Now let’s force the file to stay in the memory. So, we need to use the l option to lock the file:

$ vmtouch -vtl test_file
test_file
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 492712/492712
LOCKED 492712 pages (1G)

Let’s notice that now the command keeps running and blocks the terminal. Thus, let’s open another one and check the file:

$ vmtouch test_file
           Files: 1
     Directories: 0
  Resident Pages: 492712/492712  1G/1G  100%
         Elapsed: 0.040164 seconds

Now, we can’t drop the file from memory until we stop the program by pressing Ctrl-C in the terminal. Thus, we can neither evict it with vmtouch nor drop it by clearing the cache system-wide.

To make unlocking easier, we can use the P option with the name of a file that will contain the PID of the vmtouch process:

$ vmtouch -tl -P PID_vmtouch test_file &

After the file is cached, we can kill the process and evict the file:

$ kill "$(cat PID_vmtouch)"
$ vmtouch -e test_file
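
If the PID file is missing or stale, kill receives an empty or wrong argument. Here’s a slightly more defensive sketch of the same step; the stop_from_pidfile function name is hypothetical, while PID_vmtouch is the file from the example above:

```shell
#!/bin/sh
# Terminate the process whose PID is stored in the given file.
# Returns non-zero if the file is unreadable or the process is gone.
stop_from_pidfile() {
    [ -r "$1" ] || { echo "no PID file: $1" >&2; return 1; }
    pid=$(cat "$1")
    # kill -0 sends no signal; it only tests that the process exists
    # and that we are allowed to signal it.
    kill -0 "$pid" 2>/dev/null || { echo "not running: $pid" >&2; return 1; }
    kill "$pid"
}
```

We could then run stop_from_pidfile PID_vmtouch && vmtouch -e test_file to unlock and evict the file in one go.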

6.1. Daemonizing vmtouch

In addition to locking, we can move vmtouch into the background to act as a daemon. This way, we keep a file or files in memory for further use. So, let’s use the d option:

$ vmtouch -tld -P PID_vmtouch test_file

Of course, we eventually need the process ID to stop the daemon, which is why we keep the P option here.

7. Crawling Through Directories

With vmtouch, we can recursively traverse directories and perform tasks on the files we encounter. So, let’s check if the manual pages are cached:

$ sudo bash -c "vmtouch -f /usr/share/man"
# some warnings

           Files: 7843
     Directories: 121
  Resident Pages: 11413/11413  44M/44M  100%
         Elapsed: 0.074808 seconds

Here, the f option makes vmtouch follow symbolic links. The output shows that all manuals are in memory now. Next, let’s evict them:

$ sudo bash -c "vmtouch -fe /usr/share/man"

Then, let’s check again:

$ sudo bash -c "vmtouch -f /usr/share/man"
# ...
           Files: 7843
     Directories: 121
  Resident Pages: 0/11413  0/44M  0%
         Elapsed: 0.074969 seconds

Now let’s open some manuals, e.g., man sudo and man less, and then check one more time:

$ sudo bash -c "vmtouch -fv /usr/share/man"
# ...

/usr/share/man/man1/less.1.gz
[OOOOOO] 6/6
[OOOOOO] 6/6

# ...

/usr/share/man/man8/sudoedit.8.gz
[OOO] 3/3
[OOO] 3/3

# ...

           Files: 7843
     Directories: 121
  Resident Pages: 18/11413  72K/44M  0.158%

So, now the kernel is using 6 pages for the less manual, 3 pages for sudo, and 9 for other files.
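
With thousands of files, finding the few cached ones in the verbose output by eye is tedious. The filter below is a sketch that prints only files with at least one resident page. It assumes the verbose layout shown above, where each path is followed by one or more progress-bar lines ending in resident/total; the resident_files name is our own:

```shell
#!/bin/sh
# Read `vmtouch -v` output on stdin and print each file that has at
# least one resident page, followed by its resident/total page counts.
resident_files() {
    awk '
        /^\[/ {                        # progress-bar line, e.g. "[OOO] 3/3"
            split($NF, p, "/")
            if (p[1] + 0 > 0 && !(file in seen)) {
                print file, $NF
                seen[file] = 1         # report each file only once
            }
            next
        }
        NF { sub(/[ \t]+$/, ""); file = $0 }   # remember the latest path
    '
}

# Sample lines copied from the output above.
resident_files <<'EOF'
/usr/share/man/man1/less.1.gz
[OOOOOO] 6/6
[OOOOOO] 6/6
/usr/share/man/man8/sudoedit.8.gz
[OOO] 3/3
[OOO] 3/3
EOF
```

On a real system, we’d feed it the output of the vmtouch -fv command from above.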

8. The fincore Command

Now let’s use the fincore command from the util-linux package to find out how much of a file is in memory. Let’s check all files in the current folder:

$ fincore *
  RES  PAGES   SIZE FILE
   0B      0   1,9G test_file
 840K    210   1,9G test_file1
 1,3G 342232   1,9G test_file2
 1,9G 492712   1,9G test_file3
 840K    210   1,9G test_file4
 2,7M    698   1,9G test_file5

For this example, we prepared a set of identical files in the current folder. Then, we used a mix of tail and cat commands to trick the kernel into caching different parts of them.

Next, let’s learn the meaning of the columns from fincore --help. RES reports the cached amount of the file in bytes, while PAGES does the same in the number of pages. Finally, SIZE is the file’s size on disk, and FILE is its name.

Unlike vmtouch, fincore only reports cache usage per file.
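
For further processing, fincore can print raw numbers instead of human-readable sizes. The sketch below assumes the --bytes, --noheadings, and --output options described in the util-linux fincore manual, as well as file names without spaces; the cached_percent helper is our own:

```shell
#!/bin/sh
# Read "RES SIZE FILE" triples (sizes in bytes) on stdin and print the
# cached share of each non-empty file as a percentage.
cached_percent() {
    LC_ALL=C awk '$2 > 0 { printf "%.1f%% %s\n", 100 * $1 / $2, $3 }'
}

# Intended use (requires fincore, so it is left commented out here):
# fincore --bytes --noheadings --output RES,SIZE,FILE * | cached_percent
```

Forcing the C locale keeps the decimal point stable regardless of the system settings, which makes the output easier to parse later.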

9. Conclusion

In this tutorial, we learned commands to monitor and manage how files are loaded into the disk cache. First, we checked how much of a particular file is already cached. Then, we dealt with manually placing files in memory and removing them from it. Next, we locked cached files in memory to keep further access to them fast. Finally, we checked per-file cache usage with fincore.

To sum up, let’s emphasize that the kernel handles file caching well enough for all standard uses. So, we should resort to manual control only for special development or debugging purposes.