1. Overview
Linux keeps recently used file data in RAM through its disk cache, also known as the page cache. Because reading from memory is much faster than reading from the hard disk, keeping a file’s content in RAM improves performance. Although the kernel handles this task well on its own, we can also step in and control the process ourselves.
In this tutorial, we’ll learn how to check whether a file is cached. Moreover, we’ll see how to load a file into memory, evict it, or lock it there on demand.
2. The Disk Cache
The kernel usually caches a file the first time it’s read, on the assumption that more reads from the same file will soon follow. So, let’s check this in a very simple way with a test_file of around 2 GB:
$ time cat test_file > /dev/null
real 0m18,873s
user 0m0,030s
sys 0m1,395s
Now, let’s immediately repeat the command:
$ time cat test_file > /dev/null
real 0m0,333s
user 0m0,004s
sys 0m0,328s
Because the kernel cached the file during the first read, the second run is significantly faster. Of course, we don’t need to repeat the same command: any subsequent access to this file is quicker as long as its pages stay cached. However, we can’t control when the kernel removes the file from memory.
The Cached entry in the /proc/meminfo file tells us how much RAM is used to cache files. So, let’s check this figure before caching test_file:
$ cat /proc/meminfo | grep ^Cached
Cached: 997396 kB
Then, after running cat, the cached amount has increased by roughly the file’s size:
$ cat /proc/meminfo | grep ^Cached
Cached: 2985868 kB
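The before-and-after comparison can also be scripted. The following is only a minimal sketch: the helper names cached_kb and growth_mib are made up for illustration, and the sample figures are the two readings shown above.

```shell
# Hypothetical helpers, named here only for illustration.

# Print the Cached value in kB. Reads /proc/meminfo by default;
# pass a file name (or - for standard input) to read other input.
cached_kb() {
    awk '$1 == "Cached:" { print $2 }' "${1:-/proc/meminfo}"
}

# Print the difference between two kB readings in whole MiB.
growth_mib() {
    echo $(( ($2 - $1) / 1024 ))
}

# The two readings from the listings above:
growth_mib 997396 2985868    # prints 1941
```

In practice, we’d store the output of cached_kb before and after reading the file and pass both values to growth_mib. Other activity on the system changes the Cached figure too, so the result is only approximate.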
Finally, we can ask the kernel to drop the page cache system-wide, without pointing to a particular file:
$ echo 1 | sudo tee /proc/sys/vm/drop_caches
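According to the kernel documentation, writing 1 to drop_caches frees the page cache, 2 frees dentries and inodes, and 3 frees both. Dirty pages aren’t dropped, which is why running sync first is commonly recommended. The wrapper below is a sketch under those assumptions; the function name drop_caches is hypothetical, and the block only defines it without calling it.

```shell
# Hypothetical wrapper around /proc/sys/vm/drop_caches.
# Mode 1 = page cache, 2 = dentries and inodes, 3 = both (default: 1).
drop_caches() {
    mode=${1:-1}
    case $mode in
        1|2|3) ;;
        *)
            echo "usage: drop_caches [1|2|3]" >&2
            return 2
            ;;
    esac
    # Flush dirty pages to disk first so that more of the cache can be freed.
    sync
    echo "$mode" | sudo tee /proc/sys/vm/drop_caches > /dev/null
}
```

Calling drop_caches with no argument has the same effect as the command above, preceded by sync.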
3. The vmtouch Command
Let’s use vmtouch to monitor and control how a file is cached. With it, we can check how much of a file is already cached, load a file into memory or evict it, and lock a file in memory.
On Ubuntu, this utility comes in a package of the same name. Let’s check the version and basic syntax by running it without arguments:
$ vmtouch
vmtouch: no files or directories specified
vmtouch v1.3.1 - the Virtual Memory Toucher by Doug Hoyte
Portable file system cache diagnostics and control
Usage: vmtouch [OPTIONS] ... FILES OR DIRECTORIES ...
# ...
4. How Much of the File Is in the Memory?
Let’s return to our big test_file and check if it’s cached:
$ vmtouch test_file
Files: 1
Directories: 0
Resident Pages: 0/1970848 0/7G 0%
Elapsed: 0.044178 seconds
The command first reports the number of files and directories it checked. Then, the Resident Pages field shows how many of the file’s memory pages are currently cached out of the total number it would need. Next, we can find the same information expressed as sizes. Finally, the command reports its execution time.
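For scripting, the two page counts can be pulled out of the report with awk. This is a sketch that assumes the output layout of vmtouch v1.3.1 shown above; the helper name resident_pages is made up for illustration.

```shell
# Hypothetical helper: read vmtouch output on standard input and print
# the resident and total page counts separated by a space.
resident_pages() {
    awk '$1 == "Resident" && $2 == "Pages:" {
        split($3, pages, "/")
        print pages[1], pages[2]
    }'
}

# Fed with the report shown above:
printf '%s\n' \
    'Files: 1' \
    'Directories: 0' \
    'Resident Pages: 0/1970848 0/7G 0%' \
    'Elapsed: 0.044178 seconds' | resident_pages    # prints 0 1970848
```

With a real file, we’d pipe the command’s output straight into the helper, as in vmtouch test_file | resident_pages.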
Now, let’s read just the last lines of this file with tail:
$ tail -n 100000 test_file > /dev/null
Then, let’s check how much of this file the kernel has cached:
$ vmtouch test_file
Files: 1
Directories: 0
Resident Pages: 6416/1970848 25M/7G 0.326%
Elapsed: 0.047087 seconds
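The size shown next to the page count is the number of resident pages multiplied by the page size. As a quick check, here is a sketch that assumes a page size of 4096 bytes, which is common but not universal; the helper name pages_to_mib is hypothetical.

```shell
# Hypothetical helper: convert a page count to whole MiB.
# The second argument is the page size in bytes; it defaults to the
# value reported by getconf.
pages_to_mib() {
    page_size=${2:-$(getconf PAGESIZE)}
    echo $(( $1 * page_size / 1024 / 1024 ))
}

# The resident page count from the report above, with 4096-byte pages:
pages_to_mib 6416 4096    # prints 25
```

The result agrees with the 25M that vmtouch reports for the cached part of the file.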
Finally, let’s print the whole file with cat and check it again:
$ cat test_file > /dev/null
$ vmtouch test_file
Files: 1
Directories: 0
Resident Pages: 1970848/1970848 7G/7G 100%
Elapsed: 0.10013 seconds
5. Caching and Uncaching Files Manually
Next, instead of tricking the kernel with tail or cat, let’s cache the file directly with the -t option. In addition, we’re going to turn on the verbose mode with -v:
$ vmtouch -vt test_file
test_file
[o ] 7937/492712
[OOOOOOOOOOOOOOOo ] 129281/492712
[OOOOOOOOOOOOOOOOOOOOOOOOOo ] 210049/492712
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo ] 311137/492712
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo ] 388929/492712
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOo ] 449601/492712
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 492712/492712
Files: 1
Directories: 0
Touched Pages: 492712 (1G)
Elapsed: 17.382 seconds
We’ve obtained the stills of the progress bar just by repeatedly pressing Enter while the command was running.
Finally, let’s evict the file from the cache with the -e option:
$ vmtouch -ev test_file
Evicting test_file
Files: 1
Directories: 0
Evicted Pages: 1970848 (7G)
Elapsed: 0.53841 seconds
In addition, let’s highlight that we can evict any file this way, not only one cached by vmtouch.
6. Locking a File in Memory
Now, let’s force the file to stay in memory. For this, we use the -l option to lock the file:
$ vmtouch -vtl test_file
test_file
[OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO] 492712/492712
LOCKED 492712 pages (1G)
Let’s notice that now the command keeps running and blocks the terminal. Thus, let’s open another one and check the file:
$ vmtouch test_file
Files: 1
Directories: 0
Resident Pages: 492712/492712 1G/1G 100%
Elapsed: 0.040164 seconds
Now, we can’t drop the file from memory until we stop the program by pressing Ctrl-C in its terminal. Until then, we can neither evict it with vmtouch nor remove it by clearing the cache system-wide.
To facilitate unlocking the file, we can use the -P option with the name of a file that will contain the PID of the vmtouch process:
$ vmtouch -tl -P PID_vmtouch test_file &
After the file is cached, we can kill the process and evict the file:
$ kill "$(cat PID_vmtouch)"
$ vmtouch -e test_file
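The kill step can be wrapped in a small helper that validates the PID file before sending the signal. This is a sketch, not part of vmtouch: the function name stop_from_pidfile is hypothetical, and it assumes the file holds a single numeric PID.

```shell
# Hypothetical helper: send SIGTERM to the process whose PID is stored
# in the given file. Returns non-zero if the file can't be read, doesn't
# hold a number, or names a process that can't be signalled.
stop_from_pidfile() {
    pidfile=$1
    if [ ! -r "$pidfile" ]; then
        echo "cannot read PID file: $pidfile" >&2
        return 1
    fi
    pid=$(cat "$pidfile")
    case $pid in
        ''|*[!0-9]*)
            echo "not a PID: $pid" >&2
            return 1
            ;;
    esac
    kill "$pid"
}
```

With the names used above, we’d run stop_from_pidfile PID_vmtouch and then evict the file with vmtouch -e test_file once the process has exited.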
6.1. Daemonizing vmtouch
In addition to locking, we can send vmtouch into the background to act as a daemon. In this way, we keep a file or files in memory for further use. So, let’s use the -d option:
$ vmtouch -tld -P PID_vmtouch test_file
Of course, we need the process ID to stop the daemon eventually.
7. Crawling Through Directories
With vmtouch, we can recursively traverse directories and perform tasks on the files we encounter. So, let’s check if the command manual files are cached:
$ sudo bash -c "vmtouch -f /usr/share/man"
# some warnings
Files: 7843
Directories: 121
Resident Pages: 11413/11413 44M/44M 100%
Elapsed: 0.074808 seconds
Here, the -f option tells vmtouch to follow symbolic links. We’ve found out that all manuals are in memory now. Next, let’s drop them:
$ sudo bash -c "vmtouch -fe /usr/share/man"
Then, let’s check again:
$ sudo bash -c "vmtouch -f /usr/share/man"
# ...
Files: 7843
Directories: 121
Resident Pages: 0/11413 0/44M 0%
Elapsed: 0.074969 seconds
Now, let’s open some manuals, e.g., man sudo and man less, and then check one more time:
$ sudo bash -c "vmtouch -fv /usr/share/man"
# ...
/usr/share/man/man1/less.1.gz
[OOOOOO] 6/6
[OOOOOO] 6/6
# ...
/usr/share/man/man8/sudoedit.8.gz
[OOO] 3/3
[OOO] 3/3
# ...
Files: 7843
Directories: 121
Resident Pages: 18/11413 72K/44M 0.158%
So, now the kernel is using 6 pages for the less manual, 3 pages for the sudo manual (listed here as sudoedit.8.gz), and 9 for other files.
8. The fincore Command
Now, let’s use the fincore command from the util-linux package to find out how much of a file is located in memory. Let’s check all files in the current folder:
$ fincore *
RES PAGES SIZE FILE
0B 0 1,9G test_file
840K 210 1,9G test_file1
1,3G 342232 1,9G test_file2
1,9G 492712 1,9G test_file3
840K 210 1,9G test_file4
2,7M 698 1,9G test_file5
For this example, we prepared a set of identical files in the current folder. Then, we used a mix of tail and cat commands to trick the kernel into caching different portions of them.
Next, let’s learn the meaning of the columns from fincore --help. RES reports the cached amount of the file in bytes, while PAGES does the same as a number of pages. Finally, SIZE is the file’s size on disk, and FILE is its name.
Unlike vmtouch, fincore only reports cache usage per file; it can’t load, evict, or lock files.
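Since fincore can print raw byte counts without a header, its output is easy to post-process. The sketch below turns RES and SIZE into a percentage per file. The helper name cached_percent is hypothetical, the sample input is illustrative rather than taken from a real run, and file names are assumed to contain no whitespace.

```shell
# Hypothetical helper: read lines of "RES SIZE FILE" (sizes in bytes)
# on standard input and print each file with its cached percentage.
cached_percent() {
    LC_ALL=C awk '{
        pct = ($2 > 0) ? 100 * $1 / $2 : 0
        # RES covers whole pages, so it can slightly exceed SIZE.
        if (pct > 100) pct = 100
        printf "%s %.1f%%\n", $3, pct
    }'
}

# Illustrative input, not taken from a real run:
printf '%s\n' '0 2048 not_cached' '1024 2048 half_cached' | cached_percent
# prints:
# not_cached 0.0%
# half_cached 50.0%
```

To feed it real data, we’d run fincore --bytes --noheadings --output RES,SIZE,FILE * | cached_percent in the folder with the test files.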
9. Conclusion
In this tutorial, we learned commands to monitor and manage files in the disk cache. First, we checked how much of a particular file is already cached. Then, we manually loaded files into memory and evicted them. Next, we locked cached files in memory, so that any further access to them stays fast.
Finally, let’s emphasize that the kernel handles file caching well enough for all standard use. So, we should resort to manual control only for special development or debugging purposes.