1. Overview
Sometimes, we might need to quickly find a file or set of files by name or a specific extension. While most graphical file managers do have a file search feature, it’s not as flexible and powerful as the search tools that are available on the command line.
In this article, we’ll discuss a few methods we can use to find files by name or extensions inside the Linux terminal. We’ll be using the built-in core search utilities that ship with the various distributions.
2. Using find
The find command lets us search for files and directories on our drives. It’s a very comprehensive utility that can do a lot. Not only can we search for files, but we can also carry out operations on the matched files and directories. It’s available on most Linux distributions. Therefore, it’s usually the go-to tool when it comes to searching.
Let’s verify the availability of find:
$ find --version
find (GNU findutils) 4.8.0
Packaged by Gentoo (4.8.0)
Copyright (C) 2021 Free Software Foundation, Inc.
The find utility ships with the findutils package. If it’s not available on the system, we can install the findutils package from the official repository using a package manager like yum or apt.
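Since find can also act on the files it matches, a small sketch may make that concrete before we look at searching in detail. The example below builds a throwaway directory (the file names are made up for illustration) and uses the -exec action to run basename on every match. It's only meant as a minimal demonstration:

```shell
# Build a scratch directory with hypothetical sample files
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/c.log"

# -exec runs the given command once per match: {} is replaced by the
# matched path, and \; marks the end of the command
matches=$(find "$tmp" -type f -name "*.txt" -exec basename {} \; | LC_ALL=C sort)
echo "$matches"

# Clean up the scratch directory
rm -rf "$tmp"
```

Here, only the two .txt files are handed to basename, so the .log file never reaches the command.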
2.1. Basic File Searching
The basic syntax for find is straightforward:
$ find [OPTIONS] [PATH...] [EXPRESSION]
By default, the path is the current directory. When we run the find command without any arguments, it recursively lists every file and directory under the current directory. Let’s suppose we want to search for the .zshrc file under the current directory. We’ll specify it using the -iname test:
$ find -iname ".zshrc"
./config/zsh/.zshrc
The -iname test turns off case sensitivity, as opposed to the -name test, which is case-sensitive. Sometimes, we might want to search for files based on a pattern. For that, we can use the -regex or -iregex test:
$ find -regex ".*\(zsh\|bash\)rc"
./.config/zsh/zshrc
./.config/bash/bashrc
The -regex test matches the pattern against the whole path rather than just the filename, which is why the expression starts with .*: it consumes the leading directories. The rest of the pattern matches any path that ends in zshrc or bashrc. Alternatively, we can use the -iregex test to disable case sensitivity. However, if we only want to search for files by extension, we can simply use -name or -iname:
$ find -type f -iname "*.json"
./.mozilla/firefox/esr/shield-preference-experiments.json
./.mozilla/firefox/esr/extension-preferences.json
./.mozilla/firefox/esr/times.json
The -type test filters matches by file type. In our case, f restricts the results to regular files.
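To see these tests side by side, here's a small sketch that runs them against a scratch directory. The directory layout and file names are hypothetical, and the regex lines assume GNU find with its default regex flavor:

```shell
# Scratch directory with two hypothetical files that differ in case
tmp=$(mktemp -d)
mkdir -p "$tmp/conf"
touch "$tmp/conf/Notes.JSON" "$tmp/conf/data.json"

# -name is case-sensitive: only data.json matches
sensitive=$(cd "$tmp" && find . -type f -name "*.json")

# -iname ignores case: both files match
insensitive=$(cd "$tmp" && find . -type f -iname "*.json" | LC_ALL=C sort)

# -regex is tested against the whole path (./conf/data.json), so a
# pattern without a leading .* matches nothing
no_prefix=$(cd "$tmp" && find . -type f -regex "data\.json")
with_prefix=$(cd "$tmp" && find . -type f -regex ".*/data\.json")

echo "name:    $sensitive"
echo "iname:   $insensitive"
echo "regex:   ${no_prefix:-<no match>} / $with_prefix"

# Clean up the scratch directory
rm -rf "$tmp"
```

The empty result for the pattern without .* is the usual surprise with -regex: the test has to cover the path from its first character to its last.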
2.2. Finding Based on a Set of Extensions
Sometimes, we might want to find files that match a set of extensions. For instance, searching for files that are either PDF, EPUB, or Markdown. Fortunately, we can specify multiple expressions to the find command:
$ find -type f \( -name "*.pdf" -or -name "*.epub" -or -name "*.md" \)
./github/haidarz/README.md
./dox/freebsd_handbook.epub
./dox/annotated_cpp.pdf
./.config/zsh/plugins/fsh/README.md
./.config/zsh/plugins/fsh/CHANGELOG.md
The -or operator chains multiple expressions together. Because the implicit -and between tests binds more tightly than -or, we wrap the -name tests in escaped parentheses so that -type f applies to all of them. There are other operators that can be used in the same way, such as -not and -and.
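Operator precedence is easy to get wrong when mixing -or with other tests, so here's a small sketch of the difference grouping makes. The scratch directory and its contents are hypothetical. Note the directory whose name happens to end in .md:

```shell
# Scratch directory: two regular files and a directory named like a file
tmp=$(mktemp -d)
touch "$tmp/book.pdf" "$tmp/notes.md"
mkdir "$tmp/archive.md"

# Without grouping, -type f only binds to the first -name test, so the
# directory archive.md is matched by the second branch
ungrouped=$(cd "$tmp" && find . -type f -name "*.pdf" -or -name "*.md" | LC_ALL=C sort)

# Escaped parentheses group the -or chain, so -type f applies to all of it
grouped=$(cd "$tmp" && find . -type f \( -name "*.pdf" -or -name "*.md" \) | LC_ALL=C sort)

# -not negates the test that follows it: regular files that aren't PDFs
not_pdf=$(cd "$tmp" && find . -type f -not -name "*.pdf")

echo "ungrouped: $ungrouped"
echo "grouped:   $grouped"
echo "not pdf:   $not_pdf"

# Clean up the scratch directory
rm -rf "$tmp"
```

The ungrouped form reports the archive.md directory as well, while the grouped form only reports the two regular files.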
Alternatively, we can write the above command using a regular expression:
$ find -iregex ".*\.\(pdf\|epub\|md\)"
./github/haidarz/README.md
./dox/freebsd_handbook.epub
./dox/annotated_cpp.pdf
./.config/zsh/plugins/fsh/README.md
./.config/zsh/plugins/fsh/CHANGELOG.md
The escaped pipe (\|) in the regex is the alternation operator, so it plays the same role as -or.
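When matching extensions with a regex, it's worth remembering that an unescaped dot matches any character. The sketch below, assuming GNU find with its default regex flavor and two made-up file names, shows how a file called cmd slips through when the dot before the group isn't escaped:

```shell
# Scratch directory: one Markdown file and one file merely ending in "md"
tmp=$(mktemp -d)
touch "$tmp/README.md" "$tmp/cmd"

# Unescaped dot: "." matches any character, so "./cmd" is reported too
loose=$(cd "$tmp" && find . -type f -iregex ".*.\(pdf\|epub\|md\)" | LC_ALL=C sort)

# Escaped dot: "\." only matches a literal dot before the extension
strict=$(cd "$tmp" && find . -type f -iregex ".*\.\(pdf\|epub\|md\)")

echo "loose:  $loose"
echo "strict: $strict"

# Clean up the scratch directory
rm -rf "$tmp"
```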
3. Using locate
The locate command lists files that are recorded in the databases generated by updatedb. The updatedb program indexes the files and directories in the system and saves them to a database file. It's typically scheduled to run once a day, for instance through cron, though we can also run it manually from the command line.
The locate utility is much faster than find because it doesn’t need to walk through the file hierarchy to find files. For that reason, it’s the preferred option when execution time matters and a slightly outdated index is acceptable.
If it’s not available on the system, we can install either the findutils or the mlocate package from the official repository. Mind that we’ll need to run the updatedb command once locate is installed. Let’s verify the availability of locate:
$ locate --version
mlocate 0.26
Copyright (C) 2007 Red Hat, Inc. All rights reserved.
This software is distributed under the GPL v.2.
3.1. Basic Usage
The locate command searches its entire database, which by default covers the whole system. Unlike find, it doesn't take a starting directory to search in. Instead, we narrow down the results through the pattern itself.
First of all, let’s update the files database using the updatedb command, which typically requires root privileges:
$ updatedb
Now, let’s search for paths that contain xwayland:
$ locate xwayland -l 3
/var/db/repos/gentoo/x11-base/xwayland
/var/db/repos/gentoo/x11-base/xwayland/Manifest
/var/db/repos/gentoo/x11-base/xwayland/files
As we can see, it lists the paths that contain xwayland, including the files and directories underneath the xwayland directory. The -l or --limit option caps the number of results printed.
3.2. Regular Expressions
Let’s change the previous example a bit, so it only lists the files and directories named xwayland instead of every path that contains xwayland:
$ locate -r "/xwayland$"
/var/db/repos/gentoo/x11-base/xwayland
The -r or --regexp option is used to specify a regular expression. In our case, we want to list the paths that end with /xwayland, hence the dollar sign ($) at the end of the expression. We can specify any POSIX-compliant regex to locate:
$ locate --regex "\.(pdf|md|epub)$" -l 5
/usr/share/alsa/ucm/README.md
/usr/share/alsa/ucm2/README.md
/usr/share/bison/README.md
/usr/share/doc/argon2-20190702/argon2-specs.pdf
/usr/share/doc/bzip2-1.0.8-r1/manual.pdf
Mind that we used --regex instead of -r. The difference is that -r takes a Basic Regular Expression (BRE), while --regex makes locate interpret the pattern as an Extended Regular Expression (ERE). ERE is the more convenient choice here: alternation with the pipe character, as used above, is part of ERE but not of POSIX BRE.
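The BRE and ERE dialects aren't specific to locate. As an illustration only, the sketch below uses grep, which exposes the same two dialects, to try the pattern against a hand-written list of paths. The paths are hypothetical stand-ins for locate output:

```shell
# Hypothetical sample paths standing in for locate output
paths=$(printf '%s\n' \
  /usr/share/doc/manual.pdf \
  /usr/share/bison/README.md \
  /usr/share/misc/notes.txt)

# grep defaults to BRE: unescaped ( | ) are literal characters here,
# so none of the sample paths match
bre=$(printf '%s\n' "$paths" | grep '\.(pdf|md|epub)$' || true)

# grep -E switches to ERE, where ( | ) group and alternate
ere=$(printf '%s\n' "$paths" | grep -E '\.(pdf|md|epub)$' || true)

echo "BRE: ${bre:-<no match>}"
echo "ERE:"
echo "$ere"
```

Trying a pattern against a short list like this can be a quick way to check the dialect before running it over the whole database.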
3.3. Missing Files
As we saw, the locate command does a pretty good job. However, we should note that, by default, it doesn’t check whether the files still exist. If we delete a file from the system, its path stays in the files database until the next update. Fortunately, locate has the --existing or -e flag, which prints only the entries that still exist on disk:
$ locate wgetrc
/home/hey/.config/wget/wgetrc
$ rm -rf /home/hey/.config/wget/wgetrc
$ locate wgetrc
/home/hey/.config/wget/wgetrc
$ locate -e wgetrc
$
As we can see, even after deleting the wgetrc file, locate still displays it in the result until we specify the -e flag. The stale entry also disappears once the files database is updated.
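To make the idea behind -e more tangible, here's a sketch that imitates it with a hand-made snapshot of paths. The snapshot file and the file names are hypothetical, and this isn't how locate is implemented internally. It only mirrors the behavior of checking each entry against the disk:

```shell
# Scratch directory with two hypothetical config files
tmp=$(mktemp -d)
touch "$tmp/wgetrc" "$tmp/curlrc"

# A hand-made "database": a snapshot of paths taken while both files existed
db="$tmp/paths.db"
printf '%s\n' "$tmp/wgetrc" "$tmp/curlrc" > "$db"

# Delete one file without refreshing the snapshot
rm "$tmp/wgetrc"

# Plain lookup: the stale entry is still in the snapshot
stale=$(grep '/wgetrc$' "$db" || true)

# Existence check, in the spirit of -e: keep only paths still on disk
existing=$(while IFS= read -r path; do
  if [ -e "$path" ]; then
    printf '%s\n' "$path"
  fi
done < "$db")

echo "snapshot says: $stale"
echo "still on disk: $existing"

# Clean up the scratch directory
rm -rf "$tmp"
```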
4. Conclusion
In this article, we saw how we can search for files and directories using the find and locate utilities. We learned how to use regular expressions to effectively find files and directories based on matching patterns.