1. Overview
When working on the Linux command line, we often use the grep command to search text in files using certain regex patterns. It’s straightforward and efficient.
In this quick tutorial, we’ll explore how to list only the names of the files that match the given pattern.
2. The Example: myApp Log Files
As usual, we’ll address the problem through an example. Let’s say we have an application called myApp. It has some log files:
$ tree -f myApp
myApp
└── myApp/logs
    ├── myApp/logs/archives
    │   ├── myApp/logs/archives/app.log.2022-09-25
    │   └── myApp/logs/archives/app.log.2022-09-26
    ├── myApp/logs/myApp.log
    └── myApp/logs/security.log

2 directories, 4 files
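The tree output above shows that the current log files and the archives are stored in different directories. Next, let’s quickly peek into the content of those files. We run head from inside the myApp directory. Note that in Bash, the ** glob only descends into subdirectories when the globstar option is enabled (shopt -s globstar):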
$ head logs/**/*.*
==> logs/archives/app.log.2022-09-25 <==
2022-09-26 10:31:00 [INFO] Application starts successfully
2022-09-26 10:47:00 [WARN] CPU usage is above 95%
2022-09-26 10:31:00 [INFO] CPU usage is back to 35%
==> logs/archives/app.log.2022-09-26 <==
2022-09-26 10:00:00 [ERROR] application cannot start! Cause: No DNS configurated
2022-09-26 10:31:00 [INFO] Application starts successfully
2022-09-26 10:47:00 [ERROR] Network is broken
==> logs/myApp.log <==
2022-09-27 10:00:00 [INFO] application starts successfully
2022-09-27 10:01:00 [WARN] Cached value is out-dated, refreshing ...
2022-09-27 10:07:00 [ERROR] Cannot access the database.. retry scheduled ...
2022-09-27 10:07:20 [INFO] new database connection established, restart broken transactions
2022-09-27 10:08:00 [ERROR] Cannot access the database.. retry scheduled ...
2022-09-27 10:10:00 [ERROR] Cannot access the database.. retry(2) scheduled ...
2022-09-27 10:11:00 [INFO] new database connection established, restart broken transactions
==> logs/security.log <==
2022-09-27 11:00:00 [INFO] 42 new users regsitered in the last 5 minutes
2022-09-27 11:01:00 [WARN] Login failed 10 times in 1 min. User: [email protected]
2022-09-27 11:08:00 [INFO] new admin user is created: [email protected]
As we can see, like typical log files, they contain entries with different log levels.
Next, let’s see how to search text through those files using grep.
3. Printing Matched Lines vs. Printing Matched Filenames Only
Usually, we need to pay attention to the log entries with the ERROR level. So, let’s first search for all [ERROR]-level records in both the current log files and the archived logs.
First, as our target files are located in different directories, we’ll use the -r option to do a recursive search.
Further, as ‘[’ and ‘]’ have special meaning in regex (together they form a bracket expression that matches any single character listed inside), we should either escape them or use the -F option to tell grep to do a fixed-string search:
$ grep -rF '[ERROR]' myApp/logs
myApp/logs/myApp.log:2022-09-27 10:07:00 [ERROR] Cannot access the database.. retry scheduled ...
myApp/logs/myApp.log:2022-09-27 10:08:00 [ERROR] Cannot access the database.. retry scheduled ...
myApp/logs/myApp.log:2022-09-27 10:10:00 [ERROR] Cannot access the database.. retry(2) scheduled ...
myApp/logs/archives/app.log.2022-09-26:2022-09-26 10:00:00 [ERROR] application cannot start! Cause: No DNS configurated
myApp/logs/archives/app.log.2022-09-26:2022-09-26 10:47:00 [ERROR] Network is broken
As we can see, grep outputs all matched lines together with the filenames. Also, if a file contains multiple “[ERROR]” records, its filename is printed multiple times, such as myApp/logs/myApp.log in the example above. This is pretty helpful if we want to check the entry details.
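As a side note on the -F choice above, here is a minimal sketch of what happens when the brackets are neither escaped nor treated as a fixed string. It assumes a throwaway directory created with mktemp and two hypothetical log files (a.log and b.log) that are not part of the myApp example:

```shell
# Throwaway sample data; the file names and log lines are hypothetical
demo=$(mktemp -d)
printf '%s\n' '10:00 [INFO] started' '10:01 [ERROR] disk full' > "$demo/a.log"
printf '%s\n' '10:02 [WARN] slow reply' > "$demo/b.log"

# Unescaped: [ERROR] is a bracket expression matching a single E, R, or O,
# so the INFO and WARN lines match as well
unescaped=$(grep -rh '[ERROR]' "$demo" | grep -c '')

# Escaped brackets and -F both search for the literal text [ERROR]
escaped=$(grep -rh '\[ERROR\]' "$demo" | grep -c '')
fixed=$(grep -rhF '[ERROR]' "$demo" | grep -c '')

echo "unescaped=$unescaped escaped=$escaped fixed=$fixed"

# Clean up the sample data
rm -rf "$demo"
```

In this sketch, the unescaped pattern selects all three lines, while the escaped and fixed-string variants select only the single real [ERROR] line.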
However, sometimes, we only want to know which log files contain “[ERROR]” entries and don’t care about the contents. In this case, we want an output with the unique names of the files whose content matches the given pattern, which is “[ERROR]” in this example.
To achieve that, we can pass an additional option, -l (lowercase L), to tell grep to suppress normal output and print matched filenames only:
$ grep -rFl '[ERROR]' myApp/logs
myApp/logs/myApp.log
myApp/logs/archives/app.log.2022-09-26
This time, grep prints only the two matching filenames, each exactly once.
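To see what -l saves us, the following sketch compares it with a manual pipeline that extracts and deduplicates the filename field from the regular output. The sample directory and file names are hypothetical, and the cut-based route assumes that no path contains a colon, which is one reason -l is usually the safer choice:

```shell
# Throwaway sample data; the directory and file names are hypothetical
demo=$(mktemp -d)
mkdir "$demo/archives"
printf '%s\n' '[ERROR] first' '[INFO] ok' '[ERROR] second' > "$demo/app.log"
printf '%s\n' '[ERROR] old failure' > "$demo/archives/app.log.1"
printf '%s\n' '[INFO] nothing to report' > "$demo/quiet.log"

# Manual route: print matching lines, keep the filename field, deduplicate
# (breaks if a path contains ':')
manual=$(grep -rF '[ERROR]' "$demo" | cut -d: -f1 | sort -u)

# Built-in route: -l prints each matching filename once
with_l=$(grep -rFl '[ERROR]' "$demo" | sort)

[ "$manual" = "$with_l" ] && echo "same file list"

# Clean up the sample data
rm -rf "$demo"
```

Both routes should produce the same list here, but -l does so without post-processing and without depending on the characters used in the paths.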
It’s worth mentioning that the -L (uppercase L) option will do the opposite. In other words, the -L option tells grep to list the names of the files that don’t match the pattern.
Finally, let’s list all files that don’t contain any “[ERROR]” entry:
$ grep -rFL '[ERROR]' myApp/logs
myApp/logs/security.log
myApp/logs/archives/app.log.2022-09-25
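Since -l and -L are complementary, their outputs together should cover every searched file exactly once. The sketch below checks this on a small throwaway directory; the directory layout and file names are assumptions made for illustration:

```shell
# Throwaway sample data; the directory and file names are hypothetical
demo=$(mktemp -d)
mkdir "$demo/archives"
printf '%s\n' '[ERROR] db down' > "$demo/app.log"
printf '%s\n' '[INFO] all good' > "$demo/security.log"
printf '%s\n' '[WARN] cpu high' > "$demo/archives/app.log.1"

# Files with at least one match, and files with no match at all
with_match=$(grep -rFl '[ERROR]' "$demo")
without_match=$(grep -rFL '[ERROR]' "$demo")

# Merge both lists and compare them with the full list of regular files
combined=$(printf '%s\n%s\n' "$with_match" "$without_match" | sort)
all_files=$(find "$demo" -type f | sort)

[ "$combined" = "$all_files" ] && echo "every file is in exactly one list"

# Clean up the sample data
rm -rf "$demo"
```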
4. Conclusion
In this short article, we’ve learned through examples how to control grep’s output using the -l and -L options.