1. Overview
When we need to search for a pattern in some input, grep is often the first command that comes to mind. By default, grep prints the matching lines. It can also print additional context lines before or after each match.
In this tutorial, we’ll discuss how to print only the n-th line after each match.
2. Introduction to the Problem
Let’s understand the problem through an example. First, let’s look at an input file:
$ cat report.txt
Performance: BAD
- app name: weather-cache
- time: 2021-12-18 21:20:20
- resource usage: CPU: 3%, RAM: 4096MB
- description: RAM usage is too high!
- ... omitted many other lines ...
Performance: OK
- app name: database-import
- time: 2021-12-19 20:20:20
- CPU Usage: 1%, RAM Usage: 200MB
- description: everything runs well!
Performance: BAD
- app name: weather-web
- time: 2021-12-19 21:20:20
- resource usage: CPU: 100%, RAM: 300MB
- description: CPU usage is too high!
- ... omitted many other lines ...
Let’s say our monitoring system periodically scans all running applications and creates reports. The file report.txt is one of the generated reports.
As we can see in the input file, we have “Performance: OK” and “Performance: BAD” blocks.
Usually, we only need to take a closer look at those “Performance: BAD” blocks. Therefore, the string “Performance: BAD” will be our search pattern.
However, we don’t want to read entire blocks. For example, sometimes we only need to know the names of the applications with a performance issue. The name is on the first line after each “Performance: BAD” line.
At other times, we’re not interested in which application has a performance problem. Instead, we would like to know the “resource usage” status, which is on the third line after each “Performance: BAD” line.
Therefore, we can see it’s a “Printing only the n-th line after each match” problem in general.
In this tutorial, we’ll start with “Printing only the next line after each match”, as this requirement comes up pretty often in practice. After that, we’ll extend it to the more general case: “the n-th line after each match”.
Moreover, we assume the number of lines between two matched lines is always greater than n.
Next, let’s first take a look at how to print only the following line after each match.
3. Printing Only the Next Line After Each Match
There are various ways to get only the next line after each match. In this section, we’ll address three straightforward methods: using grep, sed, and awk.
Next, let’s see them in action.
3.1. Using the grep Command
If we use the option -A1, grep will output the matched line and the line after it. Now, we need to suppress the matched line.
To do that, we can pipe the “grep -A1” search result to another grep command with the -v option to invert the search:
$ grep 'Performance: BAD' --no-group-separator -A1 report.txt | grep -v 'Performance: BAD'
- app name: weather-cache
- app name: weather-web
As the output above shows, we’ve got the expected result.
We’ve used GNU grep’s --no-group-separator option to suppress the separator that grep prints between groups of context lines, which is “--” by default.
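To see why the separator matters, here is a minimal sketch on a tiny made-up input (the strings MATCH, first, other, and second are placeholders, not part of report.txt). It runs grep -A1 without --no-group-separator:

```shell
# Two matches that are not adjacent, so grep -A1 produces two separate groups.
# Without --no-group-separator, grep prints a "--" line between the groups.
printf '%s\n' 'MATCH' 'first' 'other' 'MATCH' 'second' | grep -A1 'MATCH'
```

This should print five lines: MATCH, first, --, MATCH, and second. If we piped that result to grep -v 'MATCH', the “--” line would survive and pollute the output.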
3.2. Using the sed Command
A compact sed one-liner can solve the problem. First, let’s look at the command:
$ sed -n '/Performance: BAD/{ n; p }' report.txt
- app name: weather-cache
- app name: weather-web
The output above shows the sed one-liner does the job.
Let’s walk through the command quickly to understand how it works:
- -n – Tell sed not to print the pattern space automatically; it only prints when we ask it to
- /pattern/{…} – Execute the commands in {…} when an input line matches the pattern
- {n; p} – When a line matches the pattern, first replace the pattern space with the next input line (n command), then print the pattern space (p command)
Simply put, the idea is: once a matched line is found, we read the next line and print it.
3.3. Using the awk Command
Of course, we can implement the same idea as the sed solution with the awk command.
getline is a multi-functional statement in awk. If we write a plain “getline”, it works much like sed’s n command, reading the next input line:
$ awk '/Performance: BAD/{ getline; print }' report.txt
- app name: weather-cache
- app name: weather-web
So, the awk command above works as well.
Next, let’s extend the problem and see how to print only the n-th line after each match.
4. Printing Only the N-th Line After Each Match
We’ll still use the grep, sed, and awk commands to solve the problem.
4.1. Using the grep Command
We’ve solved the n=1 case earlier using the grep command:
grep 'pattern' --no-group-separator -A1 input | grep -v 'pattern'
We may think that changing -A1 to -An would solve the “n-th” case. However, if n>1, this idea won’t work anymore. This is because grep -An will output n+1 lines: the matched line + n lines after it.
The matched line will be suppressed if we pipe this result to grep -v ‘pattern’. But we’ll have n lines instead of the n-th line after the match.
To get the n-th line after each match, we can first use grep -An to find each block with n+1 lines. Next, instead of piping it to grep -v, we pipe it to a command that can print every (n+1)-th line.
For example, we can use the awk command to do that easily:
$ grep 'Performance: BAD' --no-group-separator -A3 report.txt | awk 'NR % 4 == 0'
- resource usage: CPU: 3%, RAM: 4096MB
- resource usage: CPU: 100%, RAM: 300MB
As the output above shows, we’ve got the 3rd line after each “Performance: BAD” line.
Also, we’ve used awk to post-process grep‘s result. Actually, the powerful awk alone is sufficient to solve this problem. We’ll see it in later sections.
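The same pipeline can take n as a parameter. The following is a minimal sketch, not a definitive implementation: the function name nth_after_grep is made up for illustration, it assumes GNU grep (for --no-group-separator), it treats the pattern as a fixed string via -F, and it relies on the assumption that matches are more than n lines apart:

```shell
# Hypothetical helper: print only the n-th line after each matching line.
#   $1 = fixed string to search for, $2 = n, $3 = input file
nth_after_grep() {
    # grep emits blocks of n+1 lines (the match plus n context lines);
    # awk then keeps the last line of every block.
    grep -F -A"$2" --no-group-separator -- "$1" "$3" |
        awk -v n="$2" 'NR % (n + 1) == 0'
}
```

For the sample report, calling nth_after_grep 'Performance: BAD' 3 report.txt should print the same two “resource usage” lines as the command above.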
4.2. Using the sed Command
First, let’s revisit the sed one-liner that solved the “print the next line after each match” problem:
sed -n '/pattern/{ n; p }' input
The core part of the command above is: when a line matches the pattern, get the next input line (n) and print it (p).
Therefore, if we want to get the third line after each match, we can just add two more n commands to the one-liner:
$ sed -n '/Performance: BAD/{ n; n; n; p }' report.txt
- resource usage: CPU: 3%, RAM: 4096MB
- resource usage: CPU: 100%, RAM: 300MB
As we can see in the output above, the problem has been solved.
However, it’s worth mentioning that sed has no variables or arithmetic, and its only flow control is label-based branching. So, it’s not easy to turn this one-liner into a generic solution that takes n as a parameter.
For example, if the requirement says “print the 15th line after each match”, we have to type the n command 15 times. To make the command dynamic, we can generate the sed script from another script, such as a shell script.
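As an illustration of generating the sed script from the shell, here is a minimal sketch. The function name nth_after_sed and the variable names are made up for this example. It assumes the pattern is a basic regular expression that contains no “/” character, and it ends the command list with “p; }” so that the script is also accepted by sed implementations that require a semicolon before the closing brace:

```shell
# Hypothetical helper: build "/pattern/{ n; n; ...; p; }" with $2 copies of "n;".
#   $1 = pattern (basic regex, must not contain "/"), $2 = n, $3 = input file
nth_after_sed() {
    steps=''
    i=0
    while [ "$i" -lt "$2" ]; do
        steps="${steps}n; "      # one "n" per line to skip forward
        i=$((i + 1))
    done
    sed -n "/$1/{ ${steps}p; }" "$3"
}
```

With n set to 3, the generated script is “/pattern/{ n; n; n; p; }”, which is the hand-written one-liner from this section plus a trailing semicolon.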
4.3. Using the awk Command With getline
Compared to sed, awk supports a rich set of functions and C-like flow control.
Therefore, we can easily extend the previous awk command by wrapping the getline statement in a for loop to output the third line after each match:
$ awk -v n=3 '/Performance: BAD/ { for (i = 1; i <= n; i++) getline; print }' report.txt
- resource usage: CPU: 3%, RAM: 4096MB
- resource usage: CPU: 100%, RAM: 300MB
We can see the awk command is easy to understand and does the job.
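One caveat: if the input ends before n more lines are available, getline fails and the command above still prints whatever line was read last. The following is a minimal sketch of a guarded variant, under the assumption that we would rather print nothing in that case. The function name nth_after_getline and the awk variable pat are made up for illustration, and awk processes backslash escapes in values passed with -v:

```shell
# Hypothetical helper: like the getline loop above, but stop when input runs out.
#   $1 = pattern (awk regex), $2 = n, $3 = input file
nth_after_getline() {
    awk -v pat="$1" -v n="$2" '
        $0 ~ pat {
            for (i = 1; i <= n; i++)
                if ((getline) <= 0)   # 0 means end of input, -1 means read error
                    exit              # fewer than n lines left: print nothing
            print
        }
    ' "$3"
}
```

The parentheses around getline matter: an unparenthesized “getline < 0” would be parsed as a redirection that reads from a file named 0.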
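As we’ve discussed, getline is a multi-functional statement. However, it’s commonly advised to avoid getline by default, particularly for beginners: its different forms update different built-in variables, and it’s easy to forget to check its return value.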
Therefore, let’s see another awk approach that solves this problem without using getline.
4.4. Using the awk Command Without getline
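awk allows us to use variables. Therefore, we can store the line number of a match in a variable, say mLine, and check if the current line number equals (mLine + n). If it does, we print the line. The “mLine &&” guard keeps the condition false until the first match has been seen. Without it, line n of the input would be printed even before any match: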
$ awk -v n=3 '/Performance: BAD/ { mLine = NR } mLine && NR == mLine + n' report.txt
- resource usage: CPU: 3%, RAM: 4096MB
- resource usage: CPU: 100%, RAM: 300MB
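Because mLine holds only the most recent match, this one-liner depends on the assumption from Section 2 that matches are more than n lines apart. If a second match arrives within n lines of the first, the first match is forgotten. The following is a minimal sketch of a different technique that drops this assumption by remembering every pending line number in an array. The function name nth_after_awk, the awk variable pat, and the array name due are made up for illustration:

```shell
# Hypothetical helper: print the n-th line after each match, even when
# matches are closer together than n lines.
#   $1 = pattern (awk regex), $2 = n, $3 = input file
nth_after_awk() {
    awk -v pat="$1" -v n="$2" '
        $0 ~ pat  { due[NR + n] = 1 }         # schedule the line n lines ahead
        NR in due { print; delete due[NR] }   # print it when we reach it
    ' "$3"
}
```

For the sample report, calling nth_after_awk 'Performance: BAD' 3 report.txt should print the same two “resource usage” lines as above, since its matches are far enough apart for both approaches to agree.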
5. Conclusion
In this article, we began by solving the problem “Printing only the next line after each match”.
Then, we extended the solution to the generic problem “Printing only the n-th line after each match”.
Along the way, we’ve addressed three approaches to the problem: using grep, sed, and awk.