1. Overview
In Linux, we can determine whether a command executed successfully by checking its return value (exit status). By convention, zero means success, and a non-zero value indicates that the command failed.
Sometimes, when we substitute text with sed or awk, we’d like the return code to tell us whether a substitution actually took place.
In this tutorial, we’ll learn how to return different values depending on the substitution result.
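As a quick refresher on the exit-status convention, here is a minimal sketch of how a script can branch on a command's return value. The helper name describe_status is hypothetical and only used for illustration.

```shell
# Hypothetical helper: run a command and report how it exited.
describe_status() {
    if "$@" >/dev/null 2>&1; then
        echo "success"
    else
        # Inside the else branch, $? still holds the failed command's status.
        echo "failed with code $?"
    fi
}

describe_status true              # prints: success
describe_status sh -c 'exit 3'    # prints: failed with code 3
```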
2. Introduction to the Problem
First of all, let’s see an example:
$ sed 's/Java/Kotlin/' <<< 'Java is awesome!'
Kotlin is awesome!
$ echo "return code is $?"
return code is 0
As the example above shows, we’ve successfully used the sed command to substitute text on a single-line string. Also, unsurprisingly, we’ve gotten zero as the return value.
Next, let’s see another example:
$ sed 's/Windows/Linux/' <<< 'Java is awesome!'
Java is awesome!
$ echo "return code is $?"
return code is 0
This time, we’ve tried to replace “Windows” with “Linux” using the sed command. Because the input string doesn’t contain the pattern we’re trying to replace (“Windows”), the command doesn’t perform any substitution, so the output from sed is the same as the input.
But, as we’ll note from the echo output in our example, the return value of the sed command was still zero. This is because, *although the substitution (s/…/…/) didn’t change anything, the sed command’s execution itself was successful*.
However, sometimes, after we’ve executed a command involving substitution – for instance, sed or awk – we would like to know if the substitution was successful by quickly checking the return code.
Therefore, we’ll address how to let sed and awk return a non-zero value, say 100, to indicate an unsuccessful substitution when the input doesn’t match the search pattern.
We’ll cover two scenarios: single-line input and multi-line input.
Also, we’ll use GNU awk and GNU sed in the tutorial as they’re widely used in Linux operating systems.
3. Single-Line Input
When we write shell scripts, we often change the value of a variable by “search and replace” on a single-line string.
3.1. Using the sed Command
The sed command has a ‘q’ command that quits the sed script and stops processing further input. As an extension, GNU sed accepts an optional [exit-code] argument to the ‘q’ command.
Therefore, *we can customize the return value by calling the ‘q [exit-code]‘ command*, for example:
$ sed 'q 100' <<< 'Java is awesome!'
Java is awesome!
$ echo "return code is $?"
return code is 100
Next, let’s combine this with an address check: if the Regex matches the pattern space, we execute the substitution and sed exits with code 0; otherwise, the ‘!’ negation applies and we quit sed with exit code 100:
$ sed '/Java/!{q100}; {s/Java/Kotlin/}' <<< 'Java is awesome!'
Kotlin is awesome!
$ echo "return code is $?"
return code is 0
$ sed '/Windows/!{q100}; {s/Windows/Linux/}' <<< 'Java is awesome!'
Java is awesome!
$ echo "return code is $?"
return code is 100
We’ve solved the problem, as we can see in the output above.
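In a script, the custom exit code lets us branch on the result without comparing strings. The following is a minimal sketch, assuming GNU sed; the function name windows_to_linux is made up for this example.

```shell
# Hypothetical wrapper around the command above (requires GNU sed for q100).
windows_to_linux() {
    printf '%s\n' "$1" | sed '/Windows/!{q100}; {s/Windows/Linux/}'
}

if result=$(windows_to_linux 'Java is awesome!'); then
    echo "substituted: $result"
else
    # The assignment passes through sed's exit status, so $? is 100 here.
    echo "no match (exit code $?): $result"
fi
```

Because the input doesn't contain "Windows", the else branch runs and reports exit code 100 together with the unchanged line.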
3.2. Improving the sed Command
First, let’s revisit our sed solution.
The Regex appears twice in the command: once for the match check and once in the s/../../ command.
In a shell script, this duplication makes maintenance harder: whenever we adjust the Regex, we must make the same change in both places.
A straightforward improvement would be saving the Regex in a variable. Then, we can execute the sed substitution with variables.
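The variable-based approach might look like the sketch below. The variable names pattern and replacement and the function replace_with_vars are assumptions for illustration, and the sketch only works when neither value contains a slash or other characters that are special to sed.

```shell
# Assumed example values; neither may contain "/" or other characters
# that are special to sed in this position.
pattern='Java'
replacement='Kotlin'

replace_with_vars() {
    # The "!" stays inside single quotes so that an interactive bash
    # does not treat it as history expansion.
    printf '%s\n' "$1" |
        sed "/$pattern/"'!{q100}; '"s/$pattern/$replacement/"
}

replace_with_vars 'Java is awesome!'    # prints: Kotlin is awesome!
```

The Regex is now defined in one place, but the quoting becomes harder to read.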
Actually, we can use sed‘s ‘t‘ command to remove the Regex duplicate without the extra variable:
$ sed 's/Java/Kotlin/; ta; q 100; :a' <<< 'Java is awesome!'
Kotlin is awesome!
$ echo "return code is $?"
return code is 0
$ sed 's/Windows/Linux/; ta; q 100; :a' <<< 'Java is awesome!'
Java is awesome!
$ echo "return code is $?"
return code is 100
*sed’s ‘ta’ command branches to a label (‘a’ in this case) if the previous s/../../ has made a successful substitution*.
In our example above, if the substitution is successful, sed jumps to the label ‘:a’, which is the end of the sed script. Thus, the command exits with zero.
However, if s/../../ isn’t successful, the ‘ta‘ command does nothing. Therefore, ‘q 100‘ will be executed, causing sed to exit with code 100.
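To make the control flow easier to follow, the same script can be written with one command per line and a comment for each step. This is only a sketch: the wrapper name java_to_kotlin is hypothetical, and GNU sed is assumed.

```shell
# Hypothetical wrapper; the sed script is the one-liner above, one command per line.
java_to_kotlin() {
    printf '%s\n' "$1" | sed '
        # Try the substitution on the current line.
        s/Java/Kotlin/
        # If it succeeded, jump to label "a" at the end of the script.
        ta
        # Otherwise, print the unchanged line and quit with exit code 100.
        q 100
        # Label "a": nothing follows, so sed prints the line and exits with 0.
        :a
    '
}

java_to_kotlin 'Java is awesome!'    # prints: Kotlin is awesome!
```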
3.3. Using the awk Command
Like sed‘s ‘q [exit-code]‘, awk has the exit [expression] command to exit the awk script with a return value, such as ‘exit 100‘.
Also, awk has sub() and gsub() functions for text substitutions. Further, the functions will return the number of successful substitutions.
So, we can quickly determine if the substitution is successful by checking the sub() or gsub() function’s return value:
$ awk '{ v=sub(/Java/, "Kotlin"); print } v==0 { exit 100 }' <<< 'Java is awesome!'
Kotlin is awesome!
$ echo "return code is $?"
return code is 0
$ awk '{ v=sub(/Windows/, "Linux"); print } v==0 { exit 100 }' <<< 'Java is awesome!'
Java is awesome!
$ echo "return code is $?"
return code is 100
As we can see, the awk solution is more straightforward than the sed solution.
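To see the return value of the substitution functions directly, the short sketch below prints the count reported by gsub() in front of each processed line. The helper name count_replacements is hypothetical.

```shell
# Hypothetical helper: replace every "a" with "A" and show the gsub() count.
count_replacements() {
    printf '%s\n' "$1" | awk '{ n = gsub(/a/, "A"); print n ": " $0 }'
}

count_replacements 'Java is awesome!'   # prints: 3: JAvA is Awesome!
count_replacements 'Hello World.'       # prints: 0: Hello World.
```

Note that sub() replaces only the first match in a line, so it returns either 0 or 1, while gsub() returns the total number of replacements.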
4. Multi-Line Input
We often apply substitutions on multi-line input as well. Let’s create an example file:
$ cat input.txt
Java is a programming language.
Java is awesome!
Hello World.
Next, let’s see how to control the return value depending on the substitution result.
This time, let’s solve the problem using awk first.
4.1. Using the awk Command
We’ve learned that awk‘s sub() and gsub() functions return the number of successful substitutions.
When we process multi-line input, we can accumulate the substitution functions’ return values on each input line to determine if there is at least one successful substitution:
$ awk '{ v += sub(/Java/, "Kotlin"); print } END{ if(v==0) exit 100 }' input.txt
Kotlin is a programming language.
Kotlin is awesome!
Hello World.
$ echo "return code is $?"
return code is 0
$ awk '{ v += sub(/Windows/, "Linux"); print } END{ if(v==0) exit 100 }' input.txt
Java is a programming language.
Java is awesome!
Hello World.
$ echo "return code is $?"
return code is 100
The awk command above is pretty similar to the single-line solution. There are only two changes:
- Using v += sub() instead of v = sub() to accumulate the return values on each line
- Moving the “v==0” check to the END block to determine the exit code after processing all input lines
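One practical use of this exit code is to overwrite a file only when something was actually replaced. The following is a minimal sketch, assuming mktemp is available; the function name and the temporary-file handling are illustrative additions, not part of the commands above.

```shell
# Hypothetical helper: rewrite the file only if at least one line changed.
java_to_kotlin_in_place() {
    file=$1
    tmp=$(mktemp) || return 1
    if awk '{ v += sub(/Java/, "Kotlin"); print }
            END { if (v == 0) exit 100 }' "$file" > "$tmp"; then
        # At least one substitution happened: copy the result back.
        cat "$tmp" > "$file" && rm -f "$tmp"
    else
        # No substitution (or awk failed): keep the original, pass the code on.
        status=$?
        rm -f "$tmp"
        return "$status"
    fi
}
```

Calling java_to_kotlin_in_place input.txt would then update the file and return 0, while a second call on the already converted file would leave it untouched and return 100.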
As we’ve seen, the awk command is pretty easy to understand, which is why we addressed the awk solution first.
sed can solve the problem, too. Let’s see how it’s done.
4.2. Using the sed Command
sed supports neither variables nor arithmetic, so we can’t count substitutions as we did with awk. Therefore, we have to solve the problem differently.
Let’s first take a look at the solution:
$ sed -n 's/Java/Kotlin/; tOK; bNOK; :OK;H; :NOK;p;${g;/./!{q100}}' input.txt
Kotlin is a programming language.
Kotlin is awesome!
Hello World.
$ echo "return code is $?"
return code is 0
$ sed -n 's/Windows/Linux/; tOK; bNOK; :OK;H; :NOK;p;${g;/./!{q100}}' input.txt
Java is a programming language.
Java is awesome!
Hello World.
$ echo "return code is $?"
return code is 100
Before we walk through the command, we need to understand a few sed commands:
- b label – Branch to the given label without checking if the previous s/../../ is a success
- H – Append the pattern space to the hold space
- g – Copy the hold space to the pattern space
Next, let’s figure out how the command works.
4.3. Understanding the sed Solution
A “flow-chart” may help us understand it more easily:
                        ┌── success ──────> :OK ───> H; p; ${g;/./!{q100}}
                        │
sed -n 's/Java/Kotlin/; tOK; bNOK; :OK;H; :NOK; p;${g;/./!{q100}}' input.txt
                             │
                             └── not success ──> :NOK ──> p; ${g;/./!{q100}}
As the chart above shows, there are two paths after the substitution. Let’s take a look at the success case first.
The ‘tOK;’ command routes the processing to execute ‘H; p; ${g;/./!{q100}}’:
- H – append current pattern space to hold space, so the hold space is not empty anymore. We use this trick to mark if there is at least one successful substitution
- p – print the pattern space
- ${ .. } – run the enclosed commands only if the current line is the last line of the input
- g – copy the hold space to the current pattern space so that we can check if there is at least one successful substitution
- /./!{q100} – if the pattern space, which now holds the content of the hold space, contains no character, no substitution has succeeded, so we quit with exit code 100
The ‘NOK’ path is similar to the ‘OK’ path. The ‘bNOK;’ command branches to the commands ‘p; ${g;/./!{q100}}’.
So, we do almost the same as on the ‘OK’ path, but we don’t append the pattern space to the hold space. This is because we only set the mark in the hold space after a successful substitution.
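The walkthrough above can also be written down as a commented, multi-line version of the same sed script. This is a sketch under the same GNU sed assumption; the wrapper name java_to_kotlin_lines is hypothetical.

```shell
# Hypothetical wrapper; pass the input file(s) as arguments.
java_to_kotlin_lines() {
    sed -n '
        # Try the substitution on the current line.
        s/Java/Kotlin/
        # Success: jump to OK. No success: jump past it to NOK.
        tOK
        bNOK
        :OK
        # Mark success by appending the line to the hold space.
        H
        :NOK
        # Print explicitly, because -n turns off automatic printing.
        p
        # On the last line, load the hold space and quit with 100 if it is empty.
        ${
            g
            /./!{
                q100
            }
        }
    ' "$@"
}
```

Running java_to_kotlin_lines input.txt should behave like the one-liner: it prints every line and returns 100 only when no line matched.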
If we compare the sed and awk solutions, the awk solution is much easier to understand.
Therefore, when the input is multi-line, we recommend using the awk solution to control the return value.
5. Conclusion
In this article, we’ve discussed how to make sed and awk return a non-zero value when a substitution doesn’t match anything in the input.
Also, we’ve covered single-line and multi-line input scenarios.