1. Overview
The problem of removing the last character from each line in a file looks pretty simple. However, in practice, we may encounter some variants of this requirement.
In this tutorial, we’ll address how to solve this problem through examples.
Further, we’re going to discuss some common variants.
2. The Example File
To explain different commands clearly, let’s create an example file with a few lines:
$ cat input.txt
This is a normal line.
This line has 3 trailing spaces.
The next line has only 4 spaces:
The next line is an empty line:
I am the last line.
Our input.txt has several lines of text.
Additionally, some lines contain trailing spaces, and the trailing spaces are significant to this problem. However, this information is not so obvious in the output above.
We can pass the -e option to the cat command and ask it to print a ‘$’ sign at the end of each line:
$ cat -e input.txt
This is a normal line.$
This line has 3 trailing spaces.   $
The next line has only 4 spaces:$
    $
The next line is an empty line:$
$
I am the last line.$
Now, we can see the trailing spaces in the output clearly.
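Trailing spaces are easy to lose when copying text, so here is a small sketch for recreating the example file with printf. The exact contents are our reading of the cat -e output above: three trailing spaces on the second line and a fourth line made of four spaces.

```shell
# Recreate input.txt with the trailing whitespace described above.
# printf '%s\n' prints each argument on its own line, spaces included.
printf '%s\n' \
  'This is a normal line.' \
  'This line has 3 trailing spaces.   ' \
  'The next line has only 4 spaces:' \
  '    ' \
  'The next line is an empty line:' \
  '' \
  'I am the last line.' > input.txt
```

Running cat -e input.txt afterward should show the ‘$’ markers in the same positions as in the listing above.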
Next, let’s take a closer look at the “remove the last character from each line” problem and its variants.
3. Removing the Last Character From Each Line
First, let’s have a look at how to remove the last character from each line regardless of whether it is a space or not.
There are many ways to solve this problem. Now, let’s look at some common solutions.
3.1. Using Pure Bash
Two Bash parameter expansion techniques can help us to remove the last character from a variable:
- Substring expansion – ${VAR:offset:length}
- Removing matching suffix pattern – ${VAR%word}
Next, let’s take a closer look at both approaches.
If we give a negative length in the substring expansion, Bash (version 4.2 or later) interprets it as an offset from the end of the string instead of a character count, so the expansion stops that many characters before the end. Therefore, we can pass -1 as the length to remove the last character from the variable:
$ var="012345"
$ echo ${var:0:-1}
01234
Also, Bash allows us to omit the offset if it’s ‘0’: ${var::-1}.
However, we should keep in mind that Bash’s substring expansion with a negative length won’t work for empty strings:
$ var=""
$ echo ${var::-1}
bash: -1: substring expression < 0
Therefore, we need to check if the variable is empty, for example, using [ -z "$var" ], before extracting the substring.
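The emptiness check can be sketched as a small function. The name chop_last is hypothetical and chosen only for illustration, and the sketch assumes Bash 4.2 or later, since older versions don’t accept a negative length:

```shell
# chop_last is a hypothetical helper name, not something Bash provides.
# Assumes Bash 4.2+ for the negative length in ${var::-1}.
chop_last() {
  local var=$1
  if [ -z "$var" ]; then
    # Nothing to remove: print the empty string unchanged.
    printf '%s\n' ""
  else
    # Drop the last character with substring expansion.
    printf '%s\n' "${var::-1}"
  fi
}

chop_last "012345"   # prints 01234
chop_last ""         # prints an empty line, without an error
```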
Next, let’s see how the ${VAR%word} expansion can help us to chop the last character from a variable.
In Bash, the pattern ‘?’ matches any single character. Since we want to remove the last character, we can use ‘?’ as the suffix pattern:
$ var="012345"
$ echo ${var%?}
01234
The suffix-removal expansion also works with empty strings:
$ var=""
$ echo ${var%?} | cat -e
$
Since our file contains empty lines, we’ll use the suffix-removal expansion to solve our problem.
Well, so far, we’ve solved the core part of the problem. The rest is just looping through the lines and printing the output:
$ while IFS="" read -r var; do echo "${var%?}"; done <input.txt | cat -e
This is a normal line$
This line has 3 trailing spaces.  $
The next line has only 4 spaces$
   $
The next line is an empty line$
$
I am the last line$
We piped the result to “cat -e” to display the trailing spaces clearly. As the output above shows, the last character of each line has been removed, regardless of whether it’s a space or not.
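For reuse, the loop can be wrapped in a function. The following is a minimal sketch, and the name strip_last_char is hypothetical. It reads with -r so that backslashes in the input stay literal, and it also handles a final line that lacks a terminating newline, which a plain read loop would skip:

```shell
# strip_last_char is a hypothetical helper name used for illustration.
strip_last_char() {
  # IFS= preserves leading and trailing whitespace; -r keeps backslashes literal.
  # "|| [ -n "$line" ]" also processes a last line without a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    printf '%s\n' "${line%?}"
  done < "$1"
}
```

We could then call it as strip_last_char input.txt | cat -e. Using printf instead of echo is a defensive choice here, as echo may treat a line such as -n as an option.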
3.2. Using the sed or awk Command
The pure Bash solution doesn’t require other software dependencies. However, we have to handle every aspect on our own, such as how to loop through the file, how to set the IFS variable, and so on.
Nowadays, handy text processing utilities such as sed and awk come pre-installed on most modern Linux distros.
We can solve the problem much more easily by using those powerful utilities.
Next, let’s have a look at how to solve the problem using sed:
$ sed 's/.$//' input.txt | cat -e
This is a normal line$
This line has 3 trailing spaces.  $
The next line has only 4 spaces$
   $
The next line is an empty line$
$
I am the last line$
The sed command above uses regex substitution to remove the last character of each line: “.” matches any single character, and “$” anchors the match at the end of the line. Compared to the Bash version, the sed solution looks more compact.
Similarly, awk can also solve the problem in a short form:
$ awk '{sub(/.$/,"")}1' input.txt | cat -e
This is a normal line$
This line has 3 trailing spaces.  $
The next line has only 4 spaces$
   $
The next line is an empty line$
$
I am the last line$
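The trailing 1 in the awk one-liner is a pattern that is always true, so awk applies its default action and prints the (possibly modified) line. As a sketch, the same logic can be spelled out with an explicit print. The function name chop_awk is hypothetical:

```shell
# chop_awk is a hypothetical wrapper name used for illustration.
chop_awk() {
  # sub() edits $0 in place; on an empty line /.$/ matches nothing,
  # so the line is printed unchanged.
  awk '{ sub(/.$/, ""); print }' "$@"
}

printf 'ab\n\ncd \n' | chop_awk | cat -e
# a$
# $
# cd$
```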
We’ve solved the problem of removing the last character from each line, no matter if it is a space.
However, in practice, we often want to remove the last non-whitespace character from each line of the file.
Next, let’s look into some variants of the original problem.
4. Removing the Last Non-Whitespace Character From Each Line
If a line has trailing spaces, we could face two different requirements:
- Removing the last non-whitespace character together with the trailing spaces: “example@ ” -> “example”
- Removing only the last non-whitespace character and preserving the trailing spaces: “example@ ” -> “example ”
Next, let’s solve these two variant requirements one by one.
4.1. Removing the Last Non-Whitespace Character Together With Trailing Whitespaces
One idea to solve this problem is to build a regular expression (regex) matching the last non-whitespace character followed by zero or more whitespace characters.
Then we can replace this pattern with an empty string.
The regex pattern is not hard to build. The escape sequences “\S” and “\s” match a single non-whitespace character and a single whitespace character, respectively. They aren’t part of standard ERE, but GNU sed and GNU awk both support them as extensions, and they are exactly what we’re looking for.
First, let’s see how sed solves it:
$ sed -r 's/\S\s*$//' input.txt | cat -e
This is a normal line$
This line has 3 trailing spaces$
The next line has only 4 spaces$
    $
The next line is an empty line$
$
I am the last line$
We pass the -r option to GNU sed to tell it that we use ERE in the script. The -E option is an equivalent spelling that other sed implementations commonly accept too.
As we see in the output above, the last non-whitespace character and all trailing spaces have been removed.
Furthermore, the line only containing four spaces, and the empty line, remain unchanged as they don’t contain any non-whitespace character.
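The “\S” and “\s” escapes are GNU extensions. Where they aren’t available, the same idea can be sketched with POSIX bracket expressions, which also work in basic regular expressions, so no -r is needed. The function name strip_last_nonspace is hypothetical:

```shell
# strip_last_nonspace is a hypothetical wrapper name used for illustration.
strip_last_nonspace() {
  # [^[:space:]]  one non-whitespace character
  # [[:space:]]*  zero or more trailing whitespace characters
  # $             end of line
  sed 's/[^[:space:]][[:space:]]*$//' "$@"
}
```

Calling strip_last_nonspace input.txt | cat -e should give the same output as the GNU-specific command above.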
Similarly, we can get the same output with the awk command (GNU awk here, since “\S” and “\s” are gawk extensions):
$ awk '{sub(/\S\s*$/,"")}1' input.txt | cat -e
This is a normal line$
This line has 3 trailing spaces$
The next line has only 4 spaces$
    $
The next line is an empty line$
$
I am the last line$
So far, we’ve solved the problem.
Next, let’s see how to preserve the trailing whitespaces.
4.2. Removing the Last Non-Whitespace Character but Keeping Trailing Whitespaces
We can still solve this problem using regex substitution. First, let’s see the sed solution:
$ sed -r 's/\S(\s*)$/\1/' input.txt | cat -e
This is a normal line$
This line has 3 trailing spaces   $
The next line has only 4 spaces$
    $
The next line is an empty line$
$
I am the last line$
This time, we put the trailing whitespace pattern “\s*” in a capturing group.
Then, instead of replacing the match with an empty string as we did previously, we reference the capturing group as “\1” in the replacement to retain the trailing whitespace.
As the output above shows, those periods and colons have been removed. However, we’ve preserved the trailing whitespace.
Thus, we have solved the problem using the sed command.
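As a portable sketch of the same substitution, we can again swap the GNU escapes for POSIX bracket expressions. In basic regular expressions, the capturing group is written with escaped parentheses. The function name strip_keep_spaces is hypothetical:

```shell
# strip_keep_spaces is a hypothetical wrapper name used for illustration.
strip_keep_spaces() {
  # \( ... \) captures the trailing whitespace in BRE syntax,
  # and \1 puts it back after the non-whitespace character is dropped.
  sed 's/[^[:space:]]\([[:space:]]*\)$/\1/' "$@"
}
```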
GNU awk’s nice gensub function allows us to handle backreferences as well.
Finally, let’s see the awk solution to the problem:
$ awk '{ $0=gensub(/\S(\s*)$/,"\\1","g") } 1' input.txt | cat -e
This is a normal line$
This line has 3 trailing spaces   $
The next line has only 4 spaces$
    $
The next line is an empty line$
$
I am the last line$
When we use the gensub function, we should keep in mind that, unlike the sub and gsub functions, gensub doesn’t modify its target in place. Instead, it returns the result as a new string.
Therefore, we need to assign the result to a variable, which is $0 in our example.
Apart from that, when we want to reference a capturing group in the gensub function, we must double the backslash inside the awk string literal, for instance, “\\1” for group 1.
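Since gensub is specific to GNU awk, here is a sketch of an alternative that relies only on the standard match and substr functions. It assumes that only spaces and tabs count as whitespace, and the function name strip_keep_spaces_awk is hypothetical:

```shell
# strip_keep_spaces_awk is a hypothetical wrapper name used for illustration.
# Assumption: only spaces and tabs are treated as whitespace.
strip_keep_spaces_awk() {
  awk '{
    # match() sets RSTART to the position of the last non-whitespace character.
    if (match($0, /[^ \t][ \t]*$/))
      # Keep everything before that character plus everything after it.
      $0 = substr($0, 1, RSTART - 1) substr($0, RSTART + 1)
    print
  }' "$@"
}
```

Lines without any non-whitespace character don’t match, so they are printed unchanged, just as in the gensub version.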
5. Conclusion
In this article, we’ve analyzed the problem of removing the last character from all lines in a file.
Moreover, we’ve also discussed several variants. As usual, we addressed the solutions through examples.