1. Overview
A common task when working on the Linux command-line is searching for a string or a pattern and then replacing or deleting it. However, special characters such as the newline can make this common task less trivial than we anticipate.
In this tutorial, we’ll explore several approaches to remove newline characters using tools such as tr, awk, Perl, paste, sed, Bash, and the Vim editor.
2. Preparing Our Example File
Before we start, let’s create a text file named some_names.txt that we’ll use to apply all our strategies:
$ cat > some_names.txt << _eof_
Martha,
Charlotte,
Diego,
William,
_eof_
The goal is to end up with a CSV-like file with the content:
Martha,Charlotte,Diego,William,
3. Using tr
When we need to delete some characters or replace them with others, tr often comes to mind because it’s easy to use.
The tr command reads from the standard input (stdin), performs some operations (translate, squeeze, delete), and then copies the result to the standard output (stdout).
We’ll now focus on the “delete” operation. With the parameter -d, we define a set of characters that we want tr to remove.
Since we just want to delete the newlines, we place only this character in the set and then redirect the standard output to a new CSV file:
$ tr -d "\n" < some_names.txt > some_names.csv
Now, let’s see the content of our CSV file:
$ cat some_names.csv
Martha,Charlotte,Diego,William,
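As a side note, the same tool also covers the replace case. For instance, a minimal variation that translates each newline into a space instead of deleting it could look like this:
$ tr "\n" " " < some_names.txt
Here, every newline, including the final one, becomes a space, so the result ends with a trailing space.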
4. Using awk
The awk program is a well-known, powerful, and useful tool that allows us to process text using patterns and actions.
It lets us perform some operations in a very straightforward way, with the help of some tricks:
$ awk 1 ORS='' some_names.txt > some_names.csv
Let’s see the content of our CSV file:
$ cat some_names.csv
Martha,Charlotte,Diego,William,
Let’s take a closer look to understand how we solved the problem.
We wrote the pattern “1” because it evaluates to true for every record (allowing the record to be processed); in the absence of an action, awk performs the default action, which is to print the entire record terminated with the value of the ORS variable.
Then we set the ORS (Output Record Separator) variable, which defaults to a newline, to the empty string.
With these two pieces combined, awk consumes every record and prints it using the empty string as the output record separator. In other words, it simply drops the newline.
Another way is to use it as an awk program text:
$ awk 'ORS="";1' some_names.txt
And an extended version of that would be:
$ awk 'BEGIN{ ORS="" } { print $0 }' some_names.txt
Here, we do the same, but this time we use the BEGIN pattern, which sets the ORS variable before any input is read, and then we print the $0 variable, which contains the whole record (usually an entire line of the input).
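If we prefer not to touch ORS at all, another small sketch relies on awk’s printf, which, unlike print, doesn’t append any output record separator:
$ awk '{ printf "%s", $0 }' some_names.txt
Martha,Charlotte,Diego,William,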
5. Using Perl
Perl is a language that has a great set of features for text processing.
We’ll use the Perl interpreter in a sed-like way:
$ perl -pe 's/\n//' some_names.txt > some_names.csv
Let’s take a look at how this command works:
- -p tells Perl to assume the following loop around our program
- -e tells Perl to use the next string as a one-line script
- ‘s/\n//’ is the script that instructs Perl to remove the \n character
And now, let’s review our CSV file:
$ cat some_names.csv
Martha,Charlotte,Diego,William,
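As a variation, Perl can also slurp the whole file at once with the -0777 switch and then delete every newline globally; this sketch isn’t necessary for our small file, but it avoids relying on the implicit line-by-line loop:
$ perl -0777 -pe 's/\n//g' some_names.txt > some_names.csv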
6. Using paste
The paste program is a utility that merges lines of files, but we can also use it to remove newlines.
Let’s try with the next one-liner:
$ paste -sd "" some_names.txt > some_names.csv
Now, let’s check our CSV file:
$ cat some_names.csv
Martha,Charlotte,Diego,William,
We’re able to achieve this because paste has the option -s, which pastes serially, merging all the lines of one file into a single row, and the option -d, which lets us define the list of delimiters, here the empty string.
With these two paste options, we can get what we want without mentioning the newline.
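If we ever need a different separator, -d accepts it directly. For example, a quick sketch that joins the lines with a single space instead of nothing only requires changing the delimiter:
$ paste -sd " " some_names.txt
Martha, Charlotte, Diego, William,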
7. Using sed
When we talk about processing text, the sed stream editor usually comes to mind, regardless of the problem.
The script ‘s/\n//g’ seems like the natural choice. Let’s use it to replace the line endings and see what happens:
$ sed 's/\n//g' some_names.txt
Martha,
Charlotte,
Diego,
William,
And there’s no change because sed reads one line at a time, and the trailing newline is always stripped off before the line is placed into the pattern space.
Let’s try with this new one-liner:
$ sed ':label1 ; N ; $! b label1 ; s/\n//g' some_names.txt > some_names.csv
Next, let’s see what’s inside our CSV file:
$ cat some_names.csv
Martha,Charlotte,Diego,William,
Now we have what we wanted.
Let’s break down each section (separated by the semicolon) of the script to understand how it works:
- :label1 creates a label named label1
- N tells sed to append the next line into the pattern space
- $! b label1 tells sed to branch (go to) our label label1 if it’s not the last line
- s/\n//g removes the \n character from what is in the pattern space
In other words, with all these pieces together, we construct a loop that finishes when sed is in the last line of the input.
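It’s worth noting that GNU sed offers a shortcut for this: with the -z option, it reads NUL-separated records, so the whole file lands in the pattern space at once and the simple substitution works as initially expected. This is a GNU-specific sketch and isn’t portable to every sed implementation:
$ sed -z 's/\n//g' some_names.txt > some_names.csv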
8. Using a Bash Command-Line Script
Bash is installed in most Linux distributions, so we could try to use it to get what we want.
One option that we could use is a while loop:
$ while read -r row
do
printf '%s' "$row"
done < some_names.txt > some_names.csv
Here, in the while loop, the Bash built-in read (with -r, so that backslashes aren’t interpreted) reads the content of the file some_names.txt and assigns each line to the variable row.
After that, the built-in printf prints that line without the newline, using the %s format so the line is treated as data rather than as a format string. And finally, we redirect the output to our CSV file.
We can achieve the same with the help of the readarray built-in, the IFS variable, and the parameter expansion mechanism:
$ OLDIFS=$IFS ; IFS='' ; readarray -t file_array < some_names.txt ; echo "${file_array[*]}" > some_names.csv ; IFS=$OLDIFS
Bash is full of tricks, and we’re using a few of them here. Let’s understand it section by section:
- OLDIFS=$IFS: We save the current IFS value into the OLDIFS variable
- IFS='': We set IFS to the empty string
- With readarray -t file_array, we load the content of the some_names.txt file into the array file_array, with -t removing the trailing newline from each row
- With “${file_array[*]}”, Bash expands each value of the array file_array, separated by the first character of the IFS variable
- Finally, we restore the IFS variable
But we can be a little trickier using a subshell:
$ (
readarray -t file_array < some_names.txt;
IFS='';
echo "${file_array[*]}" > some_names.csv;
)
This is equivalent, while keeping our current IFS variable safe, because variable changes made inside a subshell aren’t visible outside of it.
It’s worth mentioning that the IFS variable is special. The default value of the Bash IFS variable is a space, a tab, and a newline, which is why we took care to save and restore it, or to confine the change to a subshell.
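If we’re curious, a quick sketch to inspect it (assuming a default Bash setup) is to print it in a quoted, reusable form:
$ printf '%q\n' "$IFS"
$' \t\n'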
Finally, let’s see what is now inside our CSV file:
$ cat some_names.csv
Martha,Charlotte,Diego,William,
9. Using the Vim Editor
In Linux, we have many editor flavors, but let’s focus on one of the most famous.
Vim (Vi Improved) is an editor equipped with a lot of useful utilities.
Let’s open our example file in the Vim editor: