1. Overview
In this tutorial, we’ll look at how to concatenate files in Linux, optionally inserting a separator between them. Sometimes a plain concatenation is enough; at other times, we need a separator, such as a blank line, between the merged files.
Let’s say we have three files, fruits.txt, vegetables.txt, and meat.txt:
$ cat fruits.txt
Apple
Orange
Grapes
$ cat vegetables.txt
Cabbage
Lettuce
Broccoli
$ cat meat.txt
Pork
Beef
Mutton
We’ll look at different ways to merge them into a single file, out.txt.
2. Using a Loop
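We can write a simple Bash one-liner using a for loop to concatenate these files. Inside the loop, we cat each file and append its output to out.txt with the >> operator. Because out.txt itself matches *.txt and >> appends rather than overwrites, we should make sure it doesn’t exist before running the loop.
Let’s see how we can accomplish this:
$ for f in *.txt; do cat "$f" >> out.txt; done;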
$ cat out.txt
Apple
Orange
Grapes
Pork
Beef
Mutton
Cabbage
Lettuce
Broccoli
$
With that command, the contents of the files are merged into out.txt, in the sorted order in which the shell expands *.txt.
Using a for loop is flexible. We can do additional processing, such as adding a blank line after each file or inserting a different separator. For that, we just add commands between the do and done keywords. We first remove the out.txt left over from the previous run, since it would otherwise match *.txt and be appended to:
$ rm -f out.txt
$ for f in *.txt; do cat "$f" >> out.txt; echo >> out.txt; done;
$ cat out.txt
Apple
Orange
Grapes

Pork
Beef
Mutton

Cabbage
Lettuce
Broccoli

$
Here, we used the echo command to insert a blank line after each file, as the output shows.
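The same loop can insert a visible separator only between files, rather than after every file. The following is a minimal sketch under a few assumptions: the separator string, the scratch directory, and the output name out.merged are all hypothetical choices for illustration. The output name deliberately does not match *.txt, so the loop can never pick up its own output.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf 'Apple\nOrange\nGrapes\n' > "$workdir/fruits.txt"
printf 'Pork\nBeef\nMutton\n' > "$workdir/meat.txt"
printf 'Cabbage\nLettuce\nBroccoli\n' > "$workdir/vegetables.txt"

out="$workdir/out.merged"   # hypothetical name, chosen so it does not match *.txt
sep='-----'                 # hypothetical separator string
first=1
for f in "$workdir"/*.txt; do
    # Print the separator before every file except the first one.
    if [ "$first" -eq 1 ]; then
        first=0
    else
        printf '%s\n' "$sep"
    fi
    cat "$f"
done > "$out"               # one redirection for the whole loop

cat "$out"
```

Redirecting the whole loop once, instead of appending inside it, also means a rerun overwrites the previous result instead of growing it.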
3. Using the find Command
This is similar to the loop solution above. Instead of using a for loop, we can use the find command to simulate the loop.
Let’s see this in action:
$ find *.txt -exec cat {} \; > out.txt
$ cat out.txt
Apple
Orange
Grapes
Pork
Beef
Mutton
Cabbage
Lettuce
Broccoli
$
We can see that it concatenated the contents of all files into out.txt. Here, the shell expands *.txt into the matching file names, and find treats each name as a starting point. Using the -exec option, we run cat on each file, and the redirection collects the output in out.txt.
We can add another -exec option to insert a blank line after each file:
$ find *.txt -exec cat {} \; -exec echo \; > out.txt
$ cat out.txt
Apple
Orange
Grapes

Pork
Beef
Mutton

Cabbage
Lettuce
Broccoli

$
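Similarly, we can pipe the output of find to the xargs command to concatenate files. Here, we pass each file name to sh as a positional parameter ($1) instead of substituting it into the script text, so names containing spaces or shell metacharacters are handled safely:
$ find *.txt | xargs -I{} sh -c 'cat "$1"; echo' sh {} > out.txt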
$ cat out.txt
Apple
Orange
Grapes

Pork
Beef
Mutton

Cabbage
Lettuce
Broccoli

$
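When file names may contain spaces or newlines, a NUL-delimited pipeline is a common, more defensive variant. The sketch below is not the approach used above: it assumes the GNU or BSD versions of find, sort, and xargs (for -maxdepth, -print0, -z, and -0), and the scratch directory and file names are made up for illustration.

```shell
# Scratch directory with one deliberately awkward file name.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf 'Apple\n' > "$workdir/fruits.txt"
printf 'Pork\n' > "$workdir/meat.txt"
printf 'Cabbage\n' > "$workdir/my vegetables.txt"

out="$workdir/out.merged"   # hypothetical name that does not match *.txt

# find selects the files itself (-name), emits NUL-terminated paths,
# sort -z gives a stable order, and xargs -0 reads the NUL-terminated list.
find "$workdir" -maxdepth 1 -type f -name '*.txt' -print0 |
    sort -z |
    xargs -0 -I{} sh -c 'cat "$1"; echo' sh {} > "$out"

cat "$out"
```

Letting find do the name matching, instead of relying on the shell glob, also makes it easy to add further filters such as -newer or -size.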
4. Using the sed Command
We use the sed command to modify a text input stream. Let’s see how we can use the sed command to merge files:
$ sed '' *.txt > out.txt
$ cat out.txt
Apple
Orange
Grapes
Pork
Beef
Mutton
Cabbage
Lettuce
Broccoli
$
As we can see above, an empty sed script makes no changes, so sed’s default action of printing every input line effectively merges the files.
Let’s see how we can use the sed command to insert a newline after each file. In sed, the $ symbol has two meanings: as an address, it selects the last line of the input, and inside a regular expression, it matches the end of a line.
Let’s see an example:
$ echo "test end of line" | sed '$s/$/\n/'
test end of line

$
Here, we can see a blank line is inserted after the text input. Let’s break down the sed command:
- $ (before s) – an address that restricts the s command to the last line of the input
- $ (in the pattern) – matches the end of the line
- \n – the replacement, a newline character (GNU sed interprets \n in the replacement this way)
Hence, sed takes the last line of the input, matches the end of that line, and appends a newline there.
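To make the two roles of $ easier to tell apart, here is a small sketch that runs the same substitution with and without the $ address. The marker text <END> and the variable names are arbitrary choices for this illustration.

```shell
input=$(printf 'first\nsecond\nthird')

# Without an address, the substitution is applied to every line.
every_line=$(printf '%s\n' "$input" | sed 's/$/ <END>/')

# With the $ address, it is applied to the last line only.
last_line_only=$(printf '%s\n' "$input" | sed '$s/$/ <END>/')

printf '%s\n\n%s\n' "$every_line" "$last_line_only"
```

In both commands, the $ inside the pattern matches the end of a line; only the leading $ decides which lines the command runs on.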
Let’s use this to insert a new line between the files:
$ sed -e '$s/$/\n/' *.txt > out.txt
$ cat out.txt
Apple
Orange
Grapes
Pork
Beef
Mutton
Cabbage
Lettuce
Broccoli

$
From the above results, we can see that the output contains a blank line, but only at the very end. That’s because sed treats all the input files as one continuous stream by default, so the $ address matches only the last line of the last file. We need a blank line after each file.
To fix this, we can add the -s (--separate) option of GNU sed. It makes sed treat each file as a separate stream, so $ matches the last line of every file, and we get a blank line at the end of each one.
Let’s take a look at it:
$ sed -e '$s/$/\n/' -s *.txt > out.txt
$ cat out.txt
Apple
Orange
Grapes

Pork
Beef
Mutton

Cabbage
Lettuce
Broccoli

$
Now, we can see the output has a blank line after the end of each file. Further, if we need a different separator between the files, we can add that separator string after the \n in the replacement part of the sed expression.
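As a sketch of that idea, the following appends a line of dashes after each file. It assumes GNU sed, since both -s and \n in the replacement are GNU features, and the separator string and file locations are hypothetical.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf 'Apple\nOrange\nGrapes\n' > "$workdir/fruits.txt"
printf 'Pork\nBeef\nMutton\n' > "$workdir/meat.txt"
printf 'Cabbage\nLettuce\nBroccoli\n' > "$workdir/vegetables.txt"

out="$workdir/out.merged"   # hypothetical name that does not match *.txt

# On the last line of each file, append a newline followed by the separator.
sed -s -e '$s/$/\n-----/' "$workdir"/*.txt > "$out"

cat "$out"
```

Each file’s last line is then followed by a separate line containing only the dashes.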
5. Using the awk Command
awk is a powerful command-line utility for processing text. We’ll use the print statement and the $0 variable of the AWK language to concatenate the files.
Let’s look at a simple example to merge files:
$ awk '{print $0}' *.txt > out.txt
$ cat out.txt
Apple
Orange
Grapes
Pork
Beef
Mutton
Cabbage
Lettuce
Broccoli
$
As shown above, the print statement outputs its argument, while $0 stands for the whole record (by default, a line) being processed.
Thus, the command prints each record from the given files to concatenate them.
Let’s look at how we can insert a separator after each file. As a first attempt, we can use the END pattern. Its action runs once, after awk has read all of the input.
Let’s modify the command to include this:
$ awk '{print $0} END{printf "\n"}' *.txt > out.txt
$ cat out.txt
Apple
Orange
Grapes
Pork
Beef
Mutton
Cabbage
Lettuce
Broccoli

$
As we can see, it has printed the blank line, but only once, at the very end of the output. It didn’t print one after each file.
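A quick way to confirm that the END action runs once for the whole input, and not once per file, is to count how often it fires. This is only an illustrative sketch; the counter names and sample files are made up.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf 'Apple\nOrange\n' > "$workdir/fruits.txt"
printf 'Pork\n' > "$workdir/meat.txt"

# records counts input lines across both files; end_runs counts END invocations.
summary=$(awk '{ records++ }
               END { end_runs++; print "END ran " end_runs " time(s) after " records " records" }' \
              "$workdir"/*.txt)

printf '%s\n' "$summary"
```

Even with two input files, the summary reports a single END run covering all three records.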
To resolve this, we can use the FILENAME variable of the AWK language. It holds the name of the file currently being processed. Using this variable, we can check whether the file has changed and, if so, insert a blank line:
$ awk '{ if (FILENAME != file){ if (file) printf "\n"; file = FILENAME } } {print $0} END{printf "\n"}' *.txt > out.txt
$ cat out.txt
Apple
Orange
Grapes

Pork
Beef
Mutton

Cabbage
Lettuce
Broccoli

$
Even though this is a bit cumbersome, we can see from the results that a blank line is inserted after each file. Of course, awk is overkill for simply concatenating files, but we’ve learned a thing or two about the awk command with this exercise.
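A shorter variant relies on awk’s built-in counters instead of tracking the file name by hand: FNR restarts at 1 for every file, while NR keeps counting across files. The sketch below is an alternative to the command above, not a rewrite of it. It inserts the blank line only between files, not after the last one, and the file locations are hypothetical.

```shell
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
printf 'Apple\nOrange\nGrapes\n' > "$workdir/fruits.txt"
printf 'Pork\nBeef\nMutton\n' > "$workdir/meat.txt"
printf 'Cabbage\nLettuce\nBroccoli\n' > "$workdir/vegetables.txt"

out="$workdir/out.merged"   # hypothetical name that does not match *.txt

# FNR == 1 marks the first line of a file; NR > 1 skips the very first file.
awk 'FNR == 1 && NR > 1 { print "" } { print }' "$workdir"/*.txt > "$out"

cat "$out"
```

One caveat of this approach is that an empty input file produces no records, so it adds no separator of its own.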
6. Conclusion
In this tutorial, we’ve seen the different ways we can concatenate files. We’ve also seen how we can insert a separator while merging them.