1. Overview
In this tutorial, we’ll see how to loop through the contents of a file, line by line. This might sound like a trivial task, but there are some caveats that we need to be aware of.
The examples in this article have been tested in Bash, but they should work in other POSIX-compatible shells as well.
2. Example File
Let’s assume we have a text file, and we want to echo each line (without using cat, of course). The file is called lorem-ipsum.txt, and its contents are:
Lorem
ipsum
dolor
sit
amet
3. While Loop
To output each line, we create a Bash script called echo-lines.sh that uses a while loop to iterate through our file:
while read line; do
    # Quote the variable so the shell does not split or glob-expand the line
    echo "$line"
done < lorem-ipsum.txt
The input redirection (<) connects lorem-ipsum.txt to the standard input of the while loop, and each call to read consumes one line until the end of the file is reached. When we run the script, the output is as expected:
$ ./echo-lines.sh
Lorem
ipsum
dolor
sit
amet
It seems we have found an easy way to loop through the contents of a file. However, in the following sections, we’ll see some caveats we need to be aware of.
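To try the basic loop without creating a file first, here is a minimal sketch that feeds the same five lines through a here-document. The count variable is a hypothetical addition for illustration. Because the input arrives through a redirection rather than a pipe, Bash runs the loop in the current shell, so the counter is still set after the loop ends.

```shell
# count is a hypothetical helper variable, not part of the original script
count=0
while read line; do
    count=$((count + 1))
    echo "$count: $line"
done <<'EOF'
Lorem
ipsum
dolor
sit
amet
EOF
echo "lines read: $count"
```

The redirection applies to the whole loop, which is why read sees a fresh line on every iteration instead of starting over from the top of the input.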
3.1. White Spaces
Now, imagine our file contains some leading white space on line 2:
Lorem
    ipsum
dolor
sit
amet
Again, let’s run our script:
$ ./echo-lines.sh
Lorem
ipsum
dolor
sit
amet
We get the same output as before. When the read command splits a line into fields, it strips leading and trailing white space, because the default value of IFS treats spaces, tabs, and newlines as separator characters.
To solve this, we need to clear the input field separators, which are stored in the IFS shell variable. We add a statement at the beginning of our script:
IFS=''
while read line; do
    # Quote the variable so the leading white space survives the echo as well
    echo "$line"
done < lorem-ipsum.txt
On the first line, we clear the input field separators, and now our script will print the expected result:
$ ./echo-lines.sh
Lorem
    ipsum
dolor
sit
amet
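Setting IFS at the top of the script changes field splitting for everything that follows it. As a hedged alternative, the assignment can be placed directly in front of read, so that it only applies to that single command. The sketch below compares both behaviors on one hypothetical input line with two leading spaces. The variable names stripped and kept are made up for this illustration.

```shell
# Default IFS: read strips the leading white space
stripped=$(printf '  ipsum\n' | { read line; printf '[%s]' "$line"; })

# Empty IFS for this one read call only: the white space is preserved
kept=$(printf '  ipsum\n' | { IFS= read line; printf '[%s]' "$line"; })

echo "default IFS: $stripped"   # default IFS: [ipsum]
echo "empty IFS:   $kept"       # empty IFS:   [  ipsum]
```

The square brackets are only there to make the white space visible in the output.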
3.2. Escape Characters
We’re not quite there yet. Let’s see what happens when our file contains a backslash, the escape character used in Bash:
Lorem
ipsum
dolor
sit\
amet
Again, we run our script:
$ ./echo-lines.sh
Lorem
ipsum
dolor
sitamet
As we see from the result, not only is the backslash removed, but the last two lines are also printed as one. By default, the read command treats backslashes as escape characters. A backslash at the end of a line escapes the newline that follows it, so read joins that line with the next one.
To fix this, we’ll use read -r to disable backslash interpretation:
IFS=''
while read -r line; do
    # Quote the variable so the shell does not split or glob-expand the line
    echo "$line"
done < lorem-ipsum.txt
Now it correctly prints the contents of our file:
$ ./echo-lines.sh
Lorem
ipsum
dolor
sit\
amet
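Both fixes can be combined in a single loop header. The following is a minimal sketch rather than a definitive version of the script: it scopes the empty IFS to the read call, passes -r, and prints with printf instead of echo, since some shells let echo interpret backslash sequences itself. The input is a here-document containing both a line with leading spaces and a line with a trailing backslash, and result is a hypothetical variable that collects the lines for inspection.

```shell
# result is a hypothetical variable used to make each line's content visible
result=""
while IFS= read -r line; do
    printf '%s\n' "$line"
    result="${result}[${line}]"
done <<'EOF'
Lorem
  ipsum
dolor
sit\
amet
EOF
echo "$result"
```

The quoted delimiter ('EOF') keeps the shell from interpreting the backslash inside the here-document, so read receives the text exactly as written.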
4. Conclusion
In this short article, we learned that looping through the contents of a file seems like a trivial task, but there are some caveats we need to be aware of. We should always clear the IFS variable, which holds the input field separators, so that leading and trailing white space is preserved. Also, we need to pass the -r option to the read command so that it doesn’t interpret backslashes as escape characters.