1. Overview
In this tutorial, we’ll see how to iterate over the output of the ls -l command line by line. By default, when a for loop iterates over the output of a command, the shell splits that output into words rather than lines. Usually, though, we want to process each line as a whole, up to the newline character. We’ll look at several ways to get that behavior.
2. The Problem
First, let’s look at the problem at hand. Suppose we have three files in a directory:
$ ls -l
total 16
-rw-r--r-- 1 bluelake bluelake 3278 Feb 11 17:01 test1.txt
-rw-r--r-- 1 bluelake bluelake 3227 Feb 11 17:01 test2.txt
-rw-r--r-- 1 bluelake bluelake 7392 Feb 11 17:01 test3.txt
Let’s use a simple for loop to iterate over the result of the ls -l command:
$ for line in $(ls -l); do echo $line; done
total
16
-rw-r--r--
1
bluelake
bluelake
3278
Feb
11
17:01
test1.txt
-rw-r--r--
1
bluelake
bluelake
3227
Feb
11
17:01
test2.txt
-rw-r--r--
1
bluelake
bluelake
7392
Feb
11
17:01
test3.txt
Here, we can see that the output is split at word boundaries: each word is taken separately and printed on its own line.
Clearly, this is not what we need. So now, let’s see the different ways we can fix this.
3. Changing the IFS Variable
As seen above, the output is split into words because the Internal Field Separator (IFS) variable has its default value: space, tab, and newline. With that value, the shell splits the result of the command substitution on any whitespace.
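To make the default behavior concrete, here’s a minimal sketch that counts the words the shell produces from a string containing a space, a newline, and a tab. The sample text and the variable names are ours, made up for illustration; they don’t come from the ls output above.

```shell
# Hypothetical sample text: two lines, with a space and a tab inside them.
sample=$(printf 'alpha beta\ngamma\tdelta')

count=0
# $sample is left unquoted on purpose so the shell applies word splitting.
for word in $sample; do
    count=$((count + 1))
done

echo "words: $count"   # prints "words: 4" with the default IFS
```

All three whitespace characters act as separators here, which is why the two lines turn into four loop iterations.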
Let’s set IFS to a newline only and see how that works:
$ IFS='
> '
$ for line in `ls -l`; do echo $line; done
total 16
-rw-r--r-- 1 bluelake bluelake 3278 Feb 11 17:01 test1.txt
-rw-r--r-- 1 bluelake bluelake 3227 Feb 11 17:01 test2.txt
-rw-r--r-- 1 bluelake bluelake 7392 Feb 11 17:01 test3.txt
The output shows that the loop now processes one complete line at a time instead of one word at a time.
However, changing IFS like this affects every command we run afterward in the same shell session.
To avoid that, we can run the loop in a subshell:
$ (IFS='
> '
> for line in `ls -l`; do echo $line; done)
total 16
-rw-r--r-- 1 bluelake bluelake 3278 Feb 11 17:01 test1.txt
-rw-r--r-- 1 bluelake bluelake 3227 Feb 11 17:01 test2.txt
-rw-r--r-- 1 bluelake bluelake 7392 Feb 11 17:01 test3.txt
Now, if we run the original for loop again, it prints one word per line, because IFS was changed only inside the subshell and not in our session.
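The same idea can be packaged as a function whose body is itself a subshell. This is only a sketch under a few assumptions: list_output is a hypothetical stand-in for ls -l that prints fixed sample lines, iterate_lines is a name we made up, and set -f is an extra precaution (not part of the example above) that stops the shell from expanding characters such as * in the output.

```shell
# Hypothetical stand-in for `ls -l`, so the example has predictable input.
list_output() {
    printf '%s\n' \
        'total 16' \
        '-rw-r--r-- 1 bluelake bluelake 3278 Feb 11 17:01 test1.txt'
}

# The body is wrapped in ( ... ), so it always runs in a subshell and the
# IFS and set -f changes never reach the calling shell.
iterate_lines() (
    IFS='
'
    set -f    # disable pathname expansion while splitting
    for line in $(list_output); do
        printf '[%s]\n' "$line"
    done
)

iterate_lines
```

After the call, IFS in the calling shell still has its previous value, so later loops behave as before.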
4. Using the read Command
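As we know, the read command reads a line from standard input and splits it into words. If we give it a single variable name, the whole line ends up in that variable. Setting IFS to an empty value for read and adding the -r option keeps leading whitespace and backslashes intact, and quoting the variable stops echo from receiving a re-split line:
$ ls -l | while IFS= read -r line; do echo "$line"; done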
total 16
-rw-r--r-- 1 bluelake bluelake 3278 Feb 11 17:01 test1.txt
-rw-r--r-- 1 bluelake bluelake 3227 Feb 11 17:01 test2.txt
-rw-r--r-- 1 bluelake bluelake 7392 Feb 11 17:01 test3.txt
We can see that read stores each whole line in the line variable, and we then print that variable to standard output with the echo command.
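One caveat with the pipeline form is that, in Bash’s default configuration, the while loop runs in a subshell, so variables set inside it are gone once the loop ends. The sketch below sidesteps that by feeding the loop from a here-document instead of a pipe; it also uses IFS= and -r so that leading spaces and backslashes survive. list_output and its sample lines are hypothetical stand-ins for a real command.

```shell
# Hypothetical stand-in for a command; one line has leading spaces and
# another contains a backslash.
list_output() {
    printf '%s\n' '  indented line' 'path\name'
}

count=0
last=''
# IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
while IFS= read -r line; do
    count=$((count + 1))
    last=$line
done <<EOF
$(list_output)
EOF

echo "lines: $count"   # prints "lines: 2"
```

Because the loop runs in the current shell, count and last are still set after it finishes. In Bash, process substitution (done < <(list_output)) is a common alternative with the same effect.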
5. Using the awk Command
awk is a powerful utility for processing text and streams. Let’s see how we can use the awk command for this task:
$ ls -l | awk '{print $0}'
total 16
-rw-r--r-- 1 bluelake bluelake 3278 Feb 11 17:01 test1.txt
-rw-r--r-- 1 bluelake bluelake 3227 Feb 11 17:01 test2.txt
-rw-r--r-- 1 bluelake bluelake 7392 Feb 11 17:01 test3.txt
From the above results, we can see we’ve iterated through all the lines in the command output. Here, we’ve used awk’s $0 variable, which holds the entire current record, to print each line.
As we know, awk also lets us select one or more columns by their field numbers, such as $1 or $9. That gives us the flexibility to print only the columns we need.
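As a small illustration of column selection, the sketch below prints only the file name and size. It assumes the nine-column layout shown in the listing above (size in field 5, name in field 9) and file names without spaces; list_output and name_and_size are hypothetical helper names standing in for ls -l and our own wrapper.

```shell
# Hypothetical stand-in for `ls -l` with fixed sample lines.
list_output() {
    printf '%s\n' \
        'total 16' \
        '-rw-r--r-- 1 bluelake bluelake 3278 Feb 11 17:01 test1.txt' \
        '-rw-r--r-- 1 bluelake bluelake 3227 Feb 11 17:01 test2.txt'
}

# NR > 1 skips the "total" line; $9 is the name and $5 the size in bytes.
name_and_size() {
    awk 'NR > 1 { print $9, $5 }'
}

list_output | name_and_size
```

With the sample input, this prints one "name size" pair per file and drops the summary line.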
6. Using the xargs Command
As we know, the xargs command reads items from standard input and passes them as arguments to another command. Let’s see how we can use it here:
$ ls -l | xargs -I{} echo "{}"
total 16
-rw-r--r-- 1 bluelake bluelake 3278 Feb 11 17:01 test1.txt
-rw-r--r-- 1 bluelake bluelake 3227 Feb 11 17:01 test2.txt
-rw-r--r-- 1 bluelake bluelake 7392 Feb 11 17:01 test3.txt
From the results, we can see that each line is processed separately.
With the -I option, xargs takes one line of input at a time and substitutes it for the {} placeholder in the command. We can then process that line; here, we simply echo it to standard output.
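Because {} is just a placeholder, it can also appear inside a longer argument. Here’s a minimal sketch, with a hypothetical list_output function standing in for ls -l and a made-up prefix_lines wrapper. It assumes the input lines contain no quotes or backslashes, which xargs would otherwise treat specially.

```shell
# Hypothetical stand-in for `ls -l` with fixed sample lines.
list_output() {
    printf '%s\n' \
        'total 16' \
        '-rw-r--r-- 1 bluelake bluelake 3278 Feb 11 17:01 test1.txt'
}

# Runs echo once per input line, replacing {} with that line.
prefix_lines() {
    xargs -I{} echo "line: {}"
}

list_output | prefix_lines
```

Each input line becomes a single argument to echo, so the spaces inside a line don’t split it into separate words.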
7. Using the parallel Command
Finally, we can use the GNU parallel command, which runs tasks in parallel so that one task doesn’t wait for another to get started. Let’s see an example of that:
$ ls -l | parallel --jobs 4 echo
total 16
-rw-r--r-- 1 bluelake bluelake 3278 Feb 11 17:01 test1.txt
-rw-r--r-- 1 bluelake bluelake 3227 Feb 11 17:01 test2.txt
-rw-r--r-- 1 bluelake bluelake 7392 Feb 11 17:01 test3.txt
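Here, we’ve used the --jobs option to run up to four jobs in parallel. This comes in handy when we have to process several big files at once. Because the jobs run concurrently, the output order can differ from the input order; GNU parallel’s --keep-order option preserves it.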
8. Conclusion
In this tutorial, we’ve seen the different ways we can iterate over the output of the ls command. Even though this is done for output from the ls command, we can apply the same techniques to other commands as well when we need to read the output line by line.