1. Introduction
Many shells offer features that help ease text processing and data propagation, like passing the output of one command as an argument to another.
In this tutorial, we explore ways to group functionally related parts of text input with only newlines as their separator. First, we describe a problem that shell separators can cause. Next, we turn to a standard solution. After that, we discuss a newer shell feature. Finally, we go back to a simple loop with a builtin.
Because the results of unquoted expansions also undergo pathname expansion (globbing), temporarily turning it off using set with its -f flag is usually safer when it comes to raw text processing.
We tested the code in this tutorial on Debian 11 (Bullseye) with GNU Bash 5.1.4. The $IFS-based splitting should work in most POSIX-compliant shells, while builtins such as declare and mapfile are specific to Bash.
2. Shell Separator Problems
Let’s start with a composite command:
$ declare $(printf 'A=[L1]\nB=[L2]')
$ echo $A $B
[L1] [L2]
Here, the declare shell builtin in Bash uses the output of printf from a subshell to initialize variables.
In this case, the separator is a newline. Let’s make that more visible by replacing the \n escape sequence:
$ declare $(printf 'A=[L1]
B=[L2]'
)
Thus, the shell sees a proper declare command:
$ declare 'A=[L1]' 'B=[L2]'
Notably, the quotes are around valid declare expressions.
As a result, the variable values are also correct:
$ echo $A $B
[L1] [L2]
Yet, this code can backfire when we introduce more potential separators, such as whitespace:
$ declare $(printf 'A=[L 1]\nB=[L 2]')
-bash: declare: `1]': not a valid identifier
-bash: declare: `2]': not a valid identifier
In this case, there are spaces between L and the numbers, so declare is called differently:
$ declare 'A=[L' '1]' 'B=[L' '2]'
Here, the quotes surround valid (A=[L) and invalid (1]) declaration expressions. Hence, we see errors, and the final values are incorrect:
$ echo $A $B
[L [L
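To make the splitting itself visible, we can print every argument that a command receives. The following is only a minimal sketch: show_args is a hypothetical helper name that doesn't appear elsewhere in this tutorial, and we assume the default value of $IFS.

```shell
# show_args (hypothetical helper) prints each argument it receives
# on its own line, wrapped in angle brackets.
show_args() {
  local arg
  for arg in "$@"; do
    printf '<%s>\n' "$arg"
  done
}

# Unquoted command substitution: with the default $IFS, the output is
# split at spaces, tabs, and newlines, so we get four arguments.
show_args $(printf 'A=[L 1]\nB=[L 2]')

# Quoted command substitution: no splitting at all, so we get a single
# argument that still contains the newline.
show_args "$(printf 'A=[L 1]\nB=[L 2]')"
```

The first call prints four short fragments, which matches what declare received above. The second call prints one two-line argument, which shows that quoting alone doesn't give us a split at newlines only.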
So, how can we ignore other potential separators in favor of only one, such as the newline?
3. Changing $IFS
Bash provides the $IFS (Internal Field Separator) special shell variable. It dictates which characters act as separators when the shell splits the results of unquoted expansions into words.
When using unquoted command substitution, $IFS determines how the output is split into separate arguments.
In our earlier example, we can fix the problem by changing the default value of $IFS (space, tab, and newline) to a newline alone:
$ IFS=$'\n'
$ declare $(printf 'A=[L 1]\nB=[L 2]')
$ echo $A $B
[L 1] [L 2]
Now, the previously failing command works fine. Critically, this solution ignores empty lines, because a run of consecutive newlines counts as a single separator.
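We can check the behavior with empty lines by counting the resulting fields. This is a small sketch under our own assumptions: the field_count variable name and the sample input are only for illustration.

```shell
# Count the fields produced by newline-only splitting.
# The command substitution runs in a subshell, so the changes to $IFS
# and to globbing don't leak into the calling shell.
field_count=$(
  IFS=$'\n'
  set -f    # keep characters like [ and ] from being treated as patterns
  # The input has two empty lines in the middle and two trailing newlines.
  set -- $(printf 'A=[L 1]\n\n\nB=[L 2]\n\n')
  echo "$#"
)
echo "$field_count"
```

The result is 2: the empty lines in the middle produce no fields of their own, and command substitution strips the trailing newlines before any splitting happens.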
Normally, we want to preserve and restore the value of $IFS as it affects many aspects of the shell behavior:
$ BIFS="$IFS"
[...]
$ IFS="$BIFS"
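As an alternative to saving and restoring by hand, we can confine the change to a function. The sketch below is one possible approach, not the only one: split_lines and the fields array are hypothetical names, and we also turn off globbing for the duration of the split.

```shell
# split_lines (hypothetical helper) fills the global array "fields"
# with the newline-separated pieces of its first argument.
split_lines() {
  local IFS=$'\n'      # the new value is visible only inside this function
  local restore_glob=

  # Turn off globbing only if it is currently on, and remember to undo it.
  case $- in
    *f*) ;;
    *) set -f; restore_glob=1 ;;
  esac

  fields=( $1 )        # unquoted on purpose: split at newlines only

  if [[ -n $restore_glob ]]; then
    set +f
  fi
}

split_lines $'A=[L 1]\nB=*'
printf '<%s>\n' "${fields[@]}"
```

Here, the caller's $IFS is never touched, so there is nothing to restore afterward. With globbing off, the asterisk and the square brackets stay literal even if matching files exist in the current directory.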
In addition, we can apply our solution with a file and command substitution:
$ BIFS="$IFS"
$ IFS=$'\n'
$ cat file
A=[L 1]
B=[L 2]
$ declare $(< file)
$ echo $A $B
[L 1] [L 2]
$ IFS="$BIFS"
Yet, when it comes to files and redirection, we also have other options.
4. Using mapfile
The mapfile Bash builtin can read lines from stdin or a file descriptor into an array.
For example, we can read a whole file with one command:
$ cat file
A=[L 1]
B=[L 2]
$ mapfile -t lines < file
$ echo "${lines[0]}" "${lines[1]}"
A=[L 1] B=[L 2]
$ declare "${lines[@]}"
$ echo $A $B
[L 1] [L 2]
Here, -t removes the line separators, as the array elements don’t need them. If we employ a different delimiter, -d can specify it.
Importantly, mapfile preserves empty lines as empty elements in the resulting array.
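To illustrate both points, we can feed mapfile some inline data. This is a sketch with assumed sample input: the parts array name is hypothetical, and the -d option requires Bash 4.4 or later.

```shell
# Three input lines, the middle one empty.
mapfile -t lines < <(printf 'A=[L 1]\n\nB=[L 2]\n')
echo "${#lines[@]}"              # the empty line counts as an element
printf '<%s>\n' "${lines[@]}"

# A different delimiter: split at colons instead of newlines.
mapfile -t -d ':' parts < <(printf 'one:two:three')
echo "${#parts[@]}"
printf '<%s>\n' "${parts[@]}"
```

Both arrays end up with three elements. Since the empty line becomes an empty element, we may need to filter such elements out before passing the array to a command like declare, which rejects an empty argument.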
Since mapfile was only introduced in Bash 4.0, we can encounter an environment that doesn’t support this builtin. In these cases, we can use a classic alternative.
5. Using a Loop With read
A typical way to read a file line by line is a while loop:
$ while IFS= read -r line; do echo "$line"; done < file
A=[L 1]
B=[L 2]
Let’s break down this construct:
- $IFS is set to an empty value, but only for each read call
- read stores each whole [-r]aw input line into the $line variable without interpreting backslash escapes
- echo outputs the quoted $line on every iteration
Because $IFS is empty while read runs, no field splitting or trimming takes place, so we get whole lines with their leading and trailing whitespace intact.
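To see what each part of the construct contributes, we can read the same line three ways. This is a minimal sketch with a made-up sample value, and the variable names raw, trimmed, and cooked are only for illustration.

```shell
# A sample line with leading and trailing spaces and literal backslashes.
sample='  path\to\file  '

IFS= read -r raw     <<< "$sample"   # empty $IFS and -r: the line is kept as-is
read -r      trimmed <<< "$sample"   # default $IFS: the edges are trimmed
IFS= read    cooked  <<< "$sample"   # no -r: the backslashes are consumed

printf '<%s>\n' "$raw" "$trimmed" "$cooked"
```

Only the first variant keeps the line unchanged. The second loses the surrounding spaces, and the third drops the backslashes, leaving pathtofile between the spaces.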
Now, let’s leverage this for our needs:
$ cat file
A=[L 1]
B=[L 2]
$ lines=()
$ while IFS= read -r line; do lines+=("$line"); done < file
$ echo "${lines[0]}" "${lines[1]}"
A=[L 1] B=[L 2]
$ declare "${lines[@]}"
$ echo $A $B
[L 1] [L 2]
In essence, this example is equivalent to the one with mapfile but uses the while construct we described. In addition, empty lines are again preserved as array elements.
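One difference from mapfile is worth a sketch: when the input doesn't end with a newline, read returns a non-zero status for the last line, so a plain loop skips it. A common idiom, shown here with assumed inline data instead of a file, adds a check for a non-empty $line.

```shell
lines=()

# The extra test keeps a final line that has no trailing newline:
# read fails at end of input but still stores what it read in $line.
while IFS= read -r line || [[ -n $line ]]; do
  lines+=("$line")
done < <(printf 'A=[L 1]\nB=[L 2]')   # note: no newline after the last line

echo "${#lines[@]}"
printf '<%s>\n' "${lines[@]}"
```

With the extra test, the array holds both lines. Without it, the loop would stop before storing the second one.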
6. Summary
In this article, we looked at ways to split text only at newlines in the shell.
In conclusion, we can change the value of $IFS alone, but there are also solutions like mapfile and a simple read loop, which leave the global $IFS untouched and preserve empty lines.