1. Overview
It’s a common requirement to convert whitespace to tabs and vice versa. Programmers often need to do this to follow a project’s coding guidelines. Fortunately, most popular IDEs and editors provide built-in support for it.
In this tutorial, we’ll discuss some of the ways to replace whitespaces with tabs from the command line.
2. Setup
In general, whitespace includes both horizontal characters (such as space and TAB) and vertical characters (such as newline). Further, the Unicode character set defines some additional whitespace characters.
In this tutorial, when we mention whitespace, we mean the ASCII horizontal whitespace characters: space and TAB.
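In POSIX tools, the [:blank:] character class corresponds to this definition: in the default C locale, it matches the space and the TAB character. Here's a minimal sketch to check that, using a made-up sample string and variable names chosen only for illustration:

```shell
# Hypothetical sample containing one space and one TAB.
sample=$(printf 'a b\tc')

# Delete every character in the [:blank:] class (space and TAB).
stripped=$(printf '%s' "$sample" | tr -d '[:blank:]')

# Only the letters remain.
printf '%s\n' "$stripped"
```

Both the space and the TAB are removed, which is why the later examples can use this class to cover all horizontal whitespace.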
Now, let’s create a simple text file with some whitespaces to use as an example:
$ cat --show-tabs input.txt
The  quick     brown   fox   jumps   over
   the lazy       dog
In the above example, we’ve used the --show-tabs option of the cat command. If there were any TAB characters in our input file, they’d show up as ^I. Note that there aren’t any tabs in the input.txt file. Hence, we don’t see any ^I characters in the output.
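To follow along, we can recreate the sample file with printf. The exact lengths of the space runs below are an assumption, chosen to be consistent with the number of ^I characters shown in the outputs of the following sections:

```shell
# Recreate the sample file; the spacing is inferred, not taken from an original file.
printf 'The  quick     brown   fox   jumps   over\n' > input.txt
printf '   the lazy       dog\n' >> input.txt

# Keep only the TAB characters and count them.
tab_count=$(tr -cd '\t' < input.txt | wc -c)
echo "TAB characters: $((tab_count))"
```

The count is zero, which matches what cat --show-tabs reports for this file.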
3. Using the tr Command
The tr command is useful when we want to translate or delete characters. We can use it to convert spaces to TAB characters:
$ tr " " "\t" < input.txt > output.txt
$ cat --show-tabs output.txt
The^I^Iquick^I^I^I^I^Ibrown^I^I^Ifox^I^I^Ijumps^I^I^Iover
^I^I^Ithe^Ilazy^I^I^I^I^I^I^Idog
In this example, we’re replacing each space with a TAB character. However, sometimes, the requirement is to replace multiple spaces with a single TAB character. We can easily achieve this using the -s option of the tr command:
$ tr -s " " "\t" < input.txt > output.txt
$ cat --show-tabs output.txt
The^Iquick^Ibrown^Ifox^Ijumps^Iover
^Ithe^Ilazy^Idog
In this example, -s represents the squeeze-repeats operation. The tr command first translates each space to a TAB character and then squeezes each run of repeated TAB characters into a single one.
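Because the squeeze is applied after the translation, runs of TAB characters that were already present in the input get squeezed too. Here's a small sketch with a made-up input, in which the TABs are shown as > characters to make them visible:

```shell
# Made-up sample: two spaces, then two pre-existing TABs.
sample=$(printf 'a  b\t\tc')

# tr translates each space to a TAB, then -s squeezes every run of TABs.
result=$(printf '%s' "$sample" | tr -s ' ' '\t')

# Make the TABs visible by turning them into '>' characters.
printf '%s\n' "$result" | tr '\t' '>'
```

This prints a>b>c, so both the converted spaces and the original TAB pair end up as a single TAB each. If existing TAB runs must be preserved, this command isn't a good fit.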
4. Using the awk Command
The awk command is an interpreter for the AWK programming language. It’s a very powerful tool for performing complex text processing. With the help of the awk command, we can easily convert whitespaces to TAB characters.
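By default, the Field Separator (FS) is a single space, which AWK treats specially: it behaves much like the regex [ \t\n]+, except that leading and trailing whitespace is ignored as well. The default Output Field Separator (OFS) is a single space character.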
We can set the two variables to solve our problem:
$ awk -F'[[:blank:]]' -v OFS="\t" '{$1=$1; print}' input.txt > output.txt
$ cat --show-tabs output.txt
The^I^Iquick^I^I^I^I^Ibrown^I^I^Ifox^I^I^Ijumps^I^I^Iover
^I^I^Ithe^Ilazy^I^I^I^I^I^I^Idog
In the command above, we’re setting the TAB character as an Output Field Separator. Also, we set one single horizontal whitespace character as the Field Separator.
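Therefore, awk reads fields separated by a single whitespace character and outputs them separated by TABs. A run of several spaces produces empty fields in between, which is why we see several consecutive TABs in the output.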
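We can see the difference between the two field-splitting modes by counting fields. In this sketch, the sample line is made up, and [ \t] is used as an explicit ASCII equivalent of [[:blank:]]:

```shell
# Made-up sample: "a" and "b" separated by two spaces.
# With a single-blank FS, the empty string between the two spaces is a field too.
fields_single=$(printf 'a  b\n' | awk -F'[ \t]' '{ print NF }')

# With the default FS, the whole run of spaces acts as one separator.
fields_default=$(printf 'a  b\n' | awk '{ print NF }')

echo "single-blank FS: $fields_single, default FS: $fields_default"
```

The first command reports three fields and the second reports two, so the single-blank separator yields one TAB per space, while the default separator yields one TAB per run.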
If we don’t set the FS variable, awk will replace multiple whitespace characters with a single TAB character:
$ awk -v OFS="\t" '{$1=$1; print}' input.txt > output.txt
$ cat --show-tabs output.txt
The^Iquick^Ibrown^Ifox^Ijumps^Iover
the^Ilazy^Idog
So far, we’ve solved the problem using awk.
However, curious eyes may spot that “$1=$1” looks strange since it seems to do nothing.
Actually, it’s the key to both awk commands. When we assign to a field, even if the value doesn’t change, awk rebuilds the record ($0) by joining its fields with the current OFS. Here, we want awk to apply our customized OFS to the record. Therefore, we reassign a field to trigger the rebuild.
If we print the record without setting at least one field, awk won’t apply the new OFS to the record:
$ awk -v OFS="\t" '{print}' input.txt | cat --show-tabs
The  quick     brown   fox   jumps   over
   the lazy       dog
As we can see from the output above, there is no TAB in the output from awk, although we’ve set OFS="\t". awk outputs the file content as is, without any changes.
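The effect is easier to see side by side. The following sketch uses a made-up one-line input and a dash as OFS, purely so that the separator is visible without cat --show-tabs:

```shell
# Made-up sample with two spaces between the words.
line='a  b'

# Without touching a field, $0 is printed exactly as it was read.
untouched=$(printf '%s\n' "$line" | awk -v OFS='-' '{ print }')

# Assigning $1 to itself makes awk rebuild $0 from its fields, joined by OFS.
rebuilt=$(printf '%s\n' "$line" | awk -v OFS='-' '{ $1 = $1; print }')

echo "untouched: [$untouched]"
echo "rebuilt:   [$rebuilt]"
```

The first result keeps both spaces, whereas the second one becomes a-b.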
5. Using the sed Command
sed is a stream editor for filtering and transforming text. We can use its substitute command for converting whitespaces to TABs:
$ sed 's/[[:blank:]]/\t/g' input.txt > output.txt
$ cat --show-tabs output.txt
The^I^Iquick^I^I^I^I^Ibrown^I^I^Ifox^I^I^Ijumps^I^I^Iover
^I^I^Ithe^Ilazy^I^I^I^I^I^I^Idog
In this example, the ‘s’ character represents the substitute command, whereas ‘g’ represents the global flag that performs the operation on all matched patterns.
We can add the “one or more” quantifier to the pattern to convert multiple whitespaces to single TAB characters:
$ sed 's/[[:blank:]]\+/\t/g' input.txt > output.txt
$ cat --show-tabs output.txt
The^Iquick^Ibrown^Ifox^Ijumps^Iover
^Ithe^Ilazy^Idog
Since sed uses basic regular expressions (BRE) by default, we need to escape the ‘+’ character to give it its special meaning: matching one or more occurrences of whitespace.
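The \+ quantifier and the \t escape in the replacement are GNU sed extensions, so the commands above may behave differently with other sed implementations. As a hedged alternative, here's a sketch that sticks to plain BRE syntax and passes a literal TAB through a shell variable. The variable names and the sample input are made up for illustration:

```shell
# Store a literal TAB in a variable (the name "tab" is just for this sketch).
tab=$(printf '\t')

# [[:blank:]][[:blank:]]* is the plain BRE spelling of "one or more blanks".
result=$(printf 'a  b   c\n' | sed "s/[[:blank:]][[:blank:]]*/${tab}/g")

# Show TABs as '>' so the result is easy to read.
printf '%s\n' "$result" | tr '\t' '>'
```

This prints a>b>c: each run of spaces has been replaced by one TAB.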
6. Using the vim Editor
Vim is one of the most popular and powerful text editors in Linux. It has support for multiple modes. We can use its Ex commands for character conversion:
$ cat --show-tabs input.txt
The  quick     brown   fox   jumps   over
   the lazy       dog
$ vim input.txt
:%s/\s/\t/g # execute this command in Vim's ex mode
:wq # execute this command in Vim's ex mode
$ cat --show-tabs input.txt
The^I^Iquick^I^I^I^I^Ibrown^I^I^Ifox^I^I^Ijumps^I^I^Iover
^I^I^Ithe^Ilazy^I^I^I^I^I^I^Idog
We can adjust the :s command slightly to replace multiple whitespaces with a single TAB character:
:%s/\s\+/\t/g
:wq
$ cat --show-tabs input.txt
The^Iquick^Ibrown^Ifox^Ijumps^Iover
^Ithe^Ilazy^Idog
Vim uses the “magic” setting for regex patterns by default. Therefore, we need to escape the ‘+’ character to give it its special meaning: match the preceding pattern one or more times.
Vim supports automatically executing some Ex commands after reading the file:
vim "+ExCommand" "+ExCommand" "+ExCommand" .. file
That is to say, apart from opening the file in the Vim editor and executing the :s command interactively, we can also use Vim as a text processing command to do the replacement:
$ vim "+%s/\s\+/\t/g" "+wq" input.txt
$ cat --show-tabs input.txt
The^Iquick^Ibrown^Ifox^Ijumps^Iover
^Ithe^Ilazy^Idog
7. Conclusion
In this tutorial, we discussed various practical examples to convert spaces to TABs.
First, we saw examples of the tr command. Then, we used the awk and sed commands. Finally, we saw the usage of the Vim editor. We can use these examples in our daily life to boost our productivity.