1. Overview
In Linux, there are a lot of command-line utilities for text manipulation at our disposal. In this tutorial, we’ll discuss the tr command.
2. Introduction to the tr Command
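tr is short for “translate”. It is part of the GNU coreutils package, so it’s available out of the box on virtually every Linux distribution. Some minimal systems ship another implementation instead, such as the BusyBox one.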
The tr command reads a byte stream from standard input (stdin), translates or deletes characters, then writes the result to the standard output (stdout).
The usage syntax of tr is pretty straightforward:
tr [OPTION]... SET1 [SET2]
If we don’t pass any options to tr, it will replace each character in SET1 with the character at the same position in SET2.
Since tr doesn’t support reading a file directly, if we want to apply it to a text file, we need to pipe the file content to tr or redirect the file to stdin.
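As a quick illustration of the ways to feed text to tr, here is a minimal sketch. The temporary file and the variable names are made up for this example, and a here-document is shown as a third option for inline text:

```shell
# Throwaway sample file, created only for this sketch.
sample_file=$(mktemp)
printf 'hello tr\n' > "$sample_file"

# 1. Redirect the file to tr's stdin.
from_redirect=$(tr 'a-z' 'A-Z' < "$sample_file")

# 2. Pipe the file content to tr.
from_pipe=$(cat "$sample_file" | tr 'a-z' 'A-Z')

# 3. Feed inline text to tr through a here-document.
from_heredoc=$(tr 'a-z' 'A-Z' <<EOF
hello tr
EOF
)

echo "$from_redirect"
echo "$from_pipe"
echo "$from_heredoc"

rm -f "$sample_file"
```

All three variables should end up holding the same uppercase text, because tr only ever sees a byte stream on stdin, no matter where it comes from.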
We can use tr to perform text transformations like:
- character case conversion
- squeezing repeating characters
- deleting specific characters
- basic text replacement
Let’s go through some examples to learn how to manipulate text using tr.
3. Convert Lowercase to Uppercase
We’ll start with a basic problem: converting all lowercase characters to uppercase in a file.
Let’s have a look at how to solve the problem using the tr command:
$ cat baeldung.url
www.baeldung.com
$ tr 'a-z' 'A-Z' < baeldung.url
WWW.BAELDUNG.COM
In the example above, we redirected the file baeldung.url to stdin and used a character range in both SET1 and SET2 to do the case conversion.
Alternatively, we can solve the problem using a couple of built-in character classes:
$ tr '[:lower:]' '[:upper:]' < baeldung.url
WWW.BAELDUNG.COM
Because tr will only write the result to stdout, after executing it, our baeldung.url file is not changed. If we want the translated result to be written back to the input file, we can redirect stdout to a temporary file and then rename and overwrite the input file:
$ tr 'a-z' 'A-Z' < baeldung.url >tmp.txt && mv tmp.txt baeldung.url
$ cat baeldung.url
WWW.BAELDUNG.COM
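We can't simply write tr ... < file > file, because the shell truncates the output file before tr gets a chance to read it. The following is a minimal sketch of the temporary-file approach wrapped in a function. The name upper_in_place and the demo file are hypothetical, and we assume mktemp is available:

```shell
# Hypothetical helper: uppercase a file "in place" via a temporary file.
upper_in_place() {
    target=$1
    tmp_file=$(mktemp) || return 1
    # Replace the original only if tr succeeded.
    if tr '[:lower:]' '[:upper:]' < "$target" > "$tmp_file"; then
        mv "$tmp_file" "$target"
    else
        rm -f "$tmp_file"
        return 1
    fi
}

# Demo on a throwaway file.
demo_file=$(mktemp)
printf 'www.baeldung.com\n' > "$demo_file"
upper_in_place "$demo_file"
demo_result=$(cat "$demo_file")
echo "$demo_result"
rm -f "$demo_file"
```

Note that mv replaces the original file with the temporary one, so the result takes on the temporary file's permissions. If that matters, copying the content back with cat "$tmp_file" > "$target" is an alternative.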
4. Basic Find and Replace
The tr utility is handy for some simple “find and replace” operations where one character should be replaced with another. For example, let’s replace all hyphens with underscores in a file:
$ cat env.txt
$JAVA-HOME and $MAVEN-HOME are system variables.
$ cat env.txt | tr '-' '_'
$JAVA_HOME and $MAVEN_HOME are system variables.
Instead of using redirection, we used the cat command to pipe the content of the file env.txt to tr.
In addition to finding and replacing a single character, tr can replace multiple characters at once. Let’s see another example that translates braces into parentheses:
$ echo "{baeldung}" | tr '{}' '()'
(baeldung)
We can also use tr to do character range translation. Now let’s see an example of encrypting and decrypting a secret message (“this is a secret message”) using tr.
In this example, we’ll use the simple Caesar cipher as our encryption algorithm: each letter in the input text is replaced by the letter a fixed number of positions further down the alphabet. With a shift of four, “a” becomes “e”, “b” becomes “f”, and so on, wrapping around at the end of the alphabet:
$ echo "this is a secret message" | tr 'a-z' 'e-zabcd' > secret.txt
$ cat secret.txt
xlmw mw e wigvix qiwweki
To decrypt the secret file, we exchange SET1 and SET2 in the tr command above:
$ tr 'e-zabcd' 'a-z' < secret.txt
this is a secret message
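The same idea gives us ROT13, a Caesar cipher with a shift of 13. Since 13 is half the alphabet, applying it twice restores the original text, so one command both encrypts and decrypts. Here is a minimal sketch. The function name rot13 is our own choice, and we extend the ranges to cover uppercase letters too:

```shell
# ROT13: shift every ASCII letter by 13 positions, preserving case.
rot13() {
    tr 'A-Za-z' 'N-ZA-Mn-za-m'
}

encoded=$(echo "this is a secret message" | rot13)
# Running the same translation again undoes it.
decoded=$(echo "$encoded" | rot13)

echo "$encoded"   # guvf vf n frperg zrffntr
echo "$decoded"   # this is a secret message
```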
5. Truncate a Search Pattern
If we review the examples in the previous section, we notice that SET1 and SET2 we passed to tr always had the same length.
Let’s see what tr will give us if SET2 is shorter than SET1:
$ echo "abcdefg" | tr 'abcdefg' 'ABC'
ABCCCCC
When SET2 is shorter than SET1, GNU tr pads SET2 by default, repeating its last character until both sets have the same length. In the output above, the last letter of SET2, “C”, is reused for the letters “d” to “g”, so the command effectively becomes tr 'abcdefg' 'ABCCCCC'. This padding isn’t portable: POSIX leaves the result undefined when SET1 is longer than SET2.
We can use the truncate option “-t” to change this default behavior, to let tr limit the matching to the length of SET2:
$ echo "abcdefg" | tr -t 'abcdefg' 'ABC'
ABCdefg
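To compare the behaviors side by side, here is a small sketch that assumes GNU tr. Besides the implicit padding and the “-t” option, it uses the [C*] construct, which GNU tr documents as “copies of C until the length of SET1” when used in SET2, so that the padding is spelled out explicitly. The variable names are ours:

```shell
# Implicit padding: the last character of SET2 is repeated (GNU default).
implicit_pad=$(echo "abcdefg" | tr 'abcdefg' 'ABC')

# Explicit padding: [C*] in SET2 repeats C until SET2 is as long as SET1.
explicit_pad=$(echo "abcdefg" | tr 'abcdefg' 'AB[C*]')

# Truncation: -t first cuts SET1 down to the length of SET2.
truncated=$(echo "abcdefg" | tr -t 'abcdefg' 'ABC')

echo "$implicit_pad"   # ABCCCCC
echo "$explicit_pad"   # ABCCCCC
echo "$truncated"      # ABCdefg
```

Writing the padding explicitly makes the intent visible to readers and doesn't rely on what an implementation does with a short SET2.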
6. Squeeze Repeating Characters
We can collapse each run of a repeated character into a single occurrence using tr with the squeeze option “-s”.
Let’s see an example of converting multiple consecutive spaces to a single space:
$ echo 'Hi,  nice   to    meet     you!' | tr -s ' '
Hi, nice to meet you!
If we pass the “-s” option together with SET1 and SET2 to tr, it will first do the translation, then squeeze repeats of the characters listed in SET2. For example:
$ echo 'TODAYYYY IIIS SOOO COOOLD ~' | tr -s 'A-Z' 'a-z'
today is so cold ~
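A common use of translating and squeezing in one step is to put each word on its own line. The sketch below uses our own sample text and variable names: it turns every space into a newline, and “-s” then collapses the resulting runs of newlines:

```shell
# Translate spaces to newlines, then squeeze repeated newlines into one.
words=$(printf 'Hi,   nice  to    meet you!\n' | tr -s ' ' '\n')

# Count the resulting lines, i.e. the words.
word_count=$(printf 'Hi,   nice  to    meet you!\n' | tr -s ' ' '\n' | wc -l)

echo "$words"
echo "words: $((word_count))"
```

Without “-s”, every extra space would produce an empty line in the output.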
7. Delete Specific Characters
We can pass the “-d” option to tr to delete characters in SET1.
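Unlike the “-s” option, “-d” doesn’t translate anything, so it takes only SET1. GNU tr rejects a SET2 with an “extra operand” error when “-d” is used on its own. A second set is accepted only when “-d” is combined with “-s”, in which case SET1 lists the characters to delete and SET2 the characters to squeeze.
For example, let’s delete all lowercase letters from the input text: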
$ echo "A a B b C c" | tr -d 'a-z'
A  B  C
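Another everyday use of “-d” is stripping carriage returns from text with Windows (CRLF) line endings. This is a minimal sketch. The helper name strip_cr is hypothetical, and the input is simulated with printf rather than read from a real file:

```shell
# Hypothetical helper: remove every carriage return from stdin.
strip_cr() {
    tr -d '\r'
}

# Simulate two lines with CRLF line endings and clean them up.
unix_text=$(printf 'first line\r\nsecond line\r\n' | strip_cr)

echo "$unix_text"
```

tr understands backslash escapes such as \r and \n inside a set, which is why the carriage return can be written as '\r'.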
8. Search for the Complement of SET1
We can pass the “-c” option to tr to make it operate on the complement of SET1, that is, on every character that is not in SET1.
Sometimes this option can simplify the definition of SET1.
For example, let’s match any character that is not a lowercase letter and translate it into a space:
$ echo "tr@is#aMvery~handy tool" | tr -c 'a-z' ' '
tr is a very handy tool
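One detail is easy to miss: the newline that echo appends is not a lowercase letter either, so it gets translated into a space as well. The sketch below, with variable names of our own, shows the difference when we add \n to SET1 to protect it:

```shell
# The trailing newline is outside 'a-z', so it becomes a space too.
without_newline=$(echo "tr@is#aMvery~handy tool" | tr -c 'a-z' ' ')

# Adding \n to SET1 keeps the line break intact.
with_newline=$(echo "tr@is#aMvery~handy tool" | tr -c 'a-z\n' ' ')

# Brackets make the trailing space visible.
printf '[%s]\n' "$without_newline"   # [tr is a very handy tool ]
printf '[%s]\n' "$with_newline"      # [tr is a very handy tool]
```

In the second case the output still ends with a newline, which the command substitution strips.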
We can also combine the “-c” option with “-d” or “-s”. The next example shows how we can extract the ID numbers from the customer.csv file:
$ cat customer.csv
Name,Country,Id
James,USA,1234567890
John,Canada,0987654321
$ cat customer.csv | tr -cd '0-9\n'
1234567890
0987654321
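In the example above, we must include the newline character (\n) in SET1. Otherwise, all line breaks in the input file would be deleted too, and all ID numbers would be concatenated on a single line, which is not what we want. Note that the header line contains no digits, so only its newline survives, and it shows up as an empty first line in the actual output.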
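The combination of “-c” and “-s” works in a similar way. As a sketch with our own sample sentence, the following turns every run of non-alphabetic characters into a single newline, which leaves one word per line:

```shell
# -c: operate on everything that is NOT a letter;
# -s: squeeze the resulting runs of newlines into one.
word_list=$(echo "Hello, world! Hello again." | tr -cs '[:alpha:]' '\n')

echo "$word_list"
```

Here the translation happens first, and the squeeze is applied to the newline in SET2, so ", " and "! " each collapse into one line break.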
9. Conclusion
In this article, we’ve learned how to use the tr command through various examples.
The tr command is a good choice if we need to do some fundamental text transformation, such as case conversion or squeezing repeating characters.
However, if we are facing complex text processing problems, we should think about more powerful utilities like awk or sed.