1. Introduction
Source code, command lines, and most computer interaction at its most basic level consist of characters. On the other hand, most characters are not represented by keys on a regular keyboard, many are not printable at all, and yet another group are complex control characters.
In this tutorial, we’ll discuss character escaping in Bash. First, we briefly describe how machines represent characters. After that, we explore types of strings in Bash. Next, we discuss character escaping in pure Bash in detail. Finally, we look at specific cases where escaping is involved.
We tested the code in this tutorial on Debian 11 (Bullseye) with GNU Bash 5.1.4. Since several examples rely on Bash-specific features, they should work in most environments with a recent version of Bash, but not necessarily in other POSIX shells.
2. Characters
Usually, apart from a pointer, users have one other method to enter data – text. To represent text, machines use a succession of bytes. They encode characters based on a predefined code table, usually ASCII or Unicode.
Types of characters roughly include:
- printable characters
- non-printable control characters
Since we’re dealing a lot with characters, in this article, we use the angle bracket notation to represent them with names from these tables.
Importantly, we must have a way to write out any character we need, be it ASCII, Unicode, or a custom encoding. Unfortunately, we have a minimal set of keyboard keys to represent many different text symbols.
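To make the link between a character and its code concrete, here is a minimal sketch that uses the printf built-in to convert in both directions. The variable names are only illustrative, and we assume an ASCII-compatible locale:

```shell
#!/usr/bin/env bash
# A leading quote in a numeric printf argument yields the code of the
# character that follows it.
code_dec=$(printf '%d' "'a")   # decimal code of "a"
code_hex=$(printf '%x' "'a")   # the same code in hexadecimal

# Go the other way: build the character from its hexadecimal code.
char=$(printf '\x61')

printf 'dec=%s hex=%s char=%s\n' "${code_dec}" "${code_hex}" "${char}"
# prints: dec=97 hex=61 char=a
```

The second direction is what makes it possible to write characters that have no key on the keyboard.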
3. Bash Strings
Writing and storing characters are two separate actions because no keyboard has keys for every possible symbol.
In Bash, text is stored as strings. In fact, Bash variables essentially hold strings of characters. We usually write them as unquoted, single-quoted, or double-quoted sequences.
Importantly, the difference between these methods is that we interpret or interpolate certain combinations of characters in one context and take them literally in another.
3.1. Single Quotes
Within single quotes, we don’t interpolate anything:
$ text='a $(echo b) c'
$ echo "${text}"
a $(echo b) c
Note how all text within the single quotes is preserved. No interpolation is done, but this also means we can’t, under any circumstances, place a single quote directly within a single-quoted string.
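Since a single quote can’t appear inside single quotes, we need a workaround whenever the text contains one. The sketch below shows three common options; the variable names are only illustrative:

```shell
#!/usr/bin/env bash
# Option 1: close the quote, add an escaped single quote, reopen the quote.
text1='it'\''s'

# Option 2: ANSI-C quoting, where \' stands for a single quote.
text2=$'it\'s'

# Option 3: put the single quote inside double quotes.
text3="it's"

printf '%s\n' "${text1}" "${text2}" "${text3}"
```

All three assignments produce the same string, it's. Which one reads best usually depends on what else the string contains.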
3.2. Double Quotes
When using double quotes, we preserve the literal value of most characters:
$ text="a"
$ text="${text} $(echo "b") c"
$ echo "${text}"
a b c
3.3. No Quotes
As long as the string adheres to certain rules, we can skip the quotes:
$ text=a$(echo b)c
$ echo ${text}
abc
We discuss some of the rules in the next section.
3.4. Special Quoting
Double-quoting text with a $ prefix causes the string to be translated according to the current locale.
Single-quoting text with the same prefix is treated differently. In this case, escaped characters are replaced.
In the next section, we’ll clarify what escaping means.
4. Bash Character Escaping
Except within single quotes, characters with special meanings in Bash have to be escaped to preserve their literal values. In practice, this is mainly done with the escape character, the backslash (\).
Let’s see when and how we use which method.
4.1. Double Quotes
We escape text inside double-quoted strings by prefixing a character with a backslash:
$ text1="a $(echo b) c"
$ text2="a \$(echo b) c"
$ echo "${text1}"
a b c
$ echo "${text2}"
a $(echo b) c
Note how, in the case of text2, the escaped $ is taken literally, so the command substitution isn’t performed.
These are all special characters, which may have to be escaped to preserve their literal meaning within double quotes:
- $, e.g. $() and ${}
- `, also known as the backquote operator
- ", when we need a double quote within double quotes
- newline, which is equivalent to <LF> under Linux
- \, when it precedes $, `, ", \, or a newline, because it then acts as the escape character
- !, when history expansion is enabled outside POSIX mode, which is the default in interactive shells
- ~, when beginning an unquoted string, to avoid tilde expansion and confusion with the $HOME directory (within double quotes, ~ is already literal)
Furthermore, the backslash is preserved when it precedes a character without a special meaning:
$ text="!event"
bash: !event: event not found
$ text="\a \$ \` \!event \\"
$ echo ${text}
\a $ ` \!event \
Importantly, we can prevent the ! character from triggering history expansion in several ways:
- prefixing it with a backslash (which remains in the string, same as before an ordinary character)
- using it at the end of a string or before whitespace characters
- enclosing it in single quotes instead of double quotes
- disabling history expansion via set +o histexpand
- being in POSIX mode
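Finally, the combination of a backslash followed by a newline is treated as a line continuation and removed from the string: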
$ text="a \
> b"
$ echo "${text}"
a b
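For comparison, here is a minimal sketch that contrasts the same backslash-newline pair in double and single quotes. The variable names are only illustrative:

```shell
#!/usr/bin/env bash
# In double quotes, backslash-newline is removed (line continuation).
joined="a \
b"

# In single quotes, both the backslash and the newline stay in the string.
kept='a \
b'

printf '[%s]\n' "${joined}" "${kept}"
```

The first value is the single line a b, while the second one still contains the backslash and spans two lines.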
Let’s now explore how Bash treats sequences without any quotes.
4.2. No Quotes
As we already showed, we can forgo the quotes altogether, but there is a price.
Namely, Bash only treats an unquoted sequence as a single string if we escape every character that has a special meaning to the shell, such as spaces and ampersands:
$ text=a\ \&\ b\ \&\ c
$ echo "${text}"
a & b & c
It’s rarely, if ever, preferable to not use quotes.
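The same caution applies when we expand a variable. As a rough illustration, the sketch below counts how many arguments a command receives with and without quotes around the expansion. The count_args function is a hypothetical helper written just for this example, and we assume the default IFS:

```shell
#!/usr/bin/env bash
# Hypothetical helper for this illustration: prints how many arguments it gets.
count_args() { printf '%d\n' "$#"; }

text='a & b & c'

unquoted=$(count_args ${text})     # word splitting yields five words
quoted=$(count_args "${text}")     # quoting keeps it as a single argument

printf 'unquoted=%s quoted=%s\n' "${unquoted}" "${quoted}"
# prints: unquoted=5 quoted=1
```

Without quotes, the command sees five separate words instead of one string, which is rarely what we want.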
4.3. ANSI-C Combinations
When using $'STRING_TEXT', the sequence within the single quotes expands to a string, with escaped characters replaced according to the ANSI-C quoting rules:
$ echo $'\u0061'
a
The \u escape sequence interprets the hexadecimal digits directly following it (up to four) as a code point in the Unicode (ISO/IEC 10646) table.
Importantly, where they are recognized, we can use the \u, \U, \x, and similar sequences to place any character without further escaping. Note that, in this case, the escape turns special meanings of characters on, not off. These are two ways to avoid the “shortage of keys” on a keyboard.
Moreover, many other tools recognize similar ANSI-C-style escape sequences.
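To see the difference between ANSI-C quoting and ordinary double quotes, consider this minimal sketch. The variable names are only illustrative:

```shell
#!/usr/bin/env bash
tab=$'\t'            # horizontal tab, a single character
letter_hex=$'\x41'   # "A" from its hexadecimal code
letter_oct=$'\101'   # "A" from its octal code
plain="\t"           # double quotes: still a backslash followed by "t"

printf '%s %s\n' "${letter_hex}" "${letter_oct}"
printf 'ansi_c_length=%d double_quoted_length=%d\n' "${#tab}" "${#plain}"
# prints:
# A A
# ansi_c_length=1 double_quoted_length=2
```

The length check confirms that only the $'...' form replaced the escape sequence with the actual character.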
5. Special Cases
Bash is a shell that has built-in commands and capabilities. Many use the ANSI standards, but some functionalities also use their own special control characters within strings.
Keep in mind that any string that we pass through Bash first gets interpreted by Bash. This means all rules from the previous section apply, but we may build on top of them in this one.
5.1. Bash Prompt
The first thing we see when using Bash is the prompt. It normally shows some useful information about the machine, user, current directory, etc. All of these are stored as defaults in the variables PS0, PS1, PS2, and PS4.
However, we can modify these variables. Furthermore, we can use terminal control characters to customize our prompt:
$ echo "Current prompt: ${PS1}"
Current prompt: $
$ PS1='\t> '
00:00:10> echo "Current prompt: ${PS1}"
Current prompt: \t>
These sequences start with a backslash. For example, \t expands to the current time in the 24-hour HH:MM:SS format.
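To preview how Bash expands prompt escape sequences without changing the actual prompt, one option is the ${parameter@P} transformation, which requires Bash 4.4 or later. The sketch below is only an illustration with made-up variable names; it uses the \$ sequence, which becomes # for the root user and $ otherwise:

```shell
#!/usr/bin/env bash
# Requires Bash 4.4 or later for the ${parameter@P} transformation.
prompt_template='\$ '

# Expand the template the same way Bash would when displaying a prompt.
rendered="${prompt_template@P}"

printf 'template: %s\n' "${prompt_template}"
printf 'rendered: %s\n' "${rendered}"
```

The template keeps the backslash, while the rendered value contains the character that would actually appear in the prompt.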
5.2. ANSI Escape Sequences
Within many terminals, we can also use other escape codes like the standard ANSI escape sequences. For example, there are ways to change the color of terminal text, cursor location, fonts, and other options. These sequences start with the <ESC> character, which we can write as \033 in octal:
$ PS1="TESTING\033[1K> "
> echo "Current prompt: ${PS1}"
Current prompt: TESTING\033[1K>
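In this example, the so-called control sequence introducer <ESC>[ is followed by 1K, which erases the line from its beginning up to the cursor, so TESTING doesn’t appear in the prompt.
As already mentioned, ANSI and ANSI-C escape sequences are used throughout the Linux ecosystem. For example, both echo and printf recognize them. We need the -e parameter for echo to interpret them, but printf interprets escape sequences in its format string by default.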
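The difference between the two built-ins can be sketched as follows. We assume Bash with its default settings here, meaning the xpg_echo shell option is disabled; the variable names are only illustrative:

```shell
#!/usr/bin/env bash
# Bash's built-in echo prints backslash sequences literally by default...
plain=$(echo 'a\tb')
# ...and interprets them with -e.
expanded=$(echo -e 'a\tb')
# printf interprets them in its format string without any option.
formatted=$(printf 'a\tb')

# %q shows each result in an escaped, unambiguous form.
printf '%q\n' "${plain}" "${expanded}" "${formatted}"
```

Only the first value still contains a backslash and the letter t, while the other two contain a real tab character.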
5.3. printf
The standard built-in printf (print formatted) command also has its own special character.
Recall our discussion of writing strings without quotes. The characters we would need to escape in that instance are in the output of the following script:
$ for code in {0..127}; do
> printf -v chr '\\%o' "${code}"
> printf -v chr "${chr}"
> printf -v echr "%q" "${chr}"
> if [[ "${chr}" != "${echr}" ]]; then
> printf "%02X %-7s\n" "${code}" "${echr}"
> fi
> done
00 ''
01 $'\001'
[...]
07 $'\a'
08 $'\b'
09 $'\t'
0A $'\n'
[...]
The snippet above goes through the first 128 characters in the ASCII table. For each, it uses printf to extract and compare each character with its escaped form.
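First, %o returns the octal form of the character’s code. Next, this value is reused in printf with a backslash prefix, which turns the octal code into the actual character. Finally, %q produces the escaped form of that character for comparison.
Note the % character: it’s the special character of printf and introduces format specifiers such as %o and %q. To print a literal percent sign from the format string, we double it as %%.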
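As a small illustration of how printf treats the percent sign, the sketch below prints the same text in two ways. The variable names are only illustrative:

```shell
#!/usr/bin/env bash
value='100%'

# In the format string, a literal percent sign is written as %%.
from_format=$(printf 'Progress: 100%%')

# Data passed as an argument to %s is printed as-is, so it needs no escaping.
from_argument=$(printf 'Progress: %s' "${value}")

printf '%s\n' "${from_format}" "${from_argument}"
```

Both lines read Progress: 100%. Passing untrusted or variable text as an argument rather than as the format string is usually the safer choice, since nothing in it gets interpreted.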
5.4. Parameter Transformation
As of version 4.4, Bash supports parameter transformation. This functionality allows us to perform many of the operations that printf and other built-ins offer, but directly within a parameter expansion.
For example, we can use echo ${VAR@Q} as a replacement for printf with %q:
$ text='\'
$ printf '%q\n' "${text}"
\\
$ echo "${text@Q}"
'\'
As we already learned, \\ and '\' are equivalent.
Both of the approaches above are very useful when it comes to multilevel escaping:
$ text="6*6*6 equals 216"
$ text="$(printf '%q' "${text}")"
$ text="$(printf '%q' "${text}")"
$ echo "${text}"
6\\\*6\\\*6\\\ equals\\\ 216
Indeed, without a way to perform this operation automatically, manual escaping of long lines often leads to many errors.
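One way to check that such automatic escaping is lossless is a round trip through eval, which parses its argument one more time. This is only a sketch with illustrative variable names, and eval should be used with care on untrusted input:

```shell
#!/usr/bin/env bash
original='6*6*6 equals 216; $HOME stays literal'

# Escape once so the value survives one extra round of shell parsing.
escaped=$(printf '%q' "${original}")

# eval parses the assignment again; the escaping is consumed in the process.
eval "restored=${escaped}"

printf '%s\n' "${restored}"
```

After the round trip, the restored value is identical to the original one, including the asterisks, the semicolon, and the dollar sign.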
5.5. Command-Line Arguments
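Many standard Bash built-ins use the -- (double hyphen) argument to mark the end of options. Any argument after it is treated as an operand, even if it starts with a hyphen.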
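As a minimal sketch of the end-of-options marker --, the example below passes text that begins with a hyphen to two built-ins. The variable names are only illustrative:

```shell
#!/usr/bin/env bash
# Without --, printf would try to parse "-n" as an option and fail.
dash_text=$(printf -- '-n\n')

# set -- replaces the positional parameters, even if they start with a hyphen.
set -- -x -y
first_arg=$1

printf '%s %s\n' "${dash_text}" "${first_arg}"
# prints: -n -x
```

In both cases, the text after -- is handled as data, so this is another form of escaping, applied to whole arguments instead of single characters.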
6. Summary
In this tutorial, we discussed character escaping in Bash. We first learned that characters have different encoding tables. In addition, we saw that some characters are not printable but only serve as markers or commands within text. To use such characters literally, we need the means to escape them. We explored pure Bash, as well as some common Bash built-in character escaping cases.
In conclusion, character escaping is only partially standardized, so many obscure scenarios and tools exist where escaping a character is non-trivial.