1. Introduction
Case sensitivity determines whether a script or program treats uppercase and lowercase letters as different. This applies to any language and locale whose writing system distinguishes letter case. Since text is part of most operating system (OS) interactions, knowing the case sensitivity of the current context can be very important. For example, hostnames are generally case-insensitive, but code usually differentiates between cases.
In this tutorial, we talk about ways to convert all the contents of a string or file to lowercase or uppercase. First, we read a file into a string for cases where supplying a file path isn’t supported. Next, we go over methods to change the case of text with a basic encoding. Finally, we explore methods for converting the case of any valid character within a string or file with two of the most standard tools in a Linux environment.
We tested the code in this tutorial on Debian 12 (Bookworm) with GNU Bash 5.2.15. It should work in most POSIX-compliant environments unless otherwise specified.
2. Read File
Reading a file is the first step of any operation on its contents, and there are many ways to do it. While some commands work directly with the file path, others need a string and don’t include any input and output mechanics beyond the usual stdin and stdout.
Since loop constructs would only complicate our solutions, and we don’t require them, we can slurp the whole file into a variable instead of iterating over its contents:
$ filecontents="$(cat file)"
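After running this line, the filecontents variable holds the output of cat, that is, the contents of file. Notably, command substitution strips any trailing newlines, and a shell variable can’t hold NUL bytes, so the copy is only exact for regular text.
Naturally, this can be a problematic approach for large files, since the whole content is held in memory, but it generally works well for files of moderate size.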
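As a quick check of what actually ends up in the variable, the following minimal sketch slurps a throwaway file and counts the stored bytes. The sample and bytecount names, the mktemp file, and the sample text are assumptions made only for illustration:

```shell
# Create a throwaway sample file; its name and content are illustrative only.
sample="$(mktemp)"
printf 'Hello World\n\n\n' > "$sample"

# Slurp the file; command substitution drops the trailing newlines.
filecontents="$(cat "$sample")"

# Count the bytes that were stored in the variable.
bytecount="$(printf '%s' "$filecontents" | wc -c)"
echo "$bytecount"

rm -f "$sample"
```

If the trailing newlines matter, working on the file itself rather than on the variable avoids the issue.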
3. ASCII Encoding
When dealing with plain ASCII text, there are many ways to convert to lowercase and uppercase. To ensure we’re only working with single bytes, we can prefix any of the commands below with a temporary locale setting of C via LC_ALL:
$ LC_ALL=C [...]
Importantly, none of the commands in this section modify the file in place. Instead, they print the result to stdout, so we can save a converted copy via redirection. This way, we protect the original against unexpected changes.
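To illustrate both points, here is a minimal sketch that combines the LC_ALL=C prefix with output redirection. It uses tr, which the next subsection covers, and the src and out paths are throwaway files created only for this example:

```shell
# Throwaway input and output paths, used only for this illustration.
src="$(mktemp)"
out="$(mktemp)"
printf 'Mixed CASE Text\n' > "$src"

# Force the C locale for this one command and save the result as a copy.
LC_ALL=C tr '[:upper:]' '[:lower:]' < "$src" > "$out"

original="$(cat "$src")"   # the source file stays unchanged
converted="$(cat "$out")"  # the lowercase copy
printf '%s\n%s\n' "$original" "$converted"

rm -f "$src" "$out"
```

The locale assignment only applies to the single command it prefixes, so the rest of the session keeps its usual settings.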
Let’s explore some of the more basic and readily available methods.
3.1. tr
The standard tr command is usually the go-to method for text translation and transformation.
Because of this, let’s see how to convert all characters of a file to lowercase via tr:
$ tr A-Z a-z < file
Naturally, we can convert to uppercase by switching the ranges:
$ tr a-z A-Z < file
Alternatively, we can employ the POSIX character classes [:lower:] and [:upper:]:
$ tr '[:upper:]' '[:lower:]' < file
The situation is the same with the reversal:
$ tr '[:lower:]' '[:upper:]' < file
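Generally, the latter two are preferred, since character classes don’t depend on how the current locale orders the characters within a range.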
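To see why this section is limited to ASCII, the sketch below passes a string with one non-ASCII letter through tr in the C locale. The sample text is an assumption, and the accented letter is written as its two UTF-8 bytes so the example doesn’t depend on the encoding of the script itself:

```shell
# "CAFÉ au LAIT" with É written as its two UTF-8 bytes (octal 303 211).
input="$(printf 'CAF\303\211 au LAIT')"

# In the C locale, only the ASCII letters A-Z belong to [:upper:].
lowered="$(printf '%s' "$input" | LC_ALL=C tr '[:upper:]' '[:lower:]')"
printf '%s\n' "$lowered"
```

The ASCII letters are converted, while the two bytes of the accented letter pass through untouched.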
3.2. awk
AWK is another standard tool for text processing.
To convert the contents of a file to lowercase via the awk interpreter, we can use a one-liner:
$ awk '{print tolower($0)}' < file
In this case, we use the print statement combined with the tolower() or toupper() function to transform $0, the current record.
For the reverse, going uppercase, we switch the function:
$ awk '{print toupper($0)}' < file
Again, modifications aren’t applied back to the file but are instead printed to stdout.
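Because awk splits each record into fields, the same functions can also target only part of a line. As a small sketch with made-up sample records, we can uppercase just the first field:

```shell
# Uppercase only the first whitespace-separated field of each record.
result="$(printf 'alice Engineering\nbob Sales\n' |
  awk '{ $1 = toupper($1); print }')"
printf '%s\n' "$result"
```

Assigning to $1 makes awk rebuild the record with its output field separator, which is a single space by default.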
3.3. sed
The GNU sed command can use a special regular expression feature to perform the conversion we’re after:
$ sed 's/.*/\L&/g' < file
Here, the sed s/ substitution operation matches .* (the whole line) and replaces it with the [\L]owercase or [\U]ppercase version of &, the matched text. Since .* already covers the entire line, the [/g]lobal flag is redundant but harmless.
Thus, we can get all letters to be capitalized as well:
$ sed 's/.*/\U&/g' < file
By just switching the case operator in the replacement, we can convert to either case.
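The two operators can also be combined within one replacement. The following sketch assumes GNU sed, since \U and \L aren’t part of POSIX sed, and uses a made-up sample line to capitalize the first character while lowering the rest:

```shell
# GNU sed only: \U uppercases the first group, then \L lowercases the second.
result="$(printf 'hELLO wORLD\n' | sed 's/^\(.\)\(.*\)$/\U\1\L\2/')"
printf '%s\n' "$result"
```

Each operator stays in effect until another case operator or the end of the replacement.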
3.4. dd
Since ASCII characters are single bytes, the dd command can also be used to process text:
$ dd conv=lcase 2>/dev/null < file
In short, we use the lcase or ucase values of the conv dd option to convert all input bytes to their lowercase or uppercase form.
Let’s see ucase in action:
$ dd conv=ucase 2>/dev/null < file
In both cases, we redirect stderr to /dev/null to hide the transfer summary that dd prints there.
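Instead of shell redirection, dd can also read and write files through its own if= and of= operands. Here is a minimal sketch, where src and out are throwaway files created only for the example:

```shell
# Throwaway input and output paths, used only for this illustration.
src="$(mktemp)"
out="$(mktemp)"
printf 'quiet please\n' > "$src"

# Read from if=, write the uppercase copy to of=, and discard the summary.
dd if="$src" of="$out" conv=ucase 2>/dev/null

converted="$(cat "$out")"
printf '%s\n' "$converted"

rm -f "$src" "$out"
```

As before, the source file is left untouched, because the output goes to a separate path.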
3.5. perl
Of course, interpreters like Perl usually provide functions for character transformation:
$ perl -ne 'print lc' file
In this case, we [-e]xecute a perl one-liner that loops over the input lines without printing them automatically (-n), and just [print]s each line after passing it through the built-in lc() or uc() function.
Let’s see an example with uppercase as well:
$ perl -ne 'print uc' file
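Perl can also handle other encodings, but this depends on additional configuration: use utf8 only declares that the script source itself is UTF-8, while use locale ties the case mapping to the current locale. For UTF-8 data, the usual approach is to set the encoding of the input and output streams, for example, via the -C switch.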
4. Any Encoding
Naturally, there are methods to convert text in any encoding to lowercase or uppercase, while leaving characters without the concept of a case, such as digits and punctuation, unchanged.
To ensure portability and compatibility, we focus on two main options: Bash and the Vi editor.
4.1. Bash Versions Before 4
Before Bash 4, we’d need to implement a custom case conversion mechanism, as there are no built-in ways to switch between cases. While ASCII supports a fairly simple algorithm, which adds or subtracts a fixed value (32) to or from the code of the character, Unicode isn’t as straightforward.
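To make the fixed value concrete, the following sketch computes the distance between the ASCII codes of a and A, and wraps tr in a small helper as one common fallback for older shells. The to_lower name is hypothetical, and delegating to tr is just one way to fill the gap, limited to what tr supports in the current locale:

```shell
# Distance between the ASCII codes of 'a' (97) and 'A' (65).
offset=$(( $(printf '%d' "'a") - $(printf '%d' "'A") ))

# Hypothetical helper for shells without built-in case conversion.
to_lower() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

lowered="$(to_lower 'Hello WORLD 42')"
printf '%s %s\n' "$offset" "$lowered"
```

Digits and spaces pass through unchanged, since they have no case.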
4.2. Bash 4
Starting with Bash 4, parameter expansion includes variants created specifically for case conversion:
$ echo "${filecontents,,}"
In this case, the ,, double commas indicate that the variable with the name that precedes them should be expanded with all uppercase characters converted to lowercase. To get the reverse, lowercase to uppercase, we use ^^ double carets:
$ echo "${filecontents^^}"
If we specify a pattern after the commas or carets, only the characters that match the pattern are converted. Further, a single comma or caret converts just the first character of the value.
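As a brief sketch of these variants, assuming Bash 4 or later and made-up sample strings, we can limit the conversion to the first character or to a set of characters:

```shell
text='hello world'
shout='HELLO WORLD'

# A single caret or comma affects only the first character.
first_upper="${text^}"
first_lower="${shout,}"

# A pattern after the double caret limits the conversion to matching characters.
vowels_upper="${text^^[aeiou]}"

printf '%s\n' "$first_upper" "$first_lower" "$vowels_upper"
```

The pattern is matched against each character individually, so a bracket expression is the usual choice.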
4.3. Bash 5.1
Since Bash 5.1, we can also use an alternative syntax for both operations:
$ echo "${filecontents@L}"
In this case, Bash applies the [@L]owercase transformation to all relevant characters.
When it comes to [@U]ppercase, we just switch the letter:
$ echo "${filecontents@U}"
Since the shell and the terminal dictate much of the encoding context, working within Bash directly usually has a good chance of preserving and properly applying that encoding.
That’s especially true when using a prepopulated string like filecontents.
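Since the @L and @U operators fail on older Bash versions, a script that has to run on several systems might guard them with a version check. The following is only a sketch of that idea, with a made-up sample string, falling back to the Bash 4 syntax:

```shell
text='Hello World'

# BASH_VERSINFO holds the major and minor version of the running shell.
if [ "${BASH_VERSINFO[0]}" -gt 5 ] ||
   { [ "${BASH_VERSINFO[0]}" -eq 5 ] && [ "${BASH_VERSINFO[1]}" -ge 1 ]; }; then
  # Bash 5.1 or later: use the transformation operators.
  lowered="${text@L}"
  uppered="${text@U}"
else
  # Older Bash 4.x or 5.0: use the comma and caret forms instead.
  lowered="${text,,}"
  uppered="${text^^}"
fi

printf '%s\n%s\n' "$lowered" "$uppered"
```

Both branches produce the same result, so the choice between them is mostly a matter of which versions we need to support.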
4.4. Vi Editor
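The ubiquitous Vi editor provides many shortcuts for text manipulation. For brevity, we use vi (Vi) when referencing both the Vi and Vim editors. Notably, the case operators we use below come from Vim, which commonly provides the vi command on Linux distributions.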
In particular, we’ll explore a series of Normal (Command) Mode commands for converting to lowercase:
$ vi file
[...]
gg0guGZZ
Now, let’s break down what this sequence does:
- vi file opens file in the editor
- gg goes to the first line, and 0 moves to its first column
- gu (lowercase) or gU (uppercase) followed by G (to last line) converts the case of all characters from the cursor to the end of the file
- ZZ saves the current buffer changes and exits
So, the change for uppercase is again a single letter:
$ vi file
[...]
gg0gUGZZ
In fact, we can convert either of the above into a single automated command, which saves the changes in place:
$ vi -c'normal! gg0gUGZZ' file
Here, the -c switch executes the given command once file is open, and normal! runs the gg0guGZZ or gg0gUGZZ sequence as Normal mode keystrokes. Since we also add the ZZ, we may not even see the editor screen before the operation is complete, depending on the file size.
This method, like the previous one, handles encoding very well, as it treats the file based on its preconfigured or detected encoding.
5. Summary
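In this article, we explored ways to change the text case of strings and files, from the tr, awk, sed, dd, and perl commands for ASCII text to Bash parameter expansion and the Vi editor for other encodings.
In conclusion, while there are many methods to perform case conversion, these are some of the most basic and standard ones.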