1. Overview
In this tutorial, we’ll go over the standard locale environment variables present in Linux. We’ll cover some examples and use cases and see how they work.
Along the way, we’ll see how they control the language of messages, the date and time format, and even the character encoding that programs use.
2. Environment Variable Priority
The locale environment variables tell programs how to display or interpret locale-dependent text. When several of them are set, they’re consulted in the following order, from highest to lowest priority:
- LANGUAGE (message translations only)
- LC_ALL
- LC_xxx, for one specific locale category, such as LC_TIME or LC_MESSAGES
- LANG
For example, we can have French set as a language using LANG, but with an American date-time format, using LC_TIME.
Let’s take a closer look at the various locale variables to see how this priority scheme plays out.
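The lookup order for a single category can be sketched as a small shell function. This only illustrates the priority chain and isn’t how the C library implements it. resolve_locale is a hypothetical helper that takes the three values as arguments instead of reading the environment, and the final fallback to C is an assumption based on common glibc behavior:

```shell
# Hypothetical helper: pick the locale that applies to one category.
# Arguments: the values of LC_ALL, the category's LC_xxx variable, and LANG.
# An empty argument stands for an unset (or empty) variable.
resolve_locale() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"      # LC_ALL wins over the other two
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"      # then the category-specific LC_xxx
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"      # then the LANG default
    else
        printf '%s\n' "C"       # nothing set: assume the C (POSIX) locale
    fi
}

# French via LANG, American date-time format via LC_TIME:
resolve_locale "" "en_US.UTF-8" "fr_FR.UTF-8"   # prints en_US.UTF-8
resolve_locale "" "" "fr_FR.UTF-8"              # prints fr_FR.UTF-8
```

LANGUAGE is left out on purpose, since it only affects which message translations are tried.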
3. Locale Environment Variables
Before we go into the locale environment variables, let’s output our current settings, using the locale command:
$ locale
LANG=en_US.UTF-8
LANGUAGE=en_US
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=ro_RO.UTF-8
LC_TIME=ro_RO.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=ro_RO.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=ro_RO.UTF-8
LC_NAME=ro_RO.UTF-8
LC_ADDRESS=ro_RO.UTF-8
LC_TELEPHONE=ro_RO.UTF-8
LC_MEASUREMENT=ro_RO.UTF-8
LC_IDENTIFICATION=ro_RO.UTF-8
LC_ALL=
Next, we’ll observe how these variables affect our output, and we’ll see how they interact and work in conjunction with other environment variables.
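Before exporting a new value, it can help to check that the locale is actually installed, since programs can’t use one that isn’t. A minimal sketch, assuming the locale command supports the -a option (as the glibc version does). locale_available is a hypothetical helper, and the normalization is there because locale -a usually prints en_US.utf8 rather than en_US.UTF-8:

```shell
# Hypothetical helper: succeed if the named locale appears in `locale -a`.
# Names are compared case-insensitively and without hyphens, so that
# "en_US.UTF-8" matches the "en_US.utf8" spelling that `locale -a` prints.
locale_available() {
    wanted=$(printf '%s\n' "$1" | tr 'A-Z' 'a-z' | tr -d '-')
    locale -a 2>/dev/null | tr 'A-Z' 'a-z' | tr -d '-' | grep -xF -- "$wanted" >/dev/null
}

for candidate in es_ES.UTF-8 de_DE.UTF-8 C; do
    if locale_available "$candidate"; then
        echo "$candidate: installed"
    else
        echo "$candidate: not installed"
    fi
done
```

The result depends on which locales have been generated on the machine, so the output will differ from system to system.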
3.1. LANG
The LANG environment variable sets the default locale for every category that isn’t set through a more specific variable. Most visibly, programs use it to choose the language of their messages. If LANG isn’t set, or if a message has no translation in the chosen language, the message appears untranslated, which usually means in English:
$ export LANG=es_ES.UTF-8
$ man man
MAN(1) Utilidades del paginador del manual MAN(1)
NOMBRE
man - interfaz de los manuales de referencia del sistema
SINOPSIS
man [opciones de man] [[sección] página ...] ...
// ...
$ man cat
CAT(1) User Commands
NAME
cat - concatenate files and print on the standard output
SYNOPSIS
cat [OPTION]... [FILE]...
// ...
Here, the output of the “man man” command is in Spanish. Since the “man cat” page has no Spanish translation on this system, man falls back to the English version.
3.2. LC_xxx
Now, we’ll have a look at a few LC_xxx variables and how they interact with the new LANG setting for Spanish.
The first of these we’ll look at is LC_TIME, which works with date and time formats. It’s useful, for example, if we relocate from one country to another and we want to adapt our system to that country’s date-time format:
$ date
joi 25 iunie 2020, 22:58:30 +0300
$ export LC_TIME=en_US.UTF-8
$ date
Thu 25 Jun 2020 10:58:30 PM EEST
In the first command, we see the date in Romanian format because we have our LC_TIME variable set to ro_RO.UTF-8. When we change it to US English, we see the same date and time but reported in a different format.
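To compare formats without changing the shell’s settings, we can also set LC_TIME for a single command. The sketch below assumes GNU date (for the -d and -u options) and uses the C locale because it’s always available. format_date_in is a hypothetical helper name, and LC_ALL is emptied for the call so that it can’t mask LC_TIME:

```shell
# Hypothetical helper: print a fixed UTC timestamp using the given LC_TIME.
# Usage: format_date_in LOCALE 'YYYY-MM-DD hh:mm:ss'
format_date_in() {
    LC_ALL= LC_TIME="$1" date -u -d "$2" '+%A %d %B %Y'
}

format_date_in C '2020-06-25 22:58:30'             # prints Thursday 25 June 2020
format_date_in ro_RO.UTF-8 '2020-06-25 22:58:30'   # Romanian names, if ro_RO is installed
```

Only the date command sees the temporary LC_TIME value, so the surrounding shell keeps its own setting.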
LC_MESSAGES selects the language in which programs print their messages. Since it’s more specific than LANG, it overrides LANG for that category. Let’s check:
$ locale | grep -w LANG
LANG=en_US.UTF-8
$ export LC_MESSAGES=de_DE.UTF-8
$ man man
MAN(1) Dienstprogramme für Handbuchseiten
BEZEICHNUNG
man - eine Oberfläche für die System-Referenzhandbücher
ÜBERSICHT
man [man Optionen] [[Abschnitt] Seite ...] ...
// ...
One other locale environment variable of note is LC_NUMERIC, which controls how programs format numbers, such as which decimal separator they use:
$ env LC_NUMERIC=en_US.UTF8 printf '%f\n' 1233.14
1233.140000
$ env LC_NUMERIC=de_DE.UTF8 printf '%f\n' 1233.14
1233,140000
Here, the two locales differ in the decimal separator: a period for US English and a comma for German.
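A locale-specific decimal separator can surprise scripts that pass numbers on to other tools. One common precaution is to pin LC_NUMERIC to the C locale for values meant to be machine-readable. This sketch assumes an external printf, such as the coreutils one, is on the PATH, and format_plain is a hypothetical helper name:

```shell
# Hypothetical helper: format a number with a period as the decimal
# separator, whatever LC_NUMERIC is in the calling shell.
# env runs the external printf, as in the examples above.
format_plain() {
    env LC_ALL= LC_NUMERIC=C printf '%.2f\n' "$1"
}

format_plain 1233.14    # prints 1233.14
```

Emptying LC_ALL in the same call matters here, because a non-empty LC_ALL would override the LC_NUMERIC value.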
The other LC_xxx environment variables work the same way, each one controlling a single category, such as LC_COLLATE for sort order or LC_MONETARY for currency formatting.
3.3. LC_ALL Environment Variable
LC_ALL overrides LANG and every LC_xxx variable, so it’s the first one consulted when a program needs a locale setting. The only variable that can take precedence over it is LANGUAGE, and only for message translations. Because LC_ALL masks all the other settings, we should use it with caution, and only when no narrower variable solves the problem we’re facing.
We usually set this environment variable in scripts or procedures that need predictable output regardless of the user’s settings. If the script changes it in a shell that keeps running afterward, we should restore the previous value before the script finishes.
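The save-and-restore pattern for LC_ALL might look like the following. This is a sketch rather than a canonical recipe. with_c_locale is a hypothetical helper, and it distinguishes an unset LC_ALL from an empty one so that the caller’s value comes back as it was:

```shell
# Hypothetical helper: run a command with LC_ALL=C, then put the
# caller's LC_ALL back (or unset it again if it was not set before).
with_c_locale() {
    if [ "${LC_ALL+set}" = set ]; then
        saved_lc_all=$LC_ALL
        had_lc_all=1
    else
        had_lc_all=0
    fi

    LC_ALL=C
    export LC_ALL
    status=0
    "$@" || status=$?           # remember the command's exit status

    if [ "$had_lc_all" -eq 1 ]; then
        LC_ALL=$saved_lc_all
    else
        unset LC_ALL
    fi
    return "$status"
}

with_c_locale sh -c 'echo "LC_ALL is $LC_ALL"'   # prints LC_ALL is C
```

For a single command, prefixing it with the assignment, or running it in a subshell, achieves the same result without any bookkeeping.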
Let’s see what happens to our locale environment variables when we set LC_ALL to English with UTF-8 encoding:
$ export LC_ALL=en_EN.UTF-8 ## setting LC_ALL to English with UTF-8 encoding
$ locale
LANG=es_ES.UTF-8
LANGUAGE=en_US
LC_CTYPE="en_EN.UTF-8"
LC_NUMERIC="en_EN.UTF-8"
LC_TIME="en_EN.UTF-8"
LC_COLLATE="en_EN.UTF-8"
LC_MONETARY="en_EN.UTF-8"
LC_MESSAGES="en_EN.UTF-8"
LC_PAPER="en_EN.UTF-8"
LC_NAME="en_EN.UTF-8"
LC_ADDRESS="en_EN.UTF-8"
LC_TELEPHONE="en_EN.UTF-8"
LC_MEASUREMENT="en_EN.UTF-8"
LC_IDENTIFICATION="en_EN.UTF-8"
LC_ALL=en_EN.UTF-8
Here, we see that LC_ALL has overridden every locale category, while LANG and LANGUAGE keep their own values. As long as LC_ALL is set, changing an individual LC_xxx variable has no visible effect:
$ export LC_MESSAGES=de_DE.UTF-8
$ locale
LANG=es_ES.UTF-8
LANGUAGE=en_US
LC_CTYPE="en_EN.UTF-8"
LC_NUMERIC="en_EN.UTF-8"
LC_TIME="en_EN.UTF-8"
LC_COLLATE="en_EN.UTF-8"
LC_MONETARY="en_EN.UTF-8"
LC_MESSAGES="en_EN.UTF-8"
LC_PAPER="en_EN.UTF-8"
LC_NAME="en_EN.UTF-8"
LC_ADDRESS="en_EN.UTF-8"
LC_TELEPHONE="en_EN.UTF-8"
LC_MEASUREMENT="en_EN.UTF-8"
LC_IDENTIFICATION="en_EN.UTF-8"
LC_ALL=en_EN.UTF-8
To reset it, we can export an empty value for LC_ALL or remove the variable with unset LC_ALL:
$ export LC_ALL=
$ locale
LANG=es_ES.UTF-8
LANGUAGE=en_US
LC_CTYPE="es_ES.UTF-8"
LC_NUMERIC=en_EN.UTF-8
LC_TIME=en_EN.UTF-8
LC_COLLATE="es_ES.UTF-8"
LC_MONETARY=en_EN.UTF-8
LC_MESSAGES=de_DE.UTF-8
LC_PAPER=en_EN.UTF-8
LC_NAME=en_EN.UTF-8
LC_ADDRESS=en_EN.UTF-8
LC_TELEPHONE=en_EN.UTF-8
LC_MEASUREMENT=en_EN.UTF-8
LC_IDENTIFICATION=en_EN.UTF-8
LC_ALL=
Here we see that once LC_ALL is empty, the individual variables take effect again, including the LC_MESSAGES value we exported while LC_ALL was still set.
3.4. LC_ALL and the sort Command
Setting LC_ALL to the special value “C” is a simple yet powerful way to force the default POSIX locale, in which messages are untranslated and sorting is byte-wise.
Next, let’s see how the LC_ALL variable changes the result of the sort command.
Let’s say we have a letters.txt file:
$ cat letters.txt
b
B
A
c
a
C
D
d
We can sort the file using the sort command:
$ sort letters.txt
a
A
b
B
c
C
d
D
The output above shows the letters sorted alphabetically according to the current locale’s collation rules, with each lowercase letter next to its uppercase counterpart.
However, sometimes we want to sort a file by ASCII code. In that case, we can set LC_ALL="C" to force byte-wise sorting:
$ LC_ALL="C" sort letters.txt
A
B
C
D
a
b
c
d
It’s worth mentioning that in the command above, the LC_ALL="C" assignment applies only to that single sort invocation. The LC_ALL environment variable in the current shell isn’t changed.
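To see why the uppercase letters come first in the byte-wise result, we can print each letter’s ASCII code. The sketch below relies on the POSIX printf convention that a leading quote in a numeric argument yields the character’s code. byte_value is a hypothetical helper name:

```shell
# Hypothetical helper: print the numeric code of a single ASCII character.
byte_value() {
    printf '%d\n' "'$1"
}

for letter in A D a d; do
    printf '%s = %s\n' "$letter" "$(byte_value "$letter")"
done
# A = 65, D = 68, a = 97, d = 100: every uppercase letter has a smaller
# code than any lowercase one, so a byte-wise sort lists uppercase first.

printf '%s\n' b B A a | LC_ALL=C sort    # prints A, B, a, b on separate lines
```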
3.5. LANGUAGE
The LANGUAGE environment variable holds one or more language values separated by colons, and it determines the order in which languages are tried when displaying messages. It’s the only locale variable that still takes effect when LC_ALL is set, although it’s ignored when the locale is C.
Let’s see it in action:
$ export LC_ALL=en_US.UTF-8
$ export LANGUAGE=fr_FR:en_EN
$ man man
MAN(1) Utilitaires de l'afficheur des pages de manuel
NOM
man - an interface to the system reference manuals
SYNOPSIS
man [man options] [[section] page ...] ...
// ...
Even though LC_ALL is set to English, parts of the page appear in French. LANGUAGE takes precedence over LC_ALL for messages, so every string that has a French translation is shown in French, and the rest fall back to English.
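Because LANGUAGE is a colon-separated list, the order in which translations are tried can be previewed by splitting it. A minimal sketch: language_order is a hypothetical helper, and it only shows the list order, not any further fallbacks the translation library may apply:

```shell
# Hypothetical helper: print the entries of a LANGUAGE-style value,
# one per line, in the order they would be tried.
language_order() {
    printf '%s\n' "$1" | tr ':' '\n'
}

language_order 'fr_FR:en_EN'
# prints:
# fr_FR
# en_EN
```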
4. Conclusion
In this article, we learned how to change locale settings such as the language, and how to override them partially or completely with other environment variables.
We also saw how to set and reset LC_ALL, and how the priority chain of locale variables determines what programs display.