1. Overview
In this article, we'll look at environment variables on a Linux system and the rules we need to follow when we create new variables or modify existing ones. The focus is on the syntax of environment variable names.
2. Environment Variables
To start our discussion, let's look at the environment variables that are currently set on our Linux system. We can list them with the printenv command:
$ printenv
SHELL=/bin/bash
SESSION_MANAGER=local/username-VirtualBox:@/tmp/.ICE-unix/1644,unix/username-VirtualBox:/tmp/.ICE-unix/1644
QT_ACCESSIBILITY=1
COLORTERM=truecolor
...
PATH=/home/username/anaconda3/bin:/home/username/anaconda3/condabin:/home/username/.local/bin:...
GDMSESSION=ubuntu
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
_=/usr/bin/printenv
The output above lists the environment variables of our current login session. Every entry follows the pattern NAME=VALUE. Because the first equal sign ("=") separates the name from the value, we must not use that character in the name of an environment variable.
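As a small sketch of the NAME=VALUE split, we can pass a definition that contains two equal signs to env and print the result. DEMO_VAR is a hypothetical name chosen only for this illustration:

```shell
# DEMO_VAR is a hypothetical variable used only for illustration.
# The definition is split at the first "=": the name is DEMO_VAR,
# and everything after it (including the second "=") is the value.
value=$(env 'DEMO_VAR=key=value' sh -c 'printf "%s\n" "$DEMO_VAR"')
printf '%s\n' "$value"   # prints: key=value
```

The value may contain an equal sign, but the name cannot, since the first one always ends the name.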
3. Environment Variable Definitions
Chapter 8 of The Open Group Base Specifications (also known as the POSIX standard), published by IEEE and The Open Group, gives the general definition of environment variables. Besides the restriction on the equal sign ("=") character, it sets out some other general rules on the characters we can use.
3.1. Use Portable Characters
To make sure that our program works on all machines, we need to use characters from the Portable Character Set defined in Chapter 6 of The Open Group Base Specifications (except NUL). These characters are defined by POSIX.1-2017 and are always available on correctly installed Linux systems.
From this character set, variable names should use only uppercase letters, lowercase letters, digits, and the underscore.
3.2. Be Aware of the Cases
The environment variables used by the system consist of uppercase letters, digits, and the underscore ("_"). We can still define environment variables with lowercase letters. Names are case-sensitive, so two names that differ only in case refer to different variables, and we shouldn't treat them as interchangeable.
By convention, names that contain lowercase letters are reserved for applications.
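To illustrate case sensitivity, the following sketch defines two variables whose names differ only in case and reads both back. The names demo_var and DEMO_VAR are hypothetical:

```shell
# demo_var and DEMO_VAR are hypothetical names used only for illustration.
# They differ only in case, yet they are two separate variables.
result=$(env demo_var=lower DEMO_VAR=upper \
    sh -c 'printf "%s %s\n" "$demo_var" "$DEMO_VAR"')
printf '%s\n' "$result"   # prints: lower upper
```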
3.3. Don’t Start With a Digit
Some applications cannot cope with environment variable names that begin with a digit, so defining such variables may lead to unexpected behavior.
The POSIX specification advises against names that begin with a digit, and so do we.
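The character rules discussed so far can be sketched as a small shell check. The function name is_portable_name is a hypothetical helper, not a standard utility, and it only tests the name, not the value:

```shell
# is_portable_name is a hypothetical helper, not a standard utility.
# It succeeds when the name is non-empty, uses only letters, digits,
# and underscores, and does not begin with a digit.
# Note: bracket ranges follow the current locale; run under LC_ALL=C
# if strict ASCII matching matters.
is_portable_name() {
    case "$1" in
        '' | [0-9]* | *[!A-Za-z0-9_]*) return 1 ;;
        *) return 0 ;;
    esac
}

for name in MY_APP_HOME my_app_home 1ST_VAR 'BAD=NAME'; do
    if is_portable_name "$name"; then
        printf '%s: ok\n' "$name"
    else
        printf '%s: rejected\n' "$name"
    fi
done
```

With these sample names, the first two are accepted, while 1ST_VAR (leading digit) and BAD=NAME (contains an equal sign) are rejected.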
3.4. Variable Name Conflicts
The following table lists variable names that we should avoid reusing for our own purposes. Most of them are defined by the system and serve special purposes:
ARFLAGS   IFS          MAILPATH    PS1
CC        LANG         MAILRC      PS2
CDPATH    LC_ALL       MAKEFLAGS   PS3
CFLAGS    LC_COLLATE   MAKESHELL   PS4
CHARSET   LC_CTYPE     MANPATH     PWD
COLUMNS   LC_MESSAGES  MBOX        RANDOM
DATEMSK   LC_MONETARY  MORE        SECONDS
DEAD      LC_NUMERIC   MSGVERB     SHELL
EDITOR    LC_TIME      NLSPATH     TERM
ENV       LDFLAGS      NPROC       TERMCAP
EXINIT    LEX          OLDPWD      TERMINFO
FC        LFLAGS       OPTARG      TMPDIR
FCEDIT    LINENO       OPTERR      TZ
FFLAGS    LINES        OPTIND      USER
GET       LISTER       PAGER       VISUAL
GFLAGS    LOGNAME      PATH        YACC
HISTFILE  LPDEST       PPID        YFLAGS
HISTORY                PRINTER
HISTSIZE  MAILCHECK    PROCLANG
HOME      MAILER       PROJECTDIR
The system and its standard utilities read these variables very frequently. Thus, redefining one of them for another purpose may cause serious errors.
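Before we pick a name, we could check it against the reserved names. The sketch below assumes a hypothetical helper called conflicts_with_reserved and, to stay short, includes only a handful of names taken from the table above:

```shell
# conflicts_with_reserved is a hypothetical helper; the pattern list is
# only a small excerpt of the names in the table, not the full set.
conflicts_with_reserved() {
    case "$1" in
        PATH | HOME | IFS | LANG | PWD | SHELL | TERM | TMPDIR | USER) return 0 ;;
        *) return 1 ;;
    esac
}

for name in PATH MY_APP_PATH; do
    if conflicts_with_reserved "$name"; then
        printf '%s: already has a special meaning, pick another name\n' "$name"
    else
        printf '%s: no conflict with the excerpt\n' "$name"
    fi
done
```

A real check would need every name from the table. Prefixing our own variables with an application name, as in MY_APP_PATH, is one simple way to stay clear of all of them.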
4. Conclusion
In this tutorial, we've covered the characters allowed in Linux environment variable names. Following the POSIX specification, we can name a new environment variable in one of the following ways:
- [A-Z_][A-Z0-9_]*, if we want to define an environment variable in the style reserved for the operating system and shell
- [a-zA-Z_][a-zA-Z0-9_]*, if we want to define an environment variable for an application only (keep at least one lowercase letter, since the pattern alone doesn't enforce it)
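As a rough sketch, we can apply patterns equivalent to the two above with grep -E. The function classify_name is a hypothetical helper. Because the uppercase-only pattern is tried first, a name that reaches the second branch necessarily contains at least one lowercase letter:

```shell
# classify_name is a hypothetical helper that tests a name against
# patterns equivalent to the two listed above.
# -E enables extended regular expressions, -x requires the whole line
# to match, and -q suppresses output so only the exit status is used.
classify_name() {
    if printf '%s\n' "$1" | LC_ALL=C grep -Eqx '[A-Z_][A-Z0-9_]*'; then
        echo "system-style"
    elif printf '%s\n' "$1" | LC_ALL=C grep -Eqx '[a-zA-Z_][a-zA-Z0-9_]*'; then
        echo "application-style"
    else
        echo "invalid"
    fi
}

classify_name MY_VAR   # prints: system-style
classify_name my_var   # prints: application-style
classify_name 2fast    # prints: invalid
```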