1. Overview

In this article, we'll look at environment variables on Linux and the rules to keep in mind when we create new ones or modify existing ones. We'll focus on the syntax of environment variable names.

2. Environment Variables

Let's start by looking at the environment variables that are already set on a Linux system. The printenv command lists them:

$ printenv
SHELL=/bin/bash
SESSION_MANAGER=local/username-VirtualBox:@/tmp/.ICE-unix/1644,unix/username-VirtualBox:/tmp/.ICE-unix/1644
QT_ACCESSIBILITY=1
COLORTERM=truecolor
...
PATH=/home/username/anaconda3/bin:/home/username/anaconda3/condabin:/home/username/.local/bin:...
GDMSESSION=ubuntu
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
_=/usr/bin/printenv

The output above lists the environment variables of our current login session. Every entry follows the pattern NAME=VALUE. Because the first equal sign (=) separates the name from the value, a variable name can't contain the = character, although the value can.
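To illustrate the NAME=VALUE pattern, here is a minimal shell sketch that splits one entry from the output above at its first equal sign. The variable names entry, name, and value are only illustrative, and the sketch assumes a POSIX-style shell with standard parameter expansion.

```shell
# One entry copied from the printenv output above
entry='DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus'

# Everything before the first "=" is the name
name=${entry%%=*}
# Everything after the first "=" is the value, which may itself contain "="
value=${entry#*=}

printf 'name:  %s\n' "$name"
printf 'value: %s\n' "$value"
```

The value keeps its own equal sign (unix:path=/run/user/1000/bus), while the name ends at the first one.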

3. Environment Variable Definitions

Chapter 8 of The Open Group Base Specifications (also known as POSIX), published by IEEE and The Open Group, gives the general definition of what environment variable names may consist of. Besides the restriction on the equal sign (=), it sets out some further general rules on the allowed characters.

3.1. Use Portable Characters

To make sure that our programs work on all machines, we should use characters from the Portable Character Set defined in Chapter 6 of The Open Group Base Specifications (except NUL). These characters are defined by POSIX.1-2017 and are always available on Linux systems that have been installed correctly.

For portable names, we should use only uppercase letters, lowercase letters, digits, and underscores from this character set.

3.2. Be Aware of Letter Case

The environment variables used by the system utilities consist solely of uppercase letters, digits, and the underscore (_). We can still define environment variables with lowercase letters, though. Names are case-sensitive: uppercase and lowercase letters keep their separate identities and aren't folded together, so foo and FOO are two different variables.

By convention, names that contain lowercase letters are reserved for applications.
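As a quick check of case sensitivity, the following sketch passes two variables whose names differ only in case to a child shell. The names my_setting and MY_SETTING are made up for this example.

```shell
# Start a child shell with two variables that differ only in letter case
# and print both values to show that they are stored separately.
result=$(env my_setting=lower MY_SETTING=upper \
    sh -c 'printf "%s %s\n" "$my_setting" "$MY_SETTING"')

echo "$result"   # prints: lower upper
```

Neither assignment overwrites the other, which is why mixing up the case of a name silently refers to a different variable.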

3.3. Don’t Start With a Digit

Some applications can't cope with environment variable names that begin with a digit, so defining such variables may lead to unexpected behavior.

POSIX specifies that the names used by its utilities don't begin with a digit, and we recommend following the same rule for every environment variable we create.
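The character rules from this section (letters, digits, and underscores only, with no leading digit) can be sketched as a small shell function. The name is_portable_name is hypothetical, and this is only one possible way to express the check, assuming a POSIX-style shell.

```shell
# Hypothetical helper: succeeds if $1 uses only letters, digits, and
# underscores and does not start with a digit.
# The body runs in a subshell so that LC_ALL=C (plain ASCII ranges)
# does not leak into the caller.
is_portable_name() (
    LC_ALL=C
    case $1 in
        '' | [0-9]*)      return 1 ;;  # empty, or begins with a digit
        *[!A-Za-z0-9_]*)  return 1 ;;  # contains some other character
        *)                return 0 ;;
    esac
)

for candidate in MY_VAR my_var _private 1VAR MY-VAR; do
    if is_portable_name "$candidate"; then
        echo "$candidate: ok"
    else
        echo "$candidate: not portable"
    fi
done
```

Here MY_VAR, my_var, and _private pass the check, while 1VAR and MY-VAR are rejected.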

3.4. Variable Name Conflicts

The following table lists variable names that we should avoid reusing for our own purposes. Most of them are defined by the system and serve special purposes:

ARFLAGS   IFS          MAILPATH    PS1
CC        LANG         MAILRC      PS2
CDPATH    LC_ALL       MAKEFLAGS   PS3
CFLAGS    LC_COLLATE   MAKESHELL   PS4
CHARSET   LC_CTYPE     MANPATH     PWD
COLUMNS   LC_MESSAGES  MBOX        RANDOM
DATEMSK   LC_MONETARY  MORE        SECONDS
DEAD      LC_NUMERIC   MSGVERB     SHELL
EDITOR    LC_TIME      NLSPATH     TERM
ENV       LDFLAGS      NPROC       TERMCAP
EXINIT    LEX          OLDPWD      TERMINFO
FC        LFLAGS       OPTARG      TMPDIR
FCEDIT    LINENO       OPTERR      TZ
FFLAGS    LINES        OPTIND      USER
GET       LISTER       PAGER       VISUAL
GFLAGS    LOGNAME      PATH        YACC
HISTFILE  LPDEST       PPID        YFLAGS
HISTORY   MAIL         PRINTER
HISTSIZE  MAILCHECK    PROCLANG
HOME      MAILER       PROJECTDIR

The shell and many common utilities read these variables frequently, so a conflict with them may cause serious errors.
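Before choosing a new name, we could check it against this list. The sketch below is one way to do that in a POSIX-style shell; RESERVED_NAMES and is_reserved_name are hypothetical names introduced for this example, and the list simply repeats the names given above.

```shell
# Names copied from the list above, separated by whitespace
RESERVED_NAMES="
ARFLAGS CC CDPATH CFLAGS CHARSET COLUMNS DATEMSK DEAD EDITOR ENV
EXINIT FC FCEDIT FFLAGS GET GFLAGS HISTFILE HISTORY HISTSIZE HOME
IFS LANG LC_ALL LC_COLLATE LC_CTYPE LC_MESSAGES LC_MONETARY
LC_NUMERIC LC_TIME LDFLAGS LEX LFLAGS LINENO LINES LISTER LOGNAME
LPDEST MAIL MAILCHECK MAILER MAILPATH MAILRC MAKEFLAGS MAKESHELL
MANPATH MBOX MORE MSGVERB NLSPATH NPROC OLDPWD OPTARG OPTERR
OPTIND PAGER PATH PPID PRINTER PROCLANG PROJECTDIR PS1 PS2 PS3
PS4 PWD RANDOM SECONDS SHELL TERM TERMCAP TERMINFO TMPDIR TZ
USER VISUAL YACC YFLAGS
"

# Hypothetical helper: succeeds if $1 is exactly one of the names above.
is_reserved_name() {
    # Unquoted on purpose so the list is split on whitespace
    for reserved in $RESERVED_NAMES; do
        [ "$reserved" = "$1" ] && return 0
    done
    return 1
}

for candidate in PATH EDITOR MY_APP_HOME; do
    if is_reserved_name "$candidate"; then
        echo "$candidate: conflicts with a well-known variable"
    else
        echo "$candidate: not in the list"
    fi
done
```

The comparison is exact and case-sensitive, so a name such as path would not be reported as a conflict.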

4. Conclusion

In this tutorial, we've covered the characters allowed in Linux environment variable names. Following the POSIX rules, we can name a new environment variable in one of the following ways:

  • [A-Z_][A-Z0-9_]*, if we want to define an environment variable in the namespace used by the operating system and the shell
  • [a-zA-Z_][a-zA-Z0-9_]*, if we want to define an environment variable for an application only (keep at least one lowercase letter)
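As a rough illustration, the two naming styles can be told apart with grep -E, using patterns equivalent to the two listed above. The function name classify_name and its output labels are hypothetical; the sketch assumes a grep that supports the -E and -q options and a name that contains no newline.

```shell
# Hypothetical helper: prints which naming style $1 follows.
# LC_ALL=C keeps the bracket ranges limited to ASCII letters.
classify_name() {
    if printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Z_][A-Z0-9_]*$'; then
        echo "system-style"
    elif printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-zA-Z_][a-zA-Z0-9_]*$'; then
        # Reaching this branch means the name has at least one lowercase letter
        echo "application-style"
    else
        echo "invalid"
    fi
}

classify_name PATH         # prints: system-style
classify_name http_proxy   # prints: application-style
classify_name 2FAST        # prints: invalid
```

Testing the uppercase-only pattern first means the second branch only matches names that contain at least one lowercase letter.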