1. Introduction

Variables are the main containers of data in computer systems. As such, their size affects how and where we can use them.

In this tutorial, we’ll explore environmental constraints and the maximum size of variables. First, we talk about variables and variable types. After that, we turn to different context constraints and their roots. Finally, we test practical ways to find out the environment variable sizes.

We tested the code in this tutorial on Debian 11 (Bullseye) with GNU Bash 5.1.4. It should work in most POSIX-compliant environments unless otherwise specified.

2. Variables and Variable Types

Variables are units of data. Depending on the context, a variable may have a type. Excluding the way they might be enforced by a compiler or interpreter, one of the main differences between variable types comes down to size.

Regardless of their specifications, computers have limited resources. Because of these hardware constraints, software can't work with unbounded sizes either. This is part of the reason why variables often have predetermined sizes or at least size limits.

Another reason stems from the fact that the stack and heap both need a way to place boundaries between variables. This promotes proper memory addressing and reduces resource usage by limiting the size of a variable based on rules rather than on the total available memory.

In Bash, variables behave as they do in some other interpreted languages:

  • no type: can hold strings, integers, and any other value type
  • no declaration: no need to initialize a variable in advance

In essence, Bash allows the use of any variable whenever and however we need, even if it hasn’t been assigned a value. This is especially important for environment variables, which the child processes of a given shell inherit, even though they’re just like any other variable in terms of size.
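To make the two properties above concrete, here is a minimal sketch. The variable names undeclared and value are only illustrative, and the snippet assumes that set -u (nounset) isn't active:

```shell
#!/usr/bin/env bash
# An unset variable expands to an empty string (assuming `set -u` is off).
echo "unset: '${undeclared}' has length ${#undeclared}"

# The same name can hold text and then digits; both are stored as strings.
value='text'
value=42
echo "sum: $((value + 1))"   # arithmetic context treats the digits as a number
echo "length: ${#value}"     # the length is still counted in characters

# Exporting places a copy in the environment of child processes.
export value
sh -c 'echo "child sees: $value"'
```

Only the exported copy reaches the child shell, which is why the limits on the environment matter for exported variables in particular.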

So, how do we find out the size constraints of a variable in Bash?

3. Context Constraints

Naturally, many factors influence the maximum size of shell objects. Let’s explore them first.

3.1. Maximum Variable Size

Although Bash itself doesn’t really set definite constraints, we can deduce the maximum shell variable size in several ways.

In UNIX-like systems like Linux, processes begin life by forking and are run via exec*().

The latter family of functions imposes a limit on two arrays of strings that the process receives:

  • argv: command-line arguments
  • envp: environment, including environment variables

Particularly, POSIX dictates that the static or dynamic ARG_MAX constant is to hold the combined limit for both. In practice, this means different things depending on the kernel version.

Before Linux kernel version 2.6.23, the maximum was MAX_ARG_PAGES (32) pages of memory. In practice, this often came out to 32 * 4 KiB = 128 KiB, depending on the page size of the architecture.

On and after Linux kernel version 2.6.23, the kernel applies several constraints:

  • maximum 1/4 of the RLIMIT_STACK limit at the time of the exec*() call
  • maximum 3/4 of the _STK_LIM internal kernel constant (8 MiB), i.e., 6 MiB
  • minimum 32 pages (since 2.6.25)

So, the combined size of the arguments and the environment is limited based on these rules.
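As a rough illustration, the rules above can be turned into simple shell arithmetic. This is only a sketch: arg_limit is a hypothetical helper, it assumes that ulimit -s reports the soft stack limit in KiB, and the value the kernel or getconf actually uses may differ between versions:

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate the combined argv + envp limit in bytes
# from a stack limit in KiB (as printed by `ulimit -s`) and a page size.
arg_limit() {
  local stack_kib=$1
  local page_size=${2:-4096}
  local cap=$((8 * 1024 * 1024 * 3 / 4))  # 3/4 of _STK_LIM (8 MiB)
  local floor=$((32 * page_size))         # minimum of 32 pages
  local limit

  if [ "$stack_kib" = "unlimited" ]; then
    limit=$cap
  else
    limit=$((stack_kib * 1024 / 4))       # 1/4 of RLIMIT_STACK
  fi
  if [ "$limit" -gt "$cap" ]; then limit=$cap; fi
  if [ "$limit" -lt "$floor" ]; then limit=$floor; fi
  echo "$limit"
}

# Estimate for the current shell.
arg_limit "$(ulimit -s)" "$(getconf PAGESIZE)"
```

For example, with a stack limit of 8192 KiB and 4096-byte pages, the estimate is 8192 * 1024 / 4 = 2097152 bytes.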

Let’s see what ARG_MAX is for our particular system:

$ getconf -a | grep ARG_MAX
ARG_MAX                            2097152
_POSIX_ARG_MAX                     2097152

Here, we use getconf to get [-a]ll configuration variables of the system and grep to filter only for ARG_MAX. This results in the POSIX and regular variants of the variable, both of which are 2097152 bytes, i.e., 2048 KiB or 2 MiB.
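When only the single value is needed, for instance inside a script, getconf can also be asked for ARG_MAX directly. A small sketch, with arg_max as an illustrative variable name:

```shell
#!/usr/bin/env bash
# Query only ARG_MAX and convert it with shell arithmetic.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX: $arg_max bytes ($((arg_max / 1024)) KiB)"
```

The number depends on the stack limit of the shell that runs the command, so it may differ between sessions.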

However, there is a separate limit per string, MAX_ARG_STRLEN, which is 32 pages (128 KiB with 4 KiB pages). Finally, the maximum allowed number of strings as separate units is 2147483647, i.e., 2^31 - 1.
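The per-string limit can be probed with a short experiment. The sketch below assumes a Linux kernel with the 32-page limit described above. BIG is a hypothetical variable name, and the script reports either outcome instead of assuming one:

```shell
#!/usr/bin/env bash
# Per-string limit as described above: 32 pages.
page_size=$(getconf PAGESIZE)
per_string=$((32 * page_size))
echo "per-string limit (assuming Linux): $per_string bytes"

# Build a value one byte longer than the limit.
big=$(head -c "$((per_string + 1))" /dev/zero | tr '\0' 'x')
echo "variable length inside the shell: ${#big}"

# Holding the value in the shell works; only passing it through exec*() can fail.
if BIG="$big" /bin/sh -c ':' 2>/dev/null; then
  echo "exec succeeded: no per-string limit was hit on this system"
else
  echo "exec failed: a single string above the limit was rejected"
fi
```

Notably, the shell itself holds the oversized value without trouble, since the limit only applies when the string is handed to a new process.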

3.2. Environment Constraints

Another way to get data about the environment constraints is the true command in a pipeline with xargs:

$ true | xargs --show-limits
Your environment variables take up 2067 bytes
POSIX upper limit on argument length (this system): 2093037
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 2090970
Size of command buffer we are actually using: 131072
Maximum parallelism (--max-procs must be no greater): 2147483647

Here, we first get the current environment variable usage of 2067 bytes. Importantly, we also see the difference between the actual and maximum length of the command buffer.

Of course, we can see the upper limit of 2093037 bytes. It's lower than the earlier ARG_MAX value of 2097152 because xargs subtracts the current environment size (2067 bytes) and 2048 bytes of headroom. Further, the smallest allowed maximum argument length is 4096 bytes.

Critically, if we fill our environment near the upper limit, any command-line arguments passed to a child process of the same shell might cause an Argument list too long error.
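To cross-check the environment usage that xargs reports, we can approximate it ourselves. This sketch assumes that each entry printed by env occupies its length plus one terminating byte, so the total may differ slightly from the xargs figure. The name env_bytes is illustrative:

```shell
#!/usr/bin/env bash
# Each NAME=value entry is followed by one newline in the env output,
# which stands in for the terminating byte of the real string.
env_bytes=$(( $(env | wc -c) ))
env_lines=$(( $(env | wc -l) ))
echo "environment: $env_bytes bytes in $env_lines lines"
```

Values that contain newlines span several lines, so the line count is only an upper bound on the number of variables.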

4. Current Environment Variable Size

In practice, we can use manual ways to check the current environment variable size. Depending on our needs, we can use a broader or a more refined method. In particular, we employ scripts to allocate specific-sized variables.

Importantly, we should ensure several factors are in place:

  • current shell doesn’t have heavy allocations
  • system has enough free RAM and swap space

Further, since kernel version 2.6.23 raised the maximum total environment size, we expect the length of a single variable to cause issues before we reach the total environment size constraints:

$ uname -r
5.10.0-666-amd64

Notably, uname -r shows our kernel to be a later one. Still, other environmental factors may also influence our tests.

4.1. Indicative Values

To begin with, let’s create the maxvar.sh Bash shell script and check its contents via cat:

$ cat maxvar.sh
#!/usr/bin/env bash
var='.'
while true
do
  echo "$(date) $(numfmt --to=iec-i --suffix=B --padding=7 ${#var})" 
  var=$var$var
done

In this script, we start by creating a variable $var containing a single . dot character. After that, we create an infinite while loop with true. On each iteration, we show the date and time for reference and append the current size of $var formatted with numfmt:

  • --to=iec-i uses automatic scaling based on the iec-i unit
  • --suffix=B adds the B suffix to the Ki, Mi, and further unit names
  • --padding=7 pads to 7 characters with spaces

Also, we double the variable size by appending $var to itself.

This way, we get the length of the variable $var via ${#var} in a human-readable format.
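To see the formatting in isolation, we can feed numfmt a few fixed numbers instead of ${#var}. This is just a sketch that assumes GNU numfmt is available:

```shell
#!/usr/bin/env bash
# Format a few powers of two exactly as the script does.
for n in 512 1024 16384 1048576; do
  numfmt --to=iec-i --suffix=B --padding=7 "$n"
done
```

Each line is right-aligned in a 7-character field, which keeps the sizes in one column in the script output.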

Now, let’s run the script to check our current environment:

$ bash maxvar.sh
Fri Aug 18 08:18:28 AM EDT 2023      2B
Fri Aug 18 08:18:28 AM EDT 2023      4B
Fri Aug 18 08:18:28 AM EDT 2023      8B
Fri Aug 18 08:18:28 AM EDT 2023     16B
Fri Aug 18 08:18:28 AM EDT 2023     32B
Fri Aug 18 08:18:28 AM EDT 2023     64B
Fri Aug 18 08:18:28 AM EDT 2023    128B
Fri Aug 18 08:18:28 AM EDT 2023    256B
Fri Aug 18 08:18:28 AM EDT 2023    512B
Fri Aug 18 08:18:28 AM EDT 2023  1.0KiB
Fri Aug 18 08:18:28 AM EDT 2023  2.0KiB
Fri Aug 18 08:18:28 AM EDT 2023  4.0KiB
Fri Aug 18 08:18:28 AM EDT 2023  8.0KiB
Fri Aug 18 08:18:28 AM EDT 2023   16KiB
Fri Aug 18 08:18:28 AM EDT 2023   32KiB
Fri Aug 18 08:18:28 AM EDT 2023   64KiB
Fri Aug 18 08:18:28 AM EDT 2023  128KiB
Fri Aug 18 08:18:28 AM EDT 2023  256KiB
Fri Aug 18 08:18:28 AM EDT 2023  512KiB
Fri Aug 18 08:18:28 AM EDT 2023  1.0MiB
Fri Aug 18 08:18:28 AM EDT 2023  2.0MiB
Fri Aug 18 08:18:28 AM EDT 2023  4.0MiB
Fri Aug 18 08:18:28 AM EDT 2023  8.0MiB
Fri Aug 18 08:18:28 AM EDT 2023   16MiB
Fri Aug 18 08:18:28 AM EDT 2023   32MiB
Fri Aug 18 08:18:28 AM EDT 2023   64MiB
Fri Aug 18 08:18:29 AM EDT 2023  128MiB
Fri Aug 18 08:18:31 AM EDT 2023  256MiB
Fri Aug 18 08:18:57 AM EDT 2023  512MiB
Killed

Notably, Killed indicates that the kernel terminated the process instead of letting it allocate the requested amount of memory, typically through the out-of-memory (OOM) killer. The overload happened when the script tried to double the variable to 1GiB.

4.2. Precise Size

At this point, to find out a more exact value, we can generate specific-length versions of the variable in a new script, maxvarfine.sh:

$ cat maxvarfine.sh
#!/usr/bin/env bash
for i in {6..10}
do
  echo "$(date) $(numfmt --to=iec-i --suffix=B --padding=7 ${#var})" 
  var=$(tr --delete --complement A-Za-z0-9 </dev/urandom | head --bytes=$((i*100))MiB)
done

To do that, we create a for loop that iterates within the number range {6..10}. Again, we print the current $var size with the date on each iteration. However, when reassigning $var, we use tr to filter redirected input from /dev/urandom. We do so by performing a --delete (-d) on the --complement (-c) of the basic alphanumeric character range, i.e., anything that’s not a Latin letter or a number. This way, head gets a given number of --bytes (-c) consisting of only random alphanumeric characters.

Let’s now check the results:

$ bash maxvarfine.sh
Fri Aug 18 08:18:31 AM EDT 2023  0B
Fri Aug 18 08:19:15 AM EDT 2023  600MiB
Fri Aug 18 08:24:33 AM EDT 2023  700MiB
Killed

Although also a bit crude by default, this method can be as refined as we need it to be.
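As one possible refinement, the sketch below tries each size in a subshell, so a failed allocation doesn't necessarily end the whole script, and lets us pick the start, stop, and step in MiB. The helpers try_size and probe are hypothetical, the defaults are deliberately tiny, and the filler comes from /dev/zero instead of /dev/urandom only to save time. Under memory pressure, the kernel may still kill a different process than the subshell:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a subshell can hold a variable of MIB mebibytes.
try_size() {
  local mib=$1
  (
    var=$(head -c "$((mib * 1024 * 1024))" /dev/zero | tr '\0' 'x')
    [ "${#var}" -eq "$((mib * 1024 * 1024))" ]
  ) 2>/dev/null
}

# Step from START to STOP in STEP-MiB increments and stop at the first failure.
probe() {
  local mib=$1
  local stop=$2
  local step=$3
  local last=0
  while [ "$mib" -le "$stop" ]; do
    if try_size "$mib"; then
      last=$mib
    else
      break
    fi
    mib=$((mib + step))
  done
  echo "largest size that worked: ${last}MiB"
}

# Defaults are small on purpose; pass e.g. 600 1000 10 for a finer real probe.
probe "${1:-1}" "${2:-4}" "${3:-1}"
```

A smaller step narrows down the threshold at the cost of a longer run, since every iteration builds the whole variable again.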

5. Summary

In this article, we talked about context size constraints and environment variable constraints in particular.

In conclusion, limits vary according to many factors, but we do have ways to check the current limits of our environment in theory and practice.