1. Overview

In this tutorial, we’ll look at the “argument list too long” problem, often encountered while working with a large number of files. First, we’ll discuss what’s causing it. Then, we’ll discuss a few solutions that will help us to solve this issue.

2. What Causes the Error

Let’s consider a case where we have a large number of files residing within a directory:

$ ls -lrt | wc -l
230086
$ ls -lrt | tail -5
-rw-r--r-- 1 shubh shubh 0 Apr 30 14:02 events2120038.log
-rw-r--r-- 1 shubh shubh 0 Apr 30 14:02 events2120040.log
-rw-r--r-- 1 shubh shubh 0 Apr 30 14:02 events2120039.log
-rw-r--r-- 1 shubh shubh 0 Apr 30 14:02 events2120042.log
-rw-r--r-- 1 shubh shubh 0 Apr 30 14:02 events2120041.log

Here, we have over 230K log files in our directory. Let’s try to get the count of all filenames that start with the string ‘events’:

$ ls -lrt events* | wc -l
-bash: /usr/bin/ls: Argument list too long
0

Notably, the command fails, citing “Argument list too long” as the reason. Let’s try the rm command to get rid of these files:

$ rm -rf events*.log
-bash: /usr/bin/rm: Argument list too long

Again, the command fails for the same reason.

While performing filename expansion, Bash replaces the asterisk (*) with the name of every matching file. In effect, this produces a very long list of command-line arguments. Bash itself can build this list, but when it tries to start the external command, the kernel rejects the execve() call because the list is too large.

The error occurs when the combined size in bytes of the expanded arguments exceeds the space the kernel allows for them. Note that this space is shared with the environment variables, so the space actually available for arguments is smaller than the limit itself.

The rm command in the previous example expands to:

$ rm -rf events2120038.log events2120040.log ... events0000001.log

Here, the number of arguments becomes equal to the number of matching files. In our case, this is over 230K file names, which makes for a very long argument list. We can use the getconf command to get the current system limit:

$ getconf ARG_MAX
2097152

The ARG_MAX value is the maximum combined size, in bytes, of the arguments and the environment that can be passed to the exec family of functions. This helps the kernel to determine the largest buffer it needs to allocate. These limits can also be verified using the xargs command:

$ xargs --show-limits
Your environment variables take up 2504 bytes
POSIX upper limit on argument length (this system): 2092600
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 2090096
Size of command buffer we are actually using: 131072
Maximum parallelism (--max-procs must be no greater): 2147483647

The information of prime interest here is the ‘upper limit on argument length’, which may vary from system to system.
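To make the limit more concrete, the following is a rough sketch that estimates how many bytes a glob expansion would need and compares that with ARG_MAX. The scratch directory, the sample file names, and the variable names are assumptions made for illustration. The estimate is approximate, since the kernel may also count some per-argument overhead.

```shell
#!/bin/sh
# Rough estimate of how much argument space a glob expansion would need.
# The scratch directory and the file names are illustrative assumptions.

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create 50 empty sample files, one per loop iteration.
i=1
while [ "$i" -le 50 ]; do
  : > "events$i.log"
  i=$((i + 1))
done

arg_max=$(getconf ARG_MAX)

# In Bash, printf is a builtin, so the expanded list never goes through execve().
# Every name is followed by a NUL byte, just like a real argument string.
glob_bytes=$(printf '%s\0' events*.log | wc -c)

# Approximate size of the environment strings that share the same space.
env_bytes=$(env | wc -c)

needed=$((glob_bytes + env_bytes))
if [ "$needed" -lt "$arg_max" ]; then
  verdict="fits"
else
  verdict="too long"
fi

echo "ARG_MAX: $arg_max bytes"
echo "glob expansion: $glob_bytes bytes, environment: $env_bytes bytes"
echo "verdict: $verdict"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Running the same two measurements inside a directory with hundreds of thousands of files would likely report a total above ARG_MAX, which is the situation that triggers the error.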

3. Overcoming the Limitation

Let’s dive into various approaches we can utilize to solve this problem. What all the proposed solutions have in common is that they avoid passing the entire expanded file list to a single command invocation.

3.1. Using the find Command

We can iterate on the list of files using the find command and then use either its -exec option or the xargs command:

$ find . -name "events*" | xargs ls -lrt | wc -l
230085

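First, we fetch the list of all files whose names start with “events” using the find command. Then, xargs reads that list from stdin and splits it into batches that each fit within the limit, running ls once per batch. Finally, wc counts the lines of the combined output. Note that xargs splits its input on whitespace by default, so file names containing spaces or newlines require the -print0 option of find together with the -0 option of xargs.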
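The -exec route can be sketched as follows. This is a minimal example under assumptions: the scratch directory and file names are made up, and -maxdepth is a GNU/BSD extension rather than a POSIX option.

```shell
#!/bin/sh
# Sketch of the -exec variant; paths and names are illustrative assumptions.

workdir=$(mktemp -d)

# 30 matching files plus one that the pattern should skip.
i=1
while [ "$i" -le 30 ]; do
  : > "$workdir/events$i.log"
  i=$((i + 1))
done
: > "$workdir/other.txt"

# "{} +" makes find append as many names as fit to each ls invocation,
# much like xargs does. -maxdepth 1 keeps find from descending into
# subdirectories (GNU/BSD extension, not part of POSIX).
count=$(find "$workdir" -maxdepth 1 -type f -name 'events*' -exec ls -l {} + | wc -l)
echo "matching files: $count"

# The same form works for deletion.
find "$workdir" -maxdepth 1 -type f -name 'events*.log' -exec rm -f {} +
remaining=$(find "$workdir" -maxdepth 1 -type f | wc -l)
echo "files left: $remaining"

# Clean up the scratch directory.
rm -rf "$workdir"
```

Compared with the pipe into xargs, this form hands the names to the command directly, so whitespace in file names needs no special handling.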

3.2. Using the for Loop Approach

Another interesting approach is to iterate on the files using the for loop:

$ for f in events*; do echo "$f"; done | wc -l
230085

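This is one of the simplest techniques to solve the issue. It works because the for loop is part of the shell and echo is a Bash builtin, so the expanded list never has to be passed to an external command. Note that this solution can be slower, though, especially when the loop body starts an external command once per file.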
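When the loop body has to call an external command such as rm, one way to reduce the per-file cost is to collect names into batches inside the loop. The following is only a sketch: the function name remove_in_batches, the batch size, and the sample files are hypothetical.

```shell
#!/bin/sh
# Batched removal inside a for loop. The function name, the batch size,
# and the sample file names are illustrative assumptions.

# Remove files matching events*.log in the current directory,
# passing at most $1 names to each rm invocation.
remove_in_batches() {
  batch_size=$1
  runs=0
  set --                        # reuse the positional parameters as the batch
  for f in events*.log; do
    [ -e "$f" ] || continue     # an unmatched pattern stays literal; skip it
    set -- "$@" "$f"
    if [ "$#" -ge "$batch_size" ]; then
      rm -f -- "$@"
      runs=$((runs + 1))
      set --
    fi
  done
  if [ "$#" -gt 0 ]; then       # remove the leftover partial batch
    rm -f -- "$@"
    runs=$((runs + 1))
  fi
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create 25 empty sample files.
i=1
while [ "$i" -le 25 ]; do
  : > "events$i.log"
  i=$((i + 1))
done

remove_in_batches 10
left=$(ls -A | wc -l)
echo "rm invocations: $runs, files left: $left"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

The batch size only has to be small enough that each rm call stays well below the limit, so a value in the low thousands is usually a reasonable starting point for short file names.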

3.3. Manual Split

We can split the files into smaller batches and execute the commands (such as rm, cp, mv, wc, ls) repeatedly with a different pattern each time:

$ ls -lrt events1*.log | wc -l
31154
$ ls -lrt events2*.log | wc -l
15941

Here, we first filter only the file names starting with “events1”. In this particular example, the resulting list stays within the limit set by ARG_MAX.

Then, we do the same with those starting with “events2”, and so on.
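The repeated commands can also be driven by a loop over the prefixes. The sketch below assumes that the character after “events” is always a digit, and the helper name count_prefix and the sample files are hypothetical. Each subset still has to fit within the limit on its own.

```shell
#!/bin/sh
# Automating the manual split by looping over the digit that follows "events".
# The helper name and the sample file names are illustrative assumptions.

# Print the number of files in the current directory matching events<digit>*.log.
count_prefix() {
  set -- events"$1"*.log
  if [ -e "$1" ]; then          # an unmatched pattern stays literal
    ls -l "$@" | wc -l
  else
    echo 0
  fi
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 40 files starting with "events1" and 20 starting with "events2".
i=100
while [ "$i" -le 139 ]; do : > "events$i.log"; i=$((i + 1)); done
i=200
while [ "$i" -le 219 ]; do : > "events$i.log"; i=$((i + 1)); done

total=0
for d in 0 1 2 3 4 5 6 7 8 9; do
  n=$(count_prefix "$d")
  total=$((total + n))
done
echo "total: $total"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```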

3.4. When We Just Need to Remove the Content of a Directory

Consider a case where we are trying to get rid of all files in a directory and it fails:

$ rm -rf *
-bash: /usr/bin/rm: Argument list too long

To tackle this problem, we can alternatively just delete the directory and create it again:

$ rm -rf /home/shubh/tempdir/logs_archive
$ cd /home/shubh/tempdir && mkdir logs_archive

In this case, the logs_archive directory contained the files we wanted to delete.

Note that since we’re deleting the directory and creating it again, this approach won’t preserve the original permissions or ownership of the directory.
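If the permissions and ownership of the directory need to survive, one possible alternative is to let find delete only the contents and leave the directory itself in place. This is a sketch under assumptions: the paths are made up, and -mindepth and -delete are GNU/BSD extensions rather than POSIX options.

```shell
#!/bin/sh
# Emptying a directory while keeping the directory itself.
# The paths and the chosen mode are illustrative assumptions.

parent=$(mktemp -d)
dir="$parent/logs_archive"
mkdir "$dir"
chmod 750 "$dir"

# Populate it with regular files, a hidden file, and a subdirectory.
i=1
while [ "$i" -le 20 ]; do
  : > "$dir/events$i.log"
  i=$((i + 1))
done
: > "$dir/.hidden"
mkdir "$dir/sub"
: > "$dir/sub/nested.log"

mode_before=$(ls -ld "$dir" | cut -c1-10)

# -mindepth 1 excludes the starting directory itself;
# -delete removes everything below it, deepest entries first.
find "$dir" -mindepth 1 -delete

mode_after=$(ls -ld "$dir" | cut -c1-10)
left=$(find "$dir" -mindepth 1 | wc -l)
if [ -d "$dir" ]; then still_there=yes; else still_there=no; fi

echo "mode before: $mode_before, after: $mode_after"
echo "entries left: $left, directory kept: $still_there"

# Clean up the scratch directory.
rm -rf "$parent"
```

Because find walks the tree itself and never builds one large argument list, this form is not affected by the limit, and it also removes hidden files that the * pattern would skip.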

4. Conclusion

In this tutorial, we looked at multiple techniques to address the “argument list too long” issue.

First, we discussed what’s causing this error. Then, we learned various solutions that can be utilized to solve the problem both in general and in particular cases.