1. Overview
Zipping a file or directory allows us to optimize storage, backup and archive data, and facilitate easy distribution of files.
In this tutorial, we’ll learn to zip each subdirectory of a given directory to a separate zip file. We’ll achieve this with the help of a basic Bash script.
Before we start, let us first understand the problem.
2. Understanding the Problem
Suppose we have the following directory structure inside the parent directory /mydir:
.
├── dir_1
│   ├── sub_dir_11
│   ├── sub_dir_12
│   └── sub_dir_13
├── dir_2
│   └── sub_dir_22
│       └── sub_dir_22_1
│           ├── sub_dir_22_2
│           └── sub_dir_22_3
├── dir_3
│   └── sub_dir_31
├── dir_4
└── my_file.txt
The goal is to zip each immediate subdirectory of the parent directory /mydir into its own archive. Given the above directory structure, the Bash script should generate four zip files corresponding to dir_1, dir_2, dir_3, and dir_4, and it should ignore the regular file my_file.txt.
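To follow along, the sample layout can be recreated in a scratch location. The sketch below is only an illustration: the make_sample_tree function name is hypothetical, and it assumes that sub_dir_22_2 and sub_dir_22_3 sit inside sub_dir_22_1, which in turn sits inside sub_dir_22.

```shell
#!/bin/bash
# Hypothetical helper: recreate the sample layout under the given root.
make_sample_tree() {
    local root=$1
    mkdir -p \
        "$root/dir_1/sub_dir_11" \
        "$root/dir_1/sub_dir_12" \
        "$root/dir_1/sub_dir_13" \
        "$root/dir_2/sub_dir_22/sub_dir_22_1/sub_dir_22_2" \
        "$root/dir_2/sub_dir_22/sub_dir_22_1/sub_dir_22_3" \
        "$root/dir_3/sub_dir_31" \
        "$root/dir_4" || return 1
    # A regular file next to the directories; the zip scripts should ignore it
    touch "$root/my_file.txt"
}
```

Calling make_sample_tree /mydir (or passing a temporary directory created with mktemp -d) gives a safe place to try the scripts in this tutorial.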
Now that we better understand what we are trying to achieve, let’s look into the Bash script to accomplish this.
3. Using a Shell Script
To achieve the desired result, we can use a simple Bash script that iterates over the subdirectories of the primary directory and zips each one into its own .zip file. Here's an example script we can use:
#!/bin/bash
# Change to the primary directory; stop if that fails
cd /mydir || exit 1
# Iterate over subdirectories (the trailing slash matches directories only)
for subdir in */; do
    # Extract the subdirectory name without the trailing slash
    dirname=$(basename "$subdir")
    # Zip the subdirectory into a unique .zip file
    zip -r "$dirname.zip" "$subdir"
done
Here, we are iterating over all the subdirectories in the /mydir directory. Further, we use the zip command to create a zip file. The zip file has the same name as the subdirectory, along with the .zip extension. The -r flag stands for “recursive,” which means it will include all files and subdirectories within the subdirectory in the zip file.
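Before creating any archives, it can help to preview what the loop would do. The sketch below is a dry run built on the same loop: the preview_zip_targets function name is hypothetical, and instead of calling zip it only prints the archive name next to each matched subdirectory.

```shell
#!/bin/bash
# Hypothetical dry run: print "<archive> <- <subdirectory>" for each
# immediate subdirectory of the given parent, without creating anything.
preview_zip_targets() {
    local parent=$1 subdir
    (
        cd "$parent" || exit 1
        for subdir in */; do
            # With no subdirectories, the pattern stays as the literal "*/"
            [ -d "$subdir" ] || continue
            printf '%s.zip <- %s\n' "$(basename "$subdir")" "$subdir"
        done
    )
}
```

For the sample layout, preview_zip_targets /mydir should list the four dir_N entries and skip my_file.txt, because the */ pattern matches directories only.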
The above script generates four zip files, dir_1.zip, dir_2.zip, dir_3.zip, and dir_4.zip, in the parent directory /mydir:
$ ls *zip
dir_1.zip dir_2.zip dir_3.zip dir_4.zip
3.1. Nested Subdirectories
The above script doesn't create zip files for subdirectories nested deeper inside the parent directory /mydir. This means that there is no zip file corresponding to sub_dir_11, sub_dir_12, sub_dir_13, and so on.
To handle nested subdirectories, we'll use a recursive approach. First, we'll use the find command to list all subdirectories. Next, we'll loop through them to create the zip files.
Let’s now look at the updated script that can handle nested subdirectories:
#!/bin/bash
# Change to the primary directory; stop if that fails
cd /mydir || exit 1
# Find all subdirectories (including nested ones) and loop through them
find . -type d -print0 | while IFS= read -r -d '' subdir; do
    # Extract the subdirectory name
    dirname=$(basename "$subdir")
    # Skip the primary directory itself
    if [ "$dirname" != "." ]; then
        # Zip the subdirectory into a unique .zip file
        zip -r "$dirname.zip" "$subdir"
    fi
done
The above script uses the find command to locate all subdirectories, including nested ones, and then creates a zip archive for each of them. The -type d test restricts find to directories. The -print0 action separates the directory names with a null character instead of a newline, which ensures that directory names with spaces or special characters are handled correctly.
The while IFS= read -r -d '' subdir; do loop reads each directory name from the null-delimited output of find and assigns it to the subdir variable. The IFS= read -r -d '' construct is a common idiom for reading null-delimited data in Bash.
The script also includes the condition if [ "$dirname" != "." ]; then to skip the primary directory itself, since we don't need to zip it.
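The null-delimited loop can be tried on its own, without zipping anything. The following sketch uses a hypothetical list_subdirs function and swaps the "." check for the -mindepth 1 option, which GNU and BSD find support; this is an alternative to the script's condition, not what the script itself does.

```shell
#!/bin/bash
# Hypothetical helper: print every subdirectory below the given root,
# one per line in brackets, to show that unusual names stay intact.
list_subdirs() {
    local root=$1 subdir
    (
        cd "$root" || exit 1
        # -mindepth 1 leaves out the starting directory "." itself
        find . -mindepth 1 -type d -print0 |
            while IFS= read -r -d '' subdir; do
                # Strip the leading "./" that find adds to each path
                printf '[%s]\n' "${subdir#./}"
            done
    )
}
```

A directory named "dir 1" is printed as a single bracketed entry rather than being split at the space, which is the property the zip loop relies on.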
$ ls *zip
dir_1.zip dir_2.zip dir_3.zip dir_4.zip sub_dir_11.zip sub_dir_12.zip sub_dir_13.zip sub_dir_22.zip sub_dir_22_1.zip sub_dir_22_2.zip sub_dir_22_3.zip sub_dir_31.zip
This time, a separate zip archive is created for every subdirectory of /mydir, including the nested ones.
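One caveat, which is an observation about the script rather than something covered above: the archive name comes from basename alone, so two nested directories that share a name under different parents would target the same .zip file. A possible workaround, sketched below with a hypothetical archive_name_for helper, is to derive the name from the whole relative path by replacing each / with an underscore.

```shell
#!/bin/bash
# Hypothetical helper: turn a path as printed by "find ." into an
# archive name that keeps the parent directories, for example
# ./dir_1/sub_dir_11 -> dir_1_sub_dir_11.zip
archive_name_for() {
    local rel=${1#./}    # drop the leading "./"
    rel=${rel%/}         # drop a trailing slash, if any
    # Replace every "/" with "_" and append the extension
    printf '%s.zip\n' "${rel//\//_}"
}
```

Inside the loop, zip -r "$(archive_name_for "$subdir")" "$subdir" would then replace the basename-based name. Names can still clash in rare cases, for instance dir_1/a_b and dir_1_a/b, so this is a mitigation rather than a guarantee.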
3.2. Parallel Execution
The above scripts work fine with a small number of subdirectories. For directories containing many subdirectories, we can speed things up with parallel execution:
$ for i in */; do zip -0 -r "${i%/}.zip" "$i" & done; wait
This one-liner creates a zip archive for each immediate subdirectory. The ${i%/} expansion strips the trailing slash from the directory name before .zip is appended. Note that we run the zip command with the -0 option, which stores files without compressing them, so the archives are created quickly. The & at the end of the zip command puts it in the background, allowing multiple subdirectories to be zipped concurrently, potentially speeding up the process.
Finally, the wait command ensures that the script waits for all background zipping processes to finish before proceeding.
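The one-liner starts one background zip process per subdirectory, all at once, which may overload the machine when there are very many subdirectories. The sketch below is one possible way to cap that, assuming a hypothetical zip_subdirs_bounded function and an arbitrary default of four jobs per batch: it starts a batch of jobs, waits for the whole batch to finish, and then continues with the next one.

```shell
#!/bin/bash
# Hypothetical helper: zip each immediate subdirectory of a parent
# directory, running at most max_jobs zip processes at a time.
zip_subdirs_bounded() {
    local parent=$1 max_jobs=${2:-4} running=0 dir
    (
        cd "$parent" || exit 1
        for dir in */; do
            # Skip the literal "*/" when there are no subdirectories
            [ -d "$dir" ] || continue
            # -0 stores without compression, -q keeps the output quiet
            zip -0 -r -q "${dir%/}.zip" "$dir" &
            running=$((running + 1))
            # Once a batch is full, wait for all of its jobs to finish
            if [ "$running" -ge "$max_jobs" ]; then
                wait
                running=0
            fi
        done
        # Wait for the jobs of the last, possibly partial, batch
        wait
    )
}
```

For example, zip_subdirs_bounded /mydir 2 would zip the four sample directories two at a time. Batching is simple but not optimal, since one slow job holds back its whole batch; it is meant as a starting point rather than a tuned solution.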
4. Conclusion
In this article, we learned how to zip multiple directories in Linux.
First, we discussed a basic script that creates a zip file for each subdirectory of the parent directory. Next, we handled the corner case of nested subdirectories. Finally, we looked at a faster solution using parallel execution.