1. Overview
As Linux users, we frequently perform various operations on files. For instance, one common operation is to create a file of a certain size. This helps when testing, structuring data, and ensuring proper alignment.
In this tutorial, we’ll first discuss various ways to create files with a specific size. After that, we’ll explore methods to pad an existing file.
2. Using the fallocate Command
fallocate is a simple command that preallocates storage space for a file, creating the file if it doesn’t exist yet.
For example, let’s create a file of 100 MiB:
$ fallocate -l 100M file
$ ls -lh file
-rw-rw-r-- 1 groot groot 100M May 15 20:26 file
In this case, we’re using the -l option to set the length of the file in bytes.
The fallocate command also accepts size suffixes. K, M, and G are short for KiB, MiB, and GiB (powers of 1024), while KB, MB, and GB denote powers of 1000.
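To see how a suffix is interpreted, we can compare the exact byte counts. The following is a minimal sketch, assuming GNU stat and a filesystem that supports fallocate. The scratch directory and the names file_mib and file_mb are hypothetical and only serve the illustration:

```shell
# Work in a scratch directory so no existing files are touched
workdir=$(mktemp -d)

# M (same as MiB) is a power-of-1024 unit, MB is a power-of-1000 unit
fallocate -l 1M "$workdir/file_mib"
fallocate -l 1MB "$workdir/file_mb"

# %s prints the exact size in bytes
size_mib=$(stat -c %s "$workdir/file_mib")
size_mb=$(stat -c %s "$workdir/file_mb")
echo "1M  = $size_mib bytes"
echo "1MB = $size_mb bytes"
```

The two sizes differ by about 5%, which can matter when a file has to match an exact length.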
3. Using the truncate Command
The truncate command can extend or shrink a file to a given size.
Let’s use it to create a file of 200 MiB:
$ truncate -s 200M file
$ ls -lh file
-rw-rw-r-- 1 groot groot 200M May 15 20:36 file
Specifically, we use the -s argument to represent the size of the file in bytes.
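Files created this way are typically sparse: the size is recorded, but no data blocks are written. As a rough sketch, assuming GNU stat and du, we can compare the apparent size with the space actually in use. The name sparse_file is hypothetical:

```shell
workdir=$(mktemp -d)
sparse_file="$workdir/sparse_file"

# Create a 20 MiB file without writing any data
truncate -s 20M "$sparse_file"

# Apparent size in bytes, as ls would report it
apparent=$(stat -c %s "$sparse_file")
echo "apparent size: $apparent bytes"

# Space actually allocated on disk; usually close to zero for a sparse file
du -h "$sparse_file"

# truncate also shrinks: cut the file down to 5 MiB
truncate -s 5M "$sparse_file"
shrunk=$(stat -c %s "$sparse_file")
echo "after shrinking: $shrunk bytes"
```

The value that du reports depends on the filesystem, so we treat it as an indication rather than a guarantee.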
4. Using the head and tail Commands
The head command in combination with the /dev/zero file can create files filled with a set number of ASCII NUL characters:
$ head --bytes 1G /dev/zero > file
$ ls -lh file
-rw-rw-r-- 1 groot groot 1.0G May 15 20:47 file
In this case, the --bytes option represents the desired file size in bytes.
Similarly, the tail command can work in the same way:
$ tail --bytes 1G /dev/zero > file
$ ls -lh file
-rw-rw-r-- 1 groot groot 1.0G May 15 20:52 file
Thus, we produce a file of exactly 1 GiB in both cases.
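To confirm both the size and the contents, we can count the bytes before and after removing every NUL. This is a small sketch with a 1 MiB file to keep it quick, assuming GNU head. The name zero_file is hypothetical:

```shell
workdir=$(mktemp -d)
zero_file="$workdir/zero_file"

# Read 1 MiB of NUL bytes from /dev/zero
head --bytes 1M /dev/zero > "$zero_file"

# Count all bytes, then count what is left after deleting every NUL
total=$(wc -c < "$zero_file")
non_nul=$(tr -d '\0' < "$zero_file" | wc -c)
echo "total bytes: $total, non-NUL bytes: $non_nul"
```

Unlike the truncate approach, the shell redirection writes every byte through the pipe, so the file generally isn’t sparse.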
5. Using the dd Command
The dd command can convert and copy files.
Let’s use dd to create a file of 10 MiB:
$ dd if=/dev/zero of=file bs=1M count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.0387031 s, 271 MB/s
$ ls -lh file
-rw-rw-r-- 1 groot groot 10M May 15 20:58 file
Here, we used several arguments:
- if is the input file
- of is the output file
- bs is the block size in bytes
- count is the number of blocks to copy
Consequently, we verify the file size of the output is as expected.
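Since the output size is bs multiplied by count, we can derive count from a target size. Below is a minimal sketch, assuming GNU dd and stat, and assuming the target is a multiple of the block size. The variable names are hypothetical:

```shell
workdir=$(mktemp -d)
out_file="$workdir/out_file"

target_bytes=$(( 3 * 1024 * 1024 ))   # desired size: 3 MiB
block_bytes=4096                      # block size passed to bs
count=$(( target_bytes / block_bytes ))

# status=none (GNU dd) suppresses the transfer statistics
dd if=/dev/zero of="$out_file" bs="$block_bytes" count="$count" status=none

actual_bytes=$(stat -c %s "$out_file")
echo "wrote $count blocks of $block_bytes bytes: $actual_bytes bytes"
```

If the target isn’t a multiple of the block size, the integer division rounds down and the file ends up short, so choosing a suitable bs matters.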
6. Padding an Existing File
Padding is the process of artificially expanding a given structure to a predefined size. Examples include zero padding a number and algorithms like Base64 and various encryption techniques.
Importantly, we may sometimes need a given file to be of a certain size. This may be necessary for demonstrations, testing, or to adhere to certain requirements. File padding can be physical with actual storage writes or logical, resulting in logically-allocated sparse files.
Critically, padding may make a file unreadable depending on the way we use it after the fact.
Let’s check some tools and methods to perform file padding.
6.1. Using fallocate or truncate
The truncate and fallocate tools can also pad existing files in addition to creating new ones. Importantly, the main difference between the two in this regard is the sparseness of the resulting file.
Assuming file already exists with a size of 200 MiB, we can increase its size by padding it to 666 MiB:
$ ls -lh file
-rw-rw-r-- 1 groot groot 200M May 15 20:36 file
$ truncate -s 666M file
$ ls -lh file
-rw-rw-r-- 1 groot groot 666M May 15 20:36 file
Notably, if the file exists and is smaller than the requested size (-s for truncate, -l for fallocate), both tools extend it to that size, and the added region reads as ASCII NUL bytes. If the existing file is bigger, only truncate reduces it to the desired size. However, this discards everything beyond the new size, so it may result in data loss or data corruption.
Alternatively, we can pad with actually allocated storage instead of a sparse hole via fallocate:
$ fallocate -l 666M file
In either case, the first 200 MiB in the final file are from the original, while the rest reads as NUL.
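We can check that padding keeps the original data in place. The following sketch uses truncate on a tiny file, assuming GNU stat. The same checks should apply after fallocate -l. The name padded is hypothetical:

```shell
workdir=$(mktemp -d)
padded="$workdir/padded"

# Start with a small file holding 6 bytes ("hello" plus a newline)
printf 'hello\n' > "$padded"

# Pad it to 1 MiB; the added region reads as NUL bytes
truncate -s 1M "$padded"

new_size=$(stat -c %s "$padded")
first_line=$(head -n 1 "$padded")
# Everything that is not NUL should still be the original 6 bytes
data_bytes=$(tr -d '\0' < "$padded" | wc -c)
echo "size: $new_size, first line: $first_line, non-NUL bytes: $data_bytes"
```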
6.2. Using dd
Naturally, we can perform some calculations to pad the end of a file via dd:
$ dd if=/dev/zero of=file bs=1 oseek=$FILESIZE count=$(( $PADDEDSIZE - $FILESIZE ))
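Here, we need to have the initial size of file in $FILESIZE and the desired resulting size in $PADDEDSIZE. This way, we calculate the padding as $(( $PADDEDSIZE - $FILESIZE )) and add it to the end of the file by first using oseek to go there. Since bs=1, both oseek and count are expressed in bytes. The padding itself is NUL characters that come from /dev/zero. Notably, GNU dd only accepts oseek since coreutils 9.1, so on older versions we use the equivalent seek operand instead.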
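As a sketch of how the two variables could be filled in, we can read the current size with stat and pick a target. This assumes GNU stat and dd, uses seek in place of oseek so that it also runs on older dd versions, and the sample contents and the 4096-byte target are only illustrative:

```shell
workdir=$(mktemp -d)
file="$workdir/file"

# A 10-byte file to pad
printf '0123456789' > "$file"

FILESIZE=$(stat -c %s "$file")   # current size in bytes
PADDEDSIZE=4096                  # desired size in bytes

# seek is the traditional spelling of oseek; with bs=1 the offset is in bytes
dd if=/dev/zero of="$file" bs=1 seek="$FILESIZE" \
   count=$(( PADDEDSIZE - FILESIZE )) status=none

result_size=$(stat -c %s "$file")
prefix=$(head --bytes "$FILESIZE" "$file")
echo "padded from $FILESIZE to $result_size bytes, data: $prefix"
```

Because bs=1 copies one byte per write, this approach gets slow for large paddings. It also requires $PADDEDSIZE to be larger than $FILESIZE.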
To avoid the calculations, we can use another method:
$ dd if=/dev/zero of=largerfile bs=1M count=666
$ dd if=file of=largerfile bs=1M conv=notrunc
In this case, we first create largerfile with a size of 666 MiB. Notably, this has to be bigger than file. After that, we write our initial file to the beginning of largerfile. Thus, we end up with the contents of file padded to the size of largerfile.
This has the added benefit of preserving the original.
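To verify the result, we can compare the beginning of the padded copy with the original. Below is a small sketch with a 1 MiB target, assuming GNU dd, head, and stat. The sample contents are only illustrative:

```shell
workdir=$(mktemp -d)
file="$workdir/file"
largerfile="$workdir/largerfile"

printf 'original data\n' > "$file"

# Step 1: a 1 MiB file of NUL bytes; step 2: overlay the original at offset 0
dd if=/dev/zero of="$largerfile" bs=1M count=1 status=none
dd if="$file" of="$largerfile" bs=1M conv=notrunc status=none

orig_size=$(stat -c %s "$file")
larger_size=$(stat -c %s "$largerfile")

# Compare the first orig_size bytes of largerfile with the whole original
if head --bytes "$orig_size" "$largerfile" | cmp -s - "$file"; then
    prefix_matches=yes
else
    prefix_matches=no
fi
echo "original: $orig_size bytes, padded copy: $larger_size bytes, prefix matches: $prefix_matches"
```

Without conv=notrunc, the second dd would truncate largerfile to the size of file and undo the padding.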
7. Conclusion
In this tutorial, we discussed five practical methods to create a file of a certain size or pad it with zeroes up to a given amount.
Each of them produces similar results, so we can choose among them according to our preference for sparseness, commands, and methodology.