1. Overview
In this tutorial, we’ll look at the method for wiping a file completely from the disk. Specifically, we’ll learn about the shred command-line tool, which allows us to overwrite an existing file with other contents to effectively “shred” it.
2. Why Do We Need to Shred Files?
In Linux, we run the rm command to delete a file. However, the rm command only removes the directory entry that points to the file's data and marks the blocks as free. It doesn't erase the data itself. This means that until later writes reuse those blocks, it may be possible to recover the content of the file. This can be a problem if the file in question is sensitive.
Shredding a file in Linux means overwriting its content with unrelated data before deleting it. In other words, to wipe the content of the file, we overwrite the file in place with random bytes or zeros. The shred command is one tool for the job.
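To make the idea concrete, here's a minimal sketch of a single in-place overwrite done by hand with head and dd. The helper name overwrite_once and the file demo.txt are hypothetical, and the sketch only illustrates the principle. It isn't how shred is implemented, and it assumes GNU dd for the fsync conversion.

```shell
#!/usr/bin/env bash
# overwrite_once is a hypothetical helper, not part of any package.
# It overwrites a file in place with random bytes of the same length.
overwrite_once() {
    local file=$1
    local size
    size=$(wc -c < "$file")
    # conv=notrunc writes into the existing file instead of truncating it first;
    # conv=fsync (GNU dd) flushes the written data before dd exits.
    head -c "$size" /dev/urandom | dd of="$file" conv=notrunc,fsync 2>/dev/null
}

# Usage on a throwaway file
demo_dir=$(mktemp -d)
printf 'company profit this quarter: $10\n' > "$demo_dir/demo.txt"
overwrite_once "$demo_dir/demo.txt"
```

The file keeps its name and length, but its bytes no longer match the original text.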
3. The shred Command
The shred command overwrites a file's content to wipe it off the disk. It does so by writing arbitrary bytes into the file in place, which makes later recovery of the original content much harder, provided that the filesystem and the storage device really do write the new data over the old.
3.1. Installation
The shred command is part of the GNU Coreutils package and is therefore available on most Linux distributions. We can quickly confirm that it's installed by checking its version:
$ shred --version
shred (GNU coreutils) 8.30
3.2. Basic Usage
At a minimum, the shred command takes the paths of one or more files to shred as its positional arguments:
shred [OPTION]... FILE...
Then, the shred command will overwrite the file in place with a series of random bytes.
For example, let’s say we have a secret.txt file in our system that contains sensitive information:
$ cat secret.txt
company profit this quarter: $10
We can shred the file by running the shred command:
$ shred secret.txt
Now, if we inspect it, we’ll see that the content of the secret.txt file is incomprehensible:
$ cat secret.txt
hJ��"զ0y?J�☺♥Vh�↕�→�Ƣ����Kn6t�→-�?��
c����H�↓ui�w���^�
(truncated)
Furthermore, we can optionally enable the verbose mode using the -v flag to check on the progress:
$ shred -v secret.txt
shred: secret.txt: pass 1/3 (random)...
shred: secret.txt: pass 2/3 (random)...
shred: secret.txt: pass 3/3 (random)...
From the verbose output, we can see that the shred command makes three passes over our secret.txt file by default, and each pass overwrites the content with random bytes.
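Printing binary data with cat can garble the terminal, so the following sketch checks the result with cmp and od instead. The working directory and the before.txt copy are assumptions made only for the comparison. We'd never keep a copy of a real secret.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
printf 'company profit this quarter: $10\n' > "$workdir/secret.txt"
# Keep a copy purely so that we can compare afterwards (demo data only)
cp "$workdir/secret.txt" "$workdir/before.txt"

shred "$workdir/secret.txt"

# cmp -s is silent and exits with a non-zero status when the files differ
if cmp -s "$workdir/secret.txt" "$workdir/before.txt"; then
    echo "content unchanged"
else
    echo "content overwritten"
fi

# Show the first 32 bytes as hex rather than as raw characters
head -c 32 "$workdir/secret.txt" | od -An -tx1

# By default, shred rounds a regular file up to a full block,
# so the shredded file may be larger than the original
wc -c < "$workdir/before.txt"
wc -c < "$workdir/secret.txt"
```

The file still exists under the same name after the command. Only its content has been replaced.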
3.3. Overwrite and Remove
By default, the shred command only overwrites the files and keeps them in place. To overwrite and delete the file in a single command, we can pass the -u or the --remove flag:
$ shred --remove secret.txt
$ ls secret.txt
ls: cannot access 'secret.txt': No such file or directory
Note that if we're running shred on a device file such as /dev/sda, we don't want to pass the --remove flag, because the device file itself should stay in place.
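One way to guard against removing a device file by accident is a small wrapper that passes --remove only for regular files. The function name safe_shred is hypothetical, and this is a sketch of the idea rather than a recommended tool:

```shell
#!/usr/bin/env bash
# safe_shred is a hypothetical wrapper: it shreds and removes regular files only.
safe_shred() {
    local target=$1
    # -f is true for regular files; the -L test additionally rejects symbolic links
    if [ -f "$target" ] && [ ! -L "$target" ]; then
        shred --remove "$target"
    else
        echo "refusing to pass --remove: $target is not a regular file" >&2
        return 1
    fi
}

# Usage on a throwaway file
tmpfile=$(mktemp)
echo "secret" > "$tmpfile"
safe_shred "$tmpfile"
```

After the call, the temporary file no longer exists, while a call such as safe_shred /dev/sda would be rejected before shred runs.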
3.4. Configure the Number of Passes
We can change the number of passes from the default of 3 to any number we want using the -n flag.
For instance, let’s overwrite our secret.txt 10 times using the -n flag:
$ shred -v -n 10 secret.txt
shred: secret.txt: pass 1/10 (random)...
shred: secret.txt: pass 2/10 (000000)...
shred: secret.txt: pass 3/10 (aaaaaa)...
shred: secret.txt: pass 4/10 (ffffff)...
shred: secret.txt: pass 5/10 (b6db6d)...
shred: secret.txt: pass 6/10 (random)...
shred: secret.txt: pass 7/10 (db6db6)...
shred: secret.txt: pass 8/10 (924924)...
shred: secret.txt: pass 9/10 (555555)...
shred: secret.txt: pass 10/10 (random)...
In theory, each additional pass makes it harder to recover the original content. The trade-off is time: every pass rewrites the whole file, so a high pass count can be slow on large files.
3.5. Adding a Zero Overwrite at the End
The shred command has the -z option, which adds a final pass that overwrites the file with zeros. This helps to hide the shredding, because a file full of zeros is less conspicuous than a file full of random bytes:
$ shred -v -z secret.txt
shred: secret.txt: pass 1/4 (random)...
shred: secret.txt: pass 2/4 (random)...
shred: secret.txt: pass 3/4 (random)...
shred: secret.txt: pass 4/4 (000000)...
Note that the final overwrite with zero bytes doesn’t count toward the number of iterations. In other words, specifying the -z option will overwrite the file N + 1 times, where N is the number of iterations configured.
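We can confirm the effect of the final pass by counting the non-zero bytes that remain in the file. The sketch below assumes a throwaway file in a temporary directory and combines -n 2 with -z, which gives 2 + 1 = 3 overwrites in total:

```shell
#!/usr/bin/env bash
zdir=$(mktemp -d)
printf 'company profit this quarter: $10\n' > "$zdir/secret.txt"

# Two random passes (-n 2) followed by one zero pass (-z)
shred -n 2 -z "$zdir/secret.txt"

# Delete every NUL byte and count what is left;
# a result of 0 means the file now holds nothing but zeros
nonzero=$(tr -d '\000' < "$zdir/secret.txt" | wc -c)
echo "non-zero bytes remaining: $nonzero"
```

Since the zero pass always runs last, the count should be 0 regardless of the value passed to -n.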
4. Recursively Shred Files
So far, the examples we’ve seen are shredding a single file by file path. What if we want to shred the entire directory?
Let’s say there’s an entire directory that we want to shred:
$ tree secrets
secrets
|-- engineering
|   |-- critical-system-design.txt
|   `-- possible-zero-day-bug.txt
`-- finance
    |-- accounting.txt
    `-- financial-result-q4.txt

2 directories, 4 files
The shred command has no recursive option. On its own, it would force us to name every file in every subdirectory explicitly.
Fortunately, we can use the find command to recursively walk the directory and run the shred command for each of the matching files. For example, we can recursively shred all the files within the secrets directory:
$ find secrets -type f -exec shred {} \;
In the example above, -type f restricts the matches to regular files, and -exec runs the shred command once for each file that find encounters while walking the secrets directory.
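If we also want the files and the directory tree gone afterwards, we can combine find with the -u flag and then delete the directories that are left empty. The following is a sketch on a throwaway copy of the tree. The -empty and -delete tests are GNU find extensions, and the temporary location is an assumption:

```shell
#!/usr/bin/env bash
# Build a throwaway tree that mirrors the example
root=$(mktemp -d)/secrets
mkdir -p "$root/engineering" "$root/finance"
echo "design" > "$root/engineering/critical-system-design.txt"
echo "bug"    > "$root/engineering/possible-zero-day-bug.txt"
echo "books"  > "$root/finance/accounting.txt"
echo "q4"     > "$root/finance/financial-result-q4.txt"

# Shred and remove every regular file; '+' passes many files to each shred call
find "$root" -type f -exec shred -u {} +

# Remove the directories, deepest first, now that they are empty
find "$root" -depth -type d -empty -delete
```

Ending -exec with + instead of \; starts fewer shred processes, which helps when the tree holds many files.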
5. Word of Caution
The guarantee made by the shred command rests on a very important assumption: that the underlying filesystem overwrites the data in place. For any filesystem that doesn't overwrite data in place, the guarantee might not hold, and recovery could still be possible. These filesystems include:
- log-structured or journaled filesystems
- filesystems with redundant data setups, such as RAID
- filesystems that make snapshots
- filesystems that cache in temporary locations
- compressed filesystems
Besides that, some optimization mechanisms in flash-based storage, such as solid-state drives, can also make the shred command ineffective.
5.1. Log-Structured or Journaled Filesystems
A log-structured filesystem writes new data to a fresh location instead of over the old block, and a filesystem that journals file data keeps additional copies of that data in its journal so that it can recover after a crash. Either way, old content can survive outside the blocks that shred overwrites. Since the shred command relies on the assumption that subsequent writes to the same file land on the same block(s), it's ineffective against these filesystems.
In Linux, the ext3 filesystem is susceptible to this caveat if it’s operating in data=journal mode. In data=ordered and data=writeback modes, the shred command works as expected.
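Before relying on shred, we can check which filesystem a path lives on. The sketch below uses stat -f from GNU Coreutils. The helper describe_fs is hypothetical, and its list of filesystem types is illustrative rather than exhaustive:

```shell
#!/usr/bin/env bash
# describe_fs is a hypothetical helper that maps a filesystem type name,
# as printed by 'stat -f -c %T', to a short hint.
describe_fs() {
    case $1 in
        btrfs|zfs|nilfs|f2fs)
            echo "copy-on-write or log-structured: shred may not overwrite in place" ;;
        ext2/ext3)
            echo "ext family: check the mount options for data=journal" ;;
        *)
            echo "unknown: consult the filesystem documentation" ;;
    esac
}

# Filesystem type of the current directory
fstype=$(stat -f -c %T .)
echo "filesystem type: $fstype"
describe_fs "$fstype"

# List ext3/ext4 mounts with their options, where a data= mode may appear
grep -E ' ext[34] ' /proc/mounts || true
```

GNU stat reports the whole ext family as ext2/ext3, so the mount options are where we'd look for data=journal.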
5.2. Flash Storage Wear-Leveling Mechanism
For flash storage such as the solid-state drive (SSD), the shred command could be ineffective. This is because of the wear-leveling mechanism in most flash storage controllers.
To understand the mechanism, we first have to understand that flash memory cells tolerate only a limited number of write cycles before they wear out. Therefore, to keep heavily rewritten areas from failing early, the controller usually implements a wear-leveling mechanism.
Wear leveling spreads writes across the physical cells so that they age evenly, which prolongs the lifetime of the drive. As a result, when we overwrite a file, the controller may direct the new data to different physical cells and leave the old data intact elsewhere. We therefore can't rely on the shred command to erase data on such devices.
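On Linux, the kernel exposes a rotational flag for each block device under /sys/block, which gives a quick hint about whether a device is flash-based. The helper list_non_rotational below is hypothetical. A value of 0 only means the device doesn't report spinning platters, so virtual disks and loop devices may show up as well:

```shell
#!/usr/bin/env bash
# list_non_rotational is a hypothetical helper. Its optional argument lets us
# point it at another directory; it defaults to /sys/block.
list_non_rotational() {
    local sysdir=${1:-/sys/block}
    local flag dev
    for flag in "$sysdir"/*/queue/rotational; do
        # Skip when the glob matched nothing or the file is unreadable
        [ -r "$flag" ] || continue
        if [ "$(cat "$flag")" = "0" ]; then
            dev=${flag%/queue/rotational}   # strip the trailing path
            echo "${dev##*/}"               # keep only the device name
        fi
    done
}

# Print the devices for which shred alone shouldn't be trusted
list_non_rotational
```

For devices listed here, it's worth looking at the drive's own erase facilities rather than relying on overwriting individual files.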
6. Conclusion
In this tutorial, we first learned that the rm command only removes the pointer to the file, so recovery of the underlying data is possible. Then, we introduced the shred command, which overwrites the content of a file in place to make it unrecoverable. Furthermore, we've seen several options for the shred command that allow us to customize its behavior.
Finally, we’ve seen how the guarantee made by the shred command doesn’t hold if the underlying filesystem or storage mechanism does not overwrite the data in place.