1. Introduction

The digital world presents many threats. Among the most common, we can highlight viruses, trojans, denial-of-service attacks, and man-in-the-middle attacks. In general, these threats have different objectives, such as changing system configurations, making a system unavailable, or stealing data.

However, there exist other threats that typically aim to slow down or crash a system, thus making it more vulnerable or increasing the damage caused by other attacks.

Zip bombs are an example of this kind of threat. A zip bomb, also known as a decompression bomb or zip of death, is a malicious file that exploits a characteristic of the zip compressor to crash a system that processes it.

In this tutorial, we’ll study zip bombs, understanding how they are both created and employed in particular attacks.

First, we’ll briefly review zip compression and the basics of its algorithm. Then, we’ll see the fundamentals of a zip bomb and how it works. Finally, we’ll have an overview of how attackers actually use zip bombs.

2. The Zip Compression

First of all, compressing a file consists of re-encoding it. This new encoding, in turn, aims to reduce the number of bytes required to express the data in the file. In short, we compress files to reduce their size.

2.1. Basics of Compression

The result of compression depends on two relevant factors: the compression method and the file’s entropy. First, let’s talk about entropy. 

We can see entropy, in a high-level view, as a property related to the frequency and organization of data. The higher the entropy is, the more heterogeneous and unpredictable a data set is. Thus, it is hard to find data patterns in high entropy files.

On the contrary, low entropy files usually present more homogeneous data and easily recognizable patterns.
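
To make this notion more concrete, we can estimate the entropy of a byte sequence with the Shannon formula. The following is a minimal sketch in Python; the function name shannon_entropy is ours, and it only considers byte frequency, not data organization, so it's a rough indicator rather than a full measure of compressibility:

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Estimate the entropy of data in bits per byte (between 0.0 and 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum p * log2(1 / p) over every byte value that occurs in the data
    return sum((n / total) * math.log2(total / n) for n in counts.values())


low_entropy = b"A" * 10_000        # a single repeated byte: fully predictable
high_entropy = os.urandom(10_000)  # random bytes: close to 8 bits per byte

print(f"low entropy sample:  {shannon_entropy(low_entropy):.3f} bits/byte")
print(f"high entropy sample: {shannon_entropy(high_entropy):.3f} bits/byte")
```

A file made of one repeated byte scores 0 bits per byte, while random data approaches the maximum of 8 bits per byte.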

Most compressors get good compression ratios when dealing with low entropy files. The big challenge, in turn, consists of employing methods that result in a good compression ratio for high entropy files.
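
We can observe this difference with a small experiment. The sketch below uses Python's zlib module as an example compressor (our choice for illustration); the helper compression_ratio and the sample sizes are assumptions, not part of any standard:

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return original size divided by compressed size (higher is better)."""
    compressed = zlib.compress(data, 9)  # 9 = highest compression level
    return len(data) / len(compressed)


low_entropy = b"\x00" * 1_000_000       # one million identical bytes
high_entropy = os.urandom(1_000_000)    # one million random bytes

print(f"low entropy ratio:  {compression_ratio(low_entropy):.1f}")
print(f"high entropy ratio: {compression_ratio(high_entropy):.3f}")
```

The low entropy input should shrink by several orders of magnitude, while the random input typically doesn't shrink at all.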

Accordingly, different compression methods tackle this entropy challenge in different ways. Some methods exploit the data frequency, others focus on data organization, and others concentrate on particular features of a specific file type.

A compressor can employ single or multiple compression methods. Furthermore, it is usual for compressors to use entropy reduction techniques too. These techniques primarily aim to restructure data to create convenient data patterns, allowing for better compression ratios.

2.2. The Zip Compression Method

The earliest influence on current zip-based compressors is the Lempel-Ziv method. This method, typically called LZ, uses a dictionary built iteratively over a sliding window.

The compressing process of LZ iterates over the window and searches for repeated sequences in the dictionary. Each iteration may also result in an update of the dictionary.

If the compressor finds a repeated sequence, it replaces the original data with two numbers: offset (the starting point of the first found sequence in the dictionary) and length (the size of the matching sequence from the offset).
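
To illustrate the offset and length idea, here's a minimal sketch in Python. It makes several assumptions of our own: a greedy search for the longest match, a minimum match length of two bytes, and an offset counted backwards from the current position. The names lz77_tokens and lz77_decode are hypothetical, and real compressors use more elaborate encodings:

```python
def lz77_tokens(data: bytes, window: int = 10) -> list:
    """Greedy sliding-window tokenizer: emits literals or (offset, length) pairs."""
    tokens = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        # Search only the last `window` bytes before the current position
        for j in range(max(0, i - window), i):
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= 2:
            tokens.append((best_off, best_len))  # back-reference
            i += best_len
        else:
            tokens.append(data[i:i + 1])         # literal byte
            i += 1
    return tokens


def lz77_decode(tokens: list) -> bytes:
    """Rebuild the original bytes from the tokens."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, tuple):
            offset, length = token
            for _ in range(length):
                out.append(out[-offset])  # copy byte by byte, overlaps allowed
        else:
            out += token
    return bytes(out)


tokens = lz77_tokens(b"abcabcabcabc")
print(tokens)  # [b'a', b'b', b'c', (3, 9)]
assert lz77_decode(tokens) == b"abcabcabcabc"
```

Here, twelve bytes become three literals and a single pair: "go back 3 bytes and copy 9". The longer the repetition, the more bytes a single pair replaces.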

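The following pseudo-code presents a dictionary-based variant of the LZ method, which writes dictionary codes instead of explicit offset and length pairs: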

algorithm LZCompression(input):
    // INPUT
    //    input = Bytes to be compressed
    // OUTPUT
    //    The input bytes compressed using the LZ compression algorithm

    Create an empty dictionary
    P <- the first byte of the input
    Insert P in the dictionary with code 0

    for C in input, except the first byte:
        if P + C in the dictionary:
            P <- P + C
        else:
            if P not in the dictionary:
                Insert P into the dictionary with the next code

            Write the code for P in the compressed file
            Insert P + C to the dictionary with the next code
            P <- C

    if P not in the dictionary:
        Insert P into the dictionary with the next code
    Write the code for P in the compressed file
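
For a runnable counterpart, the sketch below implements the standard LZW variant in Python rather than a literal translation of the pseudo-code. The main difference, which is our substitution, is that the dictionary starts with all 256 single-byte values, so a decoder can rebuild the same dictionary without receiving it. The function names are ours:

```python
def lzw_compress(data: bytes) -> list:
    """Compress bytes into a list of integer dictionary codes (LZW)."""
    if not data:
        return []
    # Assumption: pre-load every single byte so that each code is decodable
    dictionary = {bytes([i]): i for i in range(256)}
    p = data[:1]
    codes = []
    for byte in data[1:]:
        c = bytes([byte])
        if p + c in dictionary:
            p += c                               # keep extending the match
        else:
            codes.append(dictionary[p])          # emit code of longest match
            dictionary[p + c] = len(dictionary)  # learn the new sequence
            p = c
    codes.append(dictionary[p])
    return codes


def lzw_decompress(codes: list) -> bytes:
    """Rebuild the original bytes, recreating the dictionary on the fly."""
    if not codes:
        return b""
    dictionary = {i: bytes([i]) for i in range(256)}
    prev = dictionary[codes[0]]
    out = bytearray(prev)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == len(dictionary):
            entry = prev + prev[:1]  # code refers to the entry being created
        else:
            raise ValueError("invalid LZW code")
        out += entry
        dictionary[len(dictionary)] = prev + entry[:1]
        prev = entry
    return bytes(out)


codes = lzw_compress(b"a" * 1000)
print(len(codes), "codes for 1000 input bytes")
assert lzw_decompress(codes) == b"a" * 1000
```

With highly repetitive input, each new code covers a longer sequence than the previous one, so the number of codes grows much more slowly than the input.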

We can note that LZ compression gets a particularly good compression ratio when the input file has low entropy, that is, when it repeats data patterns several times. So, let’s see an example of this case using an LZ compressor with a 10-byte sliding window:

[Figure: LZ compression example]

The zip compressor typically follows the DEFLATE compression standard, which combines LZ with additional techniques such as Huffman coding. For our purposes, the most relevant point is that a large file with very low entropy can produce a very small compressed zip file.
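
We can check this behavior with Python's zipfile module. The following is a small, harmless sketch: the helper zip_size, the member name data.bin, and the 10 MiB payload are our assumptions for illustration:

```python
import io
import zipfile


def zip_size(payload: bytes, name: str = "data.bin") -> int:
    """Return the size in bytes of an in-memory zip archive holding payload."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(
        buffer, "w", compression=zipfile.ZIP_DEFLATED, compresslevel=9
    ) as archive:
        archive.writestr(name, payload)
    return len(buffer.getvalue())


payload = b"\x00" * (10 * 1024 * 1024)  # 10 MiB of zeros: very low entropy
compressed = zip_size(payload)
print(f"{len(payload)} bytes -> {compressed} bytes "
      f"(ratio about {len(payload) // compressed}:1)")
```

Since the archive is built in memory, nothing is written to disk. The resulting zip file should be a tiny fraction of the original payload size.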

3. The Zip Bomb Attack

The central idea of zip bomb attacks is to exploit the characteristics of the zip compressor and its techniques to create small and easy-to-transport zip files. However, these files require many computational resources (time, processing, memory, or disk) to uncompress.

The most common objective of a zip bomb is to rapidly consume the available computer memory through a relatively CPU-intensive process. In this way, the attacker expects the victim’s computer to crash at some point.

However, attackers may design zip bombs to exploit other characteristics of software installed on the victim’s computer. For example, some zip bombs aim to crash file systems without consuming all the computer’s memory.

We’ll explore zip bomb categories and processes used to execute a zip bomb attack in the following subsections.

3.1. The File

The first aspect of zip bomb attacks is how the attacker structures the malicious file. Actually, there are several ways to create a zip bomb file.

Each way of creating zip bomb files considers specific aspects of zip compression and compressors to get the smallest files with the highest destructive power. Let’s see some categories of these files:

  • Multi-layered: a zip file containing multiple layers of compressed data. That is, a single zip file contains nested zip files that finally contain a large but low entropy data file. The best-known zip bomb of this category is 42.zip
  • Single-layered: a zip file that includes a set of large and low entropy data files in a single layer. Attackers carefully design the set of data files to achieve the best compression ratio of the zip compressor. Famous examples are zbsm, zblg, and zbxl
  • Self-replicating: this is the most complex zip bomb. It is a zip file that replicates itself when decoded, creating a recursive process. So, these bombs require data files with specific features to work. A known example of this category is the r.zip file
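
To see why layering is so effective, we can do the arithmetic for a multi-layered file. The sketch below only computes sizes and creates no files. The function name and the numbers are hypothetical, and it assumes every archive in every layer holds the same number of nested archives:

```python
def nested_expansion(layers: int, fan_out: int, payload_bytes: int) -> int:
    """Total bytes produced by fully unpacking a layered archive.

    layers        -- number of nested archive levels above the data files
    fan_out       -- archives (or data files) contained in each archive
    payload_bytes -- uncompressed size of each data file at the bottom
    """
    return (fan_out ** layers) * payload_bytes


# Hypothetical example: 3 layers, 10 entries per archive, 1 GiB data files
total = nested_expansion(layers=3, fan_out=10, payload_bytes=2**30)
print(f"{total / 2**30:.0f} GiB after full decompression")  # 1000 GiB
```

The total grows exponentially with the number of layers, while each extra layer adds very little to the size of the outer file.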

Zip bombs of different categories require specific exploits and conditions to be used in an attack. Thus, we’ll study typical scenarios for a zip bomb attack in the following subsection.

3.2. The Attack

A zip bomb is not enough for an attacker to harm a computer system. To achieve some practical result, the attacker must find a way to “explode” the bomb in the victim’s computer.

We’ll focus on three common ways to execute a zip bomb attack: tricking the victim, exploiting the behavior of previously installed software, and infecting the victim with malware.

First: tricking the victim. This method consists of convincing the victim to unzip the bomb. To do so, the attacker can bundle the bomb with a desired, non-malicious file, creating a kind of trojan zip bomb.

Another possibility is using social engineering to make the victim believe that the zip bomb is not a malicious file and voluntarily decompress the file.

Second: exploiting the behavior of previously installed software. We can see this method as one of the most effective ways to “explode” a zip bomb. It relies on programs that immediately uncompress any compressed file they receive.

Examples of these programs are antiviruses that uncompress files to scan their content, searching for malware. Similarly, some browsers immediately uncompress files to show their content.

So, if these programs exist on the victim’s computer and are conveniently configured, downloading a zip bomb can instantly trigger the attack.
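
Programs that uncompress files automatically can reduce this risk by limiting how much data they are willing to extract. The following is a minimal sketch of that idea in Python, not the approach of any particular product. The names ArchiveTooLarge and read_with_budget and the 100 MiB default budget are our assumptions. It counts the bytes actually produced instead of trusting the sizes declared in the archive:

```python
import io
import zipfile


class ArchiveTooLarge(Exception):
    """Raised when an archive expands beyond the allowed budget."""


def read_with_budget(
    archive_bytes: bytes,
    max_total: int = 100 * 1024 * 1024,
    chunk_size: int = 64 * 1024,
) -> dict:
    """Read all members of a zip archive, aborting past max_total bytes."""
    total = 0
    contents = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            parts = []
            with archive.open(info) as member:
                while True:
                    chunk = member.read(chunk_size)  # decompress in pieces
                    if not chunk:
                        break
                    total += len(chunk)
                    if total > max_total:
                        raise ArchiveTooLarge(
                            f"more than {max_total} bytes after decompression"
                        )
                    parts.append(chunk)
            contents[info.filename] = b"".join(parts)
    return contents
```

Because the data is decompressed in small chunks, the check stops the process shortly after the budget is exceeded. A scanner that also opens nested archives would need to share the same budget across all layers and limit the nesting depth.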

Third: infecting the victim with malware. This method consists of infecting the victim’s system with malware that will trigger the attack.

The malware, for instance, can start a recursive uncompress for every downloaded compressed file. Alternatively, it can search and decompress every compressed file in the computer system.

Finally, some categories of zip bombs do not fit well with particular attack methods. For example, tricking the victim may not be adequate for multi-layered bombs, since the victim would need to manually uncompress the nested files of every layer.

4. Conclusion

In this tutorial, we studied zip bombs. First, we had a brief review of data compression and specifically investigated the Lempel-Ziv compression method, the base algorithm used by zip compressors. Then, we learned about zip bombs. In this context, we understood how a zip bomb file is created and the methods used to “explode” it in the victim’s system, executing an attack.

We can conclude that zip files are not malicious by themselves. In fact, zip compressors represented a relevant evolution in the information theory and data compression fields.

However, attackers exploit particular features of the zip compressors to create zip bombs. So, these attackers found a secondary, malicious use for the zip compressor, which became a concern in the security community.