What is a .zip bomb and how does it work?

6 Answers

Anonymous 0 Comments

A .zip bomb is a small archive designed to expand into an enormous amount of data when it is uncompressed, either by containing data that compresses extremely well or by nesting copies of archives inside itself. The idea is for the bomb to use up the computer’s processing power, memory, or disk space. The bomb does not infect anything by itself, but if it ties up or crashes an antivirus scanner, other malware can slip past while the scanner is busy.

Anonymous 0 Comments

Imagine that you’re packing for a trip and you compress all of your clothes into a small bag.

You manage to zip it closed, but when you arrive at your destination and open the bag, everything bursts out and a shit ton of clothes ends up scattered all over the place.

Zip bombs are, well, compressed files. Usually they start as a single, highly repetitive file, which is compressed, duplicated, compressed again, duplicated again, and so on, until you’re left with one small zip file that consumes a huge amount of storage space when unzipped.

Anonymous 0 Comments

A zip bomb is just a zip file that contains very well compressed stuff, and typically contains more zip files. Getting a 1000-to-1 compression ratio (that is, a 1 megabyte ZIP file that expands into 1000 megabytes of files) is doable, and would be a good first step in building a zip bomb. Do this a few times, put a few copies of the resulting ZIP file into a ZIP file, maybe ZIP that up, and so on. If you were to extract every layer of the ZIP file, the RAM and/or disk usage would be shockingly high.
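To get a feel for that 1000-to-1 figure, here is a minimal sketch using Python's standard `zlib` module, which implements DEFLATE, the compression method ZIP files normally use. The 10 MB size and the variable names are just illustrative choices, and the exact ratio you see may vary a little between zlib versions.

```python
import zlib

# 10 MB of zero bytes: about the most compressible input there is.
original = bytes(10_000_000)

# Level 9 asks zlib for its strongest (slowest) compression setting.
packed = zlib.compress(original, 9)

ratio = len(original) / len(packed)
print(f"{len(original):,} bytes -> {len(packed):,} bytes (about {ratio:.0f}-to-1)")
```

Highly repetitive input like this is the best case. Ordinary documents or photos will not get anywhere near this ratio.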

There is a well-known zip bomb named 42.zip, known for being about 42 kilobytes and decompressing into petabytes of data (roughly 4.5 petabytes) when all its layers are fully unpacked. There is also a proof-of-concept zip file that contains 2 files: a picture, and then *itself*, resulting in an infinite ZIP unpacking process.

There is software, most notably antivirus, that will unpack ZIP files automatically. Zip bombs were originally designed to wreak havoc on these programs. Those programs have since been modified to recognize a probable zip bomb and deal with it, probably by flagging the archive itself as a type of malware.
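One plausible way a program could spot a probable zip bomb is to read the sizes the archive declares before extracting anything. The sketch below uses Python's standard `zipfile` module. The function names and both thresholds are hypothetical, and this is not a description of how any particular antivirus product actually works.

```python
import io
import zipfile

MAX_TOTAL_BYTES = 100 * 1024 * 1024  # hypothetical cap: 100 MiB unpacked
MAX_RATIO = 100                      # hypothetical cap: 100-to-1 overall


def looks_like_zip_bomb(zip_bytes: bytes) -> bool:
    """Return True if the archive's declared sizes exceed our limits."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        infos = archive.infolist()
        total_unpacked = sum(info.file_size for info in infos)
        total_packed = sum(info.compress_size for info in infos)
    if total_unpacked > MAX_TOTAL_BYTES:
        return True
    return total_packed > 0 and total_unpacked / total_packed > MAX_RATIO


def make_zip(name: str, payload: bytes) -> bytes:
    """Helper for the demo: build a one-file ZIP archive in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(name, payload)
    return buffer.getvalue()


suspicious = make_zip("zeros.bin", bytes(5_000_000))  # 5 MB of zeros
ordinary = make_zip("note.txt", b"Meet me at noon.")
print(looks_like_zip_bomb(suspicious), looks_like_zip_bomb(ordinary))  # True False
```

This only checks one layer and trusts the sizes written in the archive's headers, which a malicious file can misstate. A careful scanner would also cap the bytes actually produced during extraction and limit how deep it follows nested archives.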

Anonymous 0 Comments

One thing that the other answers don’t indicate well is how you can compress so much material into a small file. It has to do with the nature of how file compression works.

Imagine I had a 1 GB text file, but every character in the file was just the number “1.” In a compressed file, this might be compressed by simply giving the instruction, “‘1’x1GB”. And so a very tiny compressed file (7 characters in my fake example) could be turned into a gigantic file on the other end (~1 billion characters).

When your computer opens a ZIP file, it follows a set of rules according to what the file says it has inside of it. A ZIP bomb is a ZIP file with rules that are intentionally malicious. It’s basically a little file that says, “spend all of your time and energy and memory writing out meaningless junk.” (And in this case it has to be meaningless, if the compression ratio is going to be that high. The reason that ZIP files can compress “real” data only so much is that real data has structure and variance most of the time, and so there’s a limit to how far you can reduce it to instructions like this, which generally focus on repeated sets of characters.)

For an ELI5 analogy, imagine I have someone who types up my notes for me. Except I don’t have to just give them my notes; I can put instructions on the notes. So I send them a tiny Post-It note, but it says, “write the number 1 a trillion times.” In real life, the person would probably laugh and/or quit the job, but your computer doesn’t have that option — unless its programmers anticipated this problem (which some modern archive programs might), it will just dumbly follow the instructions in the file, even if it is ruinous for the overall machine’s performance.
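Here is a rough sketch of what "anticipating the problem" can look like in code: a decompressor that refuses to write out more than a fixed amount. It uses Python's standard `zlib` module on a raw zlib stream rather than a full ZIP archive. The name `safe_decompress` and the 1 MB limit are assumptions made up for this example.

```python
import zlib

MAX_OUTPUT = 1_000_000  # hypothetical cap: refuse to produce more than 1 MB


def safe_decompress(data: bytes, limit: int = MAX_OUTPUT) -> bytes:
    """Decompress zlib data, raising ValueError if the output exceeds `limit`."""
    decompressor = zlib.decompressobj()
    # Ask for at most limit + 1 bytes: one extra byte is enough to prove overflow.
    output = decompressor.decompress(data, limit + 1)
    if len(output) > limit:
        raise ValueError("refusing to decompress: output exceeds limit")
    return output


small = zlib.compress(b"hello")
bomb = zlib.compress(bytes(50_000_000))  # 50 MB of zeros, tiny once compressed

print(safe_decompress(small))  # b'hello'
try:
    safe_decompress(bomb)
except ValueError as error:
    print(error)  # refusing to decompress: output exceeds limit
```

The key point is that the limit is enforced while decompressing, so the program never has to hold the full 50 MB in memory to find out that it is too big.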

Anonymous 0 Comments

When you open/unzip a file, your computer will usually “trust” the file. After all, this is only unfolding the package; the content of the package is not executed, so there is nothing that can go wrong during the unfolding, right?

And that’s mostly true. The only thing that can go wrong is that the unfolding might take some time and some resources. But that’s to be expected: if the user wants to unfold a big file, they should expect it to take a lot of time.

Here comes the first kind of trap you can set: a file which appears to be small but is compressed in such a way that it takes a surprisingly large amount of time and resources to open/unzip.

.zip bombs are the extreme case of this trap. People have found ways to build a .zip file which is extremely small, but which takes so much time and so many resources when opened/unfolded that it might crash your computer.

How is it done? Simply put, a zip file contains a small set of instructions on how to recreate the original file, and it is well known that even a small set of rules can describe very big things. A good comparison to what is happening in a .zip file is the tale of the Wheat on a Chess board:

>As a reward for his service to the king, the advisor asked for something very simple: “I’d like grains of wheat, as dictated by my favourite game: chess. On the first square, please place one grain of wheat, on the second 2, on the third double that for a total of 4, on the fourth double again, and continue to double up until the last of the 64 squares. I will be content with that amount of wheat.” Surprised by such a simple request, the King agreed, but little did he know that he would need to sell his Kingdom and much more to provide that much wheat.
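To see how quickly that doubling gets out of hand, the advisor's request can be added up in a few lines of Python (the variable names are just for this example):

```python
# Grains on each of the 64 squares: 1, 2, 4, ... doubling every time.
grains_per_square = [2 ** square for square in range(64)]

total = sum(grains_per_square)
print(f"{total:,}")          # 18,446,744,073,709,551,615
assert total == 2 ** 64 - 1  # closed form for the doubling sum
```

A one-sentence rule ("double it 63 times") describes more than 18 quintillion grains, which is the same trick a zip bomb plays with bytes.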

Anonymous 0 Comments

Say you have a sequence of numbers like this:

> 3 3 3 3 3 7 7 7 7 8 8 8 8 8 8

It’s five 3’s, followed by four 7’s, followed by six 8’s. So you can express it in a shorter form like this:

> 5 3 4 7 6 8

That’s basically how a zip file works [1]. You have a compressor program that transforms the long version into the short version, and a decompressor that transforms the short version into the long version.
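As a toy illustration of that compressor/decompressor pair, here is the "count, value" scheme from above in Python. This is run-length encoding, the simplified stand-in used in this answer, not the real ZIP algorithm, and the function names are made up for the example.

```python
from itertools import groupby


def rle_encode(values):
    """Turn runs of equal values into (count, value) pairs, flattened."""
    encoded = []
    for value, run in groupby(values):
        encoded += [len(list(run)), value]
    return encoded


def rle_decode(encoded):
    """Expand flattened (count, value) pairs back into the long form."""
    decoded = []
    for count, value in zip(encoded[0::2], encoded[1::2]):
        decoded += [value] * count
    return decoded


long_form = [3, 3, 3, 3, 3, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8]
short_form = rle_encode(long_form)
print(short_form)                           # [5, 3, 4, 7, 6, 8]
print(rle_decode(short_form) == long_form)  # True
```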

Now what happens if we feed this simple two-number sequence into the decompressor?

> 1000000000000000000000 6

The decompressor will spit out 6666666666666666666666666666666666…, a very long sequence of 6’s.

The decompressor itself is perfectly happy to work with this input. The decompressor will output 6’s in a loop that will take thousands of years to finish [2].

However, it’s common for a decompressor to be hooked up to supporting code, and that supporting code sometimes behaves badly when the decompressor outputs a very long sequence.

For example, if you set up the decompressor to write the sequence out to a disk file, it will consume all your disk space. If you set up some code that waits for the decompressor to finish, that code will be waiting for thousands of years.
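One way supporting code can protect itself is to pull output from the decompressor a piece at a time and give up once it passes a limit. The sketch below does this for the toy "count, value" format from this answer. The names `rle_decode_lazy` and `MAX_ITEMS`, and the limit of 20, are hypothetical.

```python
from itertools import islice


def rle_decode_lazy(encoded):
    """Yield decoded values one at a time instead of building a giant list."""
    for count, value in zip(encoded[0::2], encoded[1::2]):
        emitted = 0
        while emitted < count:
            yield value
            emitted += 1


MAX_ITEMS = 20  # hypothetical cap on how much output we are willing to accept

bomb = [1000000000000000000000, 6]

# Take at most MAX_ITEMS + 1 values: the extra one tells us there was more.
preview = list(islice(rle_decode_lazy(bomb), MAX_ITEMS + 1))
too_long = len(preview) > MAX_ITEMS

print(too_long)  # True
```

The decompressor is unchanged in spirit. The only difference is that the caller decides when to stop listening, so it finishes immediately instead of waiting for the full sequence.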

[1] Actually a zip file is a lot more complicated than this. I’m greatly simplifying to keep things ELI5 level. My goal is to help you understand the principles of how a zip bomb works. Diving into the engineering details of the zip file format would just be a distraction.

[2] Assume the decompressor can process a billion 6’s per second. (This assumption is sort-of in the ballpark, but specific performance will vary depending on the specifics of hardware and software.) At this rate it will take approximately 30,000 years to write out the 1000000000000000000000 6’s.
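The arithmetic in [2] can be checked in a few lines of Python, using the same assumed rate of a billion 6's per second and a 365.25-day year:

```python
total_sixes = 1000000000000000000000      # the count from the example, 10**21
sixes_per_second = 1_000_000_000          # assumed speed from footnote [2]
seconds_per_year = 365.25 * 24 * 60 * 60  # 31,557,600 seconds

years = total_sixes / sixes_per_second / seconds_per_year
print(f"about {years:,.0f} years")
```

That works out to a little under 32,000 years, in line with the "approximately 30,000 years" estimate.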