Say you have a sequence of numbers like this:
> 3 3 3 3 3 7 7 7 7 8 8 8 8 8 8
It’s five 3’s, followed by four 7’s, followed by six 8’s. So you can express it in a shorter form like this:
> 5 3 4 7 6 8
That’s basically how a zip file works [1]. You have a compressor program that transforms the long version into the short version, and a decompressor that transforms the short version into the long version.
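The toy scheme above can be sketched in a few lines of Python. This is only an illustration of the simplified count-then-value idea, not how a real zip file is encoded, and the function names `compress` and `decompress` are made up for this example.

```python
from itertools import groupby

def compress(numbers):
    """Turn a sequence into a flat [count, value, count, value, ...] list."""
    out = []
    for value, run in groupby(numbers):
        out.extend([sum(1 for _ in run), value])
    return out

def decompress(pairs):
    """Expand a flat [count, value, count, value, ...] list back out."""
    out = []
    for count, value in zip(pairs[0::2], pairs[1::2]):
        out.extend([value] * count)
    return out

long_form = [3] * 5 + [7] * 4 + [8] * 6
short_form = compress(long_form)
print(short_form)                           # [5, 3, 4, 7, 6, 8]
print(decompress(short_form) == long_form)  # True
```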
Now what happens if we feed this simple two-number sequence into the decompressor?
> 1000000000000000000000 6
The decompressor will spit out 6666666666666666666666666666666666…, a very long sequence of 6’s.
The decompressor itself is perfectly happy to work with this input. It will output 6’s in a loop that will take tens of thousands of years to finish [2].
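To see why the decompressor itself has no trouble, here is a rough sketch of one that hands out values one at a time instead of building the whole output in memory. The name `decompress_lazily` is hypothetical, and the count is assumed to be 10^21 (a 1 followed by 21 zeros). We only peek at the first 34 values, because nobody has time to wait for the rest.

```python
from itertools import islice

def decompress_lazily(pairs):
    """Yield values one at a time from a [count, value, ...] list."""
    for count, value in zip(pairs[0::2], pairs[1::2]):
        remaining = count
        while remaining > 0:
            yield value
            remaining -= 1

bomb = [10**21, 6]  # assumed count: 1 followed by 21 zeros

# Take only the first 34 values; the loop would happily keep going.
first_values = list(islice(decompress_lazily(bomb), 34))
print("".join(str(v) for v in first_values))
```

The loop never needs more than a tiny amount of memory, which is why the decompressor on its own is fine. The trouble starts with whatever is collecting the output.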
However, it’s common for a decompressor to be hooked up to supporting code, and that supporting code sometimes behaves badly when the decompressor outputs a very long sequence.
For example, if you set up the decompressor to write the sequence out to a disk file, it will consume all your disk space. If you set up some code that waits for the decompressor to finish, that code will be waiting for tens of thousands of years.
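As a rough illustration of what more careful supporting code could look like in the toy format, the sketch below adds up the counts before expanding anything and refuses input that would produce too much output. The function name `decompress_with_limit` and the limit of 1,000 values are assumptions made up for this example. Real zip tools work differently and cannot always trust the sizes a file claims.

```python
def decompress_with_limit(pairs, max_values):
    """Expand [count, value, ...] pairs, refusing oversized output."""
    # In this toy format the counts are stated up front, so the total
    # output length is known before a single value is written.
    total = sum(pairs[0::2])
    if total > max_values:
        raise ValueError(f"output would be {total} values, limit is {max_values}")
    out = []
    for count, value in zip(pairs[0::2], pairs[1::2]):
        out.extend([value] * count)
    return out

print(decompress_with_limit([5, 3, 4, 7, 6, 8], max_values=1000))

try:
    decompress_with_limit([10**21, 6], max_values=1000)
except ValueError as err:
    print("refused:", err)
```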
[1] Actually a zip file is a lot more complicated than this. I’m greatly simplifying to keep things ELI5 level. My goal is to help you understand the principles of how a zip bomb works. Diving into the engineering details of the zip file format would just be a distraction.
[2] Assume the decompressor can process a billion 6’s per second. (This assumption is sort-of in the ballpark, but specific performance will vary depending on the specifics of hardware and software.) At this rate it will take approximately 30,000 years to write out the 1000000000000000000000 6’s.
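The estimate in [2] can be checked with a quick calculation. This assumes the count is 10^21 and uses an average year of 365.25 days. The billion-per-second rate is the same rough assumption as in the footnote.

```python
TOTAL_SIXES = 10**21       # 1 followed by 21 zeros
SIXES_PER_SECOND = 10**9   # assumed speed: a billion 6's per second
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

seconds = TOTAL_SIXES / SIXES_PER_SECOND
years = seconds / SECONDS_PER_YEAR
print(f"{years:,.0f} years")  # 31,688 years
```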