If a .zip file contains all of the information of the original, just in less space, why does it have to be unzipped to access any of it?

43 Answers

Anonymous 0 Comments

Some great metaphors on here. I’ll try one, too.

Say we have two spies with codebooks. If a spy gets a message like “SPARROW,” they look up “SPARROW” in the codebook and it reads “Proceed with plan A immediately.” Or maybe “SWAN” means “Abandon position, return to base by sea.”

If a spy gets an encoded message, he doesn’t know what it means until he looks it up in the codebook. The information’s there, but until he opens the book and looks up the code, he doesn’t know what to do.

ZIP files work the same way. The information’s all there, but you have to go through the process of decoding the information in order to know what it says.

Two special notes, though:

1. You don’t need to unzip the whole file. A program that understands zip files can totally reach in and decode one particular part of the zip file. A number of video games and such take advantage of this, unzipping little pieces or monster image files or whatever as needed.
2. This wasn’t part of your question, but zip files aren’t guaranteed to store things in less space. There are some files which, when encoded, actually get bigger. That’s almost never the case in practice, but only because zip files are intentionally designed to be very effective on stuff like text documents. But if you filled a file with completely random bytes, zipping it would probably result in a slightly larger file.
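Both notes can be tried out with Python's standard `zipfile` and `zlib` modules. This is only a rough sketch: the member names `notes.txt` and `level1.dat` and their contents are made up for illustration, and the archive is built in memory rather than on disk.

```python
import io
import os
import zipfile
import zlib

# Build a small archive in memory with two hypothetical members.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("notes.txt", "hello " * 1000)
    archive.writestr("level1.dat", "monster " * 1000)

# Note 1: decode just one member; the other one is never decompressed.
with zipfile.ZipFile(buffer) as archive:
    one_member = archive.read("notes.txt")

# Note 2: random bytes have no patterns to exploit, so the compressed
# form ends up slightly larger than the input because of format overhead.
random_bytes = os.urandom(100_000)
compressed_random = zlib.compress(random_bytes, 9)

# Repetitive text, by contrast, shrinks a lot.
compressed_text = zlib.compress(b"hello " * 1000, 9)

print(len(one_member))
print(len(random_bytes), len(compressed_random))
print(len(compressed_text))
```

The "reach in" part works because a zip archive keeps a directory of its members and compresses each member separately.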

Anonymous 0 Comments

There is no way to compress a 12-digit number into an 8-digit number, in general. What you _can_ do is find a method that e.g. compresses some 12-digit numbers into 8 digits, and expands some others into 14 digits. That’s what “all of the information but in less space” really means – you found, say, a reversible mapping from 12 digits to 6-20 digits.

Actual compression algorithms are chosen such that usual data like English text or a photo gets compressed, while unusual data you don’t care about, like gibberish or pure noise, gets expanded. I say “unusual”, but that refers to the source where the data came from, such as a camera – for every “usual” file, there are many, many more “unusual” ones. For example, if you randomly reorder the pixels of a 16×16 image, you can get up to 256! ≈ 10^507 images, and most of those look like uniform noise and are not going to come up naturally.
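The "usual versus unusual" point can be seen by compressing the same bytes twice, once in their natural order and once randomly reordered. A minimal sketch, assuming Python's `zlib` module as a stand-in for what a zip tool does; the sample sentence and the fixed seed are arbitrary choices for illustration.

```python
import random
import zlib

# Hypothetical sample: ordinary, repetitive English text.
text = b"the quick brown fox jumps over the lazy dog. " * 200

# The same bytes, randomly reordered (fixed seed so runs are repeatable).
shuffled = bytearray(text)
random.Random(0).shuffle(shuffled)
shuffled = bytes(shuffled)

ordered_size = len(zlib.compress(text, 9))
shuffled_size = len(zlib.compress(shuffled, 9))

# Original size, compressed size in order, compressed size when shuffled.
print(len(text), ordered_size, shuffled_size)
```

Both inputs contain exactly the same bytes, yet the ordered one compresses far better, because the compressor is tuned to the repeated patterns that natural data tends to have.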

Anyway, some compression algorithms exist that allow you to decompress only certain files, or parts of a file, but even in that case, a decompression process needs to happen first – the information itself is encoded in a complicated way, to a format that isn’t directly readable.

Anonymous 0 Comments

The same reason why you have to unpack all of your holiday decorations every year. They are packed for storage, not use. Zip files pack all the data up for maximum storage efficiency, but they have to be “unpacked” before you can use that data.

Anonymous 0 Comments

If you have some flags folded up in a small suitcase, why do you need to unfold the flags to fly them?

Anonymous 0 Comments

Think of it like unwinding a scroll to read. The scroll when wound is compressed, and contains all the text, but in order to read the text the scroll must be unwound.

Anonymous 0 Comments

Computers need data to be structured and aligned in exact lengths (say 32, 64… bits) and must be able to access it in random order. Compression algorithms work effectively only on long chunks of data, and their output is indistinguishable from random gibberish. Current computer architectures can neither randomly access nor interpret data in compressed form, because the original structure is lost and the layout depends on the data being compressed, so there is no general rule for doing that.

Anonymous 0 Comments

If my clothes are in this suitcase, why can’t I put them on? They’re right here!

Anonymous 0 Comments

Think of it as a suitcase: it contains all your clothes, but you can’t wear them until you take them out of the case.

Anonymous 0 Comments

Because it is made as compact as possible. A lot of repeating bits can easily be compressed, because the zip file is something like a book. But instead of wasting a whole page on 48 bits, it will say: “Hey, this piece of information is actually a lot of zeros, so just repeat 4000 zeros, and then the next 48 bits are these for this page. The data after those 48 bits should be interpreted as a new page.”
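The "repeat 4000 zeros" idea is essentially run-length encoding. Here is a rough sketch of it; the function names `rle_encode` and `rle_decode` and the sample bit string are invented for illustration. Real zip files normally use the Deflate method (back-references plus Huffman coding) rather than plain run-length encoding, so treat this as the simplest cousin of the real thing.

```python
from itertools import groupby


def rle_encode(bits: str) -> list[tuple[str, int]]:
    """Collapse each run of identical symbols into (symbol, run length)."""
    return [(symbol, len(list(run))) for symbol, run in groupby(bits)]


def rle_decode(runs: list[tuple[str, int]]) -> str:
    """Expand (symbol, run length) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in runs)


# Hypothetical "page": 4000 zeros followed by a few mixed bits.
page = "0" * 4000 + "101100111000"
runs = rle_encode(page)

print(runs[0])                   # the long run of zeros, stored as one pair
print(len(page), len(runs))      # many symbols, only a handful of runs
print(rle_decode(runs) == page)  # decoding restores the page exactly
```

The encoded form is not readable as a page until `rle_decode` runs, which is the decompression step a program has to pay for.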

Well, why can’t programs access this? The long answer is that they can, but they have to decompress it in order to do anything meaningful with it. Decompressing takes some of the computer’s CPU power. Also, there are many ways to compress a file; if you want to support every type of compression, your program becomes bloated, which can lead to extra bugs.

Relying on just plain uncompressed data is faster and simpler for both your PC and the programmers behind the software.

Anonymous 0 Comments

The best analogy I have for zipping is stenography: writing in shorthand.

“cnyugtsttma5” is not English. But it can be transformed into English if you know the code used to write it.

“cnyugtsttma5” > “cn yu gt st tm a5” > “can you go to the store tomorrow at 5pm”

“cnyugtsttma5” is the zipped message; “can you go to the store tomorrow at 5pm” is the unzipped one.
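The shorthand example can be written out as a tiny decoder. This is just a sketch of the analogy: the `CODEBOOK` table and the `unzip_message` function are made up to match the message above, and they assume every code is exactly two characters long.

```python
# Hypothetical codebook matching the shorthand message above.
CODEBOOK = {
    "cn": "can",
    "yu": "you",
    "gt": "go to",
    "st": "the store",
    "tm": "tomorrow",
    "a5": "at 5pm",
}


def unzip_message(packed: str) -> str:
    """Split the packed text into two-character codes and look each one up."""
    codes = [packed[i:i + 2] for i in range(0, len(packed), 2)]
    return " ".join(CODEBOOK[code] for code in codes)


print(unzip_message("cnyugtsttma5"))
```

Without the codebook and the lookup step, the packed string is just twelve characters that mean nothing, even though all the information is in there.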