I understand it’s to save file space, but like…how?
Follow-up: If you have to compress a photo to save on file size (when emailing, for example) couldn’t it just send essentially a text file on how to uncompress it back to its original size and quality? (ex. “Set the resolution to XxY and add X color at Y pixels”)
Thank you!
In: Engineering
Compression works by using short codes for common data, and long codes for rare data.
For example, the standard text encoding uses 8 bits for every letter, which is not efficient for English. The letter E is the most common letter in English, so we can give it a shorter code, say 6 bits. The letter Z is the least common, so we can give it a longer 11-bit code. Because the short codes get used far more often than the long ones, this substitution makes average English text shorter. If some letter doesn’t appear at all, it doesn’t need a code.
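To make the "short codes for common letters" idea concrete, here is a minimal sketch of Huffman coding, one well-known way to build such codes. The function name `huffman_codes` and the sample sentence are made up for illustration, and the exact code lengths it produces depend on the text you feed it (they won't match the 6-bit and 11-bit figures above).

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return {symbol: bitstring}, giving frequent symbols shorter codes."""
    counts = Counter(text)
    if len(counts) == 1:
        # Edge case: a single distinct symbol still needs one bit.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tiebreaker, {symbol: code_so_far}).
    # The tiebreaker keeps Python from ever comparing the dicts.
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two rarest groups...
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))  # ...are merged
        tiebreak += 1
    return heap[0][2]

sample = "the quick brown fox jumps over the lazy dog and then the eager sheep"
codes = huffman_codes(sample)
encoded_bits = sum(len(codes[ch]) for ch in sample)
fixed_bits = 8 * len(sample)  # cost at a flat 8 bits per letter

print(f"'e' -> {codes['e']}, 'z' -> {codes['z']}")
print(f"{encoded_bits} bits instead of {fixed_bits}")
```

Letters that never occur in the sample simply don't show up in `codes`, which is the "doesn’t need a code" case. A real compressor also has to store or transmit the code table itself, which this sketch ignores.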
It can be compressed even further if we consider words: not every combination of letters is a word. We can make a dictionary of all English words and assign them codes according to their frequency. This can compress average text to roughly 25% of its original size.
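A toy version of the word-dictionary idea might look like the sketch below. The helper names (`build_word_table`, `encode_words`, `decode_words`) are hypothetical, and the table is built from the text itself rather than from all English words. It only shows that each word can be replaced by a small number, with the most common words getting the smallest numbers.

```python
from collections import Counter

def build_word_table(text):
    """List the distinct words, most frequent first."""
    return [word for word, _ in Counter(text.split()).most_common()]

def encode_words(text, table):
    """Replace every word with its position in the table."""
    index = {word: i for i, word in enumerate(table)}
    return [index[word] for word in text.split()]

def decode_words(ids, table):
    """Turn the positions back into words."""
    return " ".join(table[i] for i in ids)

text = "the cat saw the dog and the dog saw the cat"
table = build_word_table(text)
ids = encode_words(text, table)

print(table)
print(ids)
print(decode_words(ids, table))
```

Both sides need the same table for this to work, so it either has to be agreed on in advance or sent along with the message.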
ZIP uses a combination of a dictionary and frequency coding. It starts from the assumption that all letters are equally likely and there are no words, but as it reads the file it keeps tallies and adapts the coding to the text. That means the beginning of the text is always badly compressed, but it gets better and better as more of the file is read.
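You can see the "gets better as it reads more" effect with Python's built-in `zlib` module, which implements DEFLATE, the method ZIP files normally use. This is only a rough demonstration with a made-up sentence: it compares the compression ratio of a short input against a long, repetitive one, and doesn't expose the internal tallies.

```python
import zlib

sentence = b"the rain in spain stays mainly in the plain. "

def ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data)) / len(data)

short_ratio = ratio(sentence)        # almost nothing to learn from yet
long_ratio = ratio(sentence * 200)   # plenty of repetition to exploit

print(f"one sentence:  {short_ratio:.2f}")
print(f"200 sentences: {long_ratio:.2f}")
```

The short input has little repetition plus a fixed header overhead, so it barely shrinks, while the long one collapses to a small fraction of its size.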
Note that no lossless compression can shrink every file: there are always files that actually get longer, even if just by 1 bit. A completely random stream of characters cannot be compressed by **any** method; on average, its uncompressed form is already the shortest possible.
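As a quick check of the random-data claim, this sketch feeds pseudo-random bytes to `zlib` (the seed and the 10,000-byte size are arbitrary choices). Pseudo-random is not truly random, but it has no patterns that DEFLATE can find, so it behaves the same way here.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
random_data = bytes(rng.getrandbits(8) for _ in range(10_000))
packed_random = zlib.compress(random_data, 9)  # level 9 = try hardest

print(len(random_data), "bytes in")
print(len(packed_random), "bytes out")
```

The output should come out about the same size as the input or slightly larger, because the compressor still has to add its own header and bookkeeping.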
>couldn’t it just send essentially a text file on how to uncompress it back to its original size and quality? (ex. “Set the resolution to XxY and add X color at Y pixels”)
This will actually make most pictures much longer, because you waste a lot of bits saying “Set the resolution” and “Add color”. All current picture formats simply require that info to be listed in a specific order, so the PNG reader doesn’t need to read “Set resolution”; it just knows that, say, “the 5th number from the beginning is the resolution”.
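Here is a small sketch of that fixed-order idea using the real PNG layout: after an 8-byte signature, the first chunk (IHDR) stores width and height as two 4-byte numbers at known positions. The helper names `make_png_header` and `read_png_size` are made up for this example, and the header it builds is only the start of a file, not a complete image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def make_png_header(width, height):
    """Build the signature plus IHDR chunk only (not a viewable image)."""
    # width, height, bit depth 8, color type 2 (RGB), then three zero flags
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    crc = struct.pack(">I", zlib.crc32(b"IHDR" + ihdr))
    return PNG_SIGNATURE + struct.pack(">I", len(ihdr)) + b"IHDR" + ihdr + crc

def read_png_size(data):
    """Read width and height purely by position, with no labels needed."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    return struct.unpack(">II", data[16:24])

header = make_png_header(640, 480)
size_fields = header[16:24]                        # 8 bytes hold both numbers
as_text = "Set the resolution to 640x480".encode("ascii")

print(read_png_size(header))
print(len(size_fields), "bytes by position vs", len(as_text), "bytes as text")
```

The spelled-out instruction costs several times more bytes than the two positional numbers, and that is just for the resolution, before a single pixel has been described.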