AnswerCult

Question

436 viewsJanuary 2, 2024

Question 100.55K September 20, 2022 0 Comments

Files are just a bunch of bits, how is it possible to make the filesize smaller without removing any of those bits?

In: 3

5 Answers

You are viewing 1 out of 5 answers, click here to view all answers.

Answer 1 · 2022-09-20T03:37:48+00:00

If the sequence of bits is completely arbitrary, it isn’t. You can’t make one possible file smaller without, in principle, making another possible one bigger.

But most applications, whether that’s images or audio or text, tend to have specific patterns more often than other specific patterns. In text, for example, the strings “at” or “ed” are common, while the string “xq” is very rare. So if you use a compression approach that makes “at” or “ed” smaller and “xq” bigger, you will, on average, tend to make your file size smaller.

More generally, the smallest size you can compress files to (given no knowledge except the file itself) is given by their [information entropy](https://en.wikipedia.org/wiki/Entropy_(information_theory)). The “more random” the strings of bits, the higher the entropy. When each string of bits is equally likely, the information entropy is 1 bit per bit, which means you can’t compress things losslessly. But real applications tend to have much less.

English text encoded as ASCII, for example, takes up 1 byte (8 bits) per character, while the information entropy of English text is usually around 1 *bit* per character (because some strings are much more common than others). So you can losslessly compress English text by a factor of about 8.

AnswerCult

How does lossless compression work?

5 Answers

Search questions

Popular Questions

Latest Answers