There are some good answers here about how lossless compression works, and those are really useful. But the answers about lossy compression are lacking a bit.
With lossy compression, some of the data is literally discarded during compression; then, when you reopen the file, the computer makes educated guesses about what used to be there. As an example, you could remove all of the u’s following q’s, the s’s from the ends of plural words, the apostrophes from contractions, and all of the punctuation. It’s pretty likely that, given the rules the computer used when compressing the file, you could look at the resulting text and figure out what was supposed to go where based on those rules and the context. For example:
This is the original text, which I thought up rather quickly. It’s not the best example possible, but it should work well for our purposes.
Becomes:
This is the original text which I thought up rather qickly Its not the best example possible but it should work well for our purpose
Not substantially shorter in this case, but our algorithm wasn’t very optimized either. The more rules you add, the smaller the file gets.
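If you’re curious what that looks like as code, here’s a minimal Python sketch of those rules. It’s a toy, not a real codec: the plural-s rule is left out because telling a plural apart from a word that just happens to end in s takes a dictionary and context, and the compress/decompress names are made up for this example.

```python
import re

def compress(text):
    # Throw away characters a reader could plausibly guess back.
    text = re.sub(r"([Qq])u", r"\1", text)  # drop the u that follows every q
    text = text.replace("'", "").replace("’", "")  # drop apostrophes from contractions
    text = re.sub(r"[.,;:!?]", "", text)    # drop punctuation entirely
    return text

def decompress(text):
    # The u after a q is almost always safe to guess back in English.
    text = re.sub(r"([Qq])(?!u)", r"\1u", text)
    # Apostrophes and punctuation would have to be guessed from context;
    # this sketch doesn't try, so that information stays lost.
    return text

original = ("This is the original text, which I thought up rather quickly. "
            "It's not the best example possible, but it should work well "
            "for our purposes.")
packed = compress(original)
print(packed)               # matches the compressed version above (minus the plural rule)
print(decompress(packed))   # close to the original, but not identical: that's the loss
```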
It’s not really ideal for text, but it works well for a lot of artistic data that just needs to be close enough. Common examples of lossy-compressed files are JPEG pictures and MP3 audio files. It doesn’t matter whether a specific pixel in our picture is the exact right color, so long as it’s about right given the surrounding pixels.
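To give a flavor of the picture case: real JPEG works on frequency coefficients rather than raw pixel values, but the underlying move of rounding away precision you won’t miss looks roughly like this sketch (the numbers here are made up):

```python
def quantize(pixels, step=32):
    # Round each 0-255 value to the nearest multiple of `step`.
    # Coarser steps mean fewer distinct values to store (smaller file),
    # but the exact original values are gone for good.
    return [round(p / step) * step for p in pixels]

row = [201, 198, 203, 197, 64, 61]  # hypothetical grayscale pixel values
print(quantize(row))                # [192, 192, 192, 192, 64, 64]
```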