What is the major difference between ZIP, RAR, 7z and other compression algorithms?

Do they use vastly different algorithms? Are any of those tools particularly well suited to one type of scenario over another?

4 Answers

Anonymous

They are all pretty much the same thing with some minor tweaks to the algorithms.

The algorithms all follow the same basic two-step process: a dictionary step, followed by an entropy coding step.

The dictionary step basically looks through the data and builds a dictionary of abbreviations as it goes. It’s quick, but does not compress particularly well on its own. For example, after dictionary coding, the phrase “the cat sat on the mat” might be coded as “!=the;*=at ! c* s* on ! m*”.
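To make that concrete, here is a minimal Python sketch of the substitution idea. The table `ABBREVIATIONS` and the function names are made up for illustration, and the table is hand-picked to match the example; real dictionary coders (the LZ77 family, for instance) discover repeats automatically and emit back-references rather than a fixed table.

```python
# Toy dictionary coder: swap repeated fragments for one-character abbreviations.
# The table is hand-picked (hypothetical); a real coder builds its dictionary
# automatically while scanning the data.
ABBREVIATIONS = {"the": "!", "at": "*"}


def dict_encode(text, table=ABBREVIATIONS):
    # Assumes the abbreviation characters never occur in the input text.
    for phrase, short in table.items():
        text = text.replace(phrase, short)
    return text


def dict_decode(coded, table=ABBREVIATIONS):
    # Undo the substitutions in reverse order.
    for phrase, short in reversed(list(table.items())):
        coded = coded.replace(short, phrase)
    return coded


coded = dict_encode("the cat sat on the mat")
print(coded)               # ! c* s* on ! m*
print(dict_decode(coded))  # the cat sat on the mat
```

The coded text is shorter, but the table itself also has to be stored, which is why this step only pays off when phrases repeat often.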

The problem with this type of dictionary building step is that it goes through the data in order, and takes no account of how frequently phrases occur. Rarely used sequences might get the shortest abbreviation.

Entropy coding techniques vary. However, the most common forms look at how often each symbol or pattern occurs in the data. They are a bit like dictionary techniques, but they count how often patterns are repeated (or, in the cleverer schemes, how likely each one is given what came just before), then give shorter codes to the most common ones and longer codes to the less common ones. There are varying degrees of cleverness in how algorithms estimate what is likely and unlikely when allocating codes.
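As a rough illustration of "shorter codes for more common things", here is a small Huffman coder in Python. It is only a sketch working on single characters; the function name `huffman_codes` is my own, and real tools build their tables over the output of the dictionary step rather than over raw text.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Return a {symbol: bitstring} table built from symbol frequencies."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # Only one distinct symbol: give it a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; their symbols gain one more bit.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


text = "the cat sat on the mat"
codes = huffman_codes(text)
encoded = "".join(codes[ch] for ch in text)
for sym, code in sorted(codes.items(), key=lambda kv: len(kv[1])):
    print(repr(sym), code)
print(len(encoded), "bits, versus", 8 * len(text), "bits uncompressed")
```

Frequent characters such as the space and "t" end up with the shortest bit strings, while characters that appear once get the longest.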

The entropy coder is where most of the differences between Zip, Rar and 7z are: Zip’s usual method (Deflate) uses Huffman coding for this step; Rar can use an algorithm called prediction by partial matching (PPM) with information inheritance; 7z’s default method (LZMA) uses a range coder driven by Markov-chain-style probability models.
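If you want to see the difference yourself, Python's standard library ships the Deflate algorithm (in `zlib`) and LZMA (in `lzma`). The sketch below compresses the same made-up sample with both. Note the assumptions: it exercises the algorithms only, not the actual .zip or .7z container formats, and RAR is left out because the standard library has no module for it.

```python
import lzma
import zlib

# Repetitive sample data (made up); real files will give different ratios.
data = b"the cat sat on the mat. " * 2000

deflated = zlib.compress(data, 9)  # Deflate: LZ77-style dictionary + Huffman
lzma_blob = lzma.compress(data)    # LZMA: dictionary + range coder (.xz wrapper)

print("original:", len(data), "bytes")
print("Deflate :", len(deflated), "bytes")
print("LZMA    :", len(lzma_blob), "bytes")

# Both are lossless: decompressing gives back the exact original bytes.
assert zlib.decompress(deflated) == data
assert lzma.decompress(lzma_blob) == data
```

Which one wins, and by how much, depends heavily on the input, so it is worth trying this on the kind of files you actually care about.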
