What exactly happens when a ZIP file is extracted?


I know the structure of a ZIP file, but how does a computer go about reassembling it into a directory? I’m not asking about compression algorithms, what I mean is in what order does it decompress the files, and in what order does it write them to storage?


4 Answers

Anonymous 0 Comments

[removed]

Anonymous 0 Comments

There’s an entry in the zip file for every file (and usually for every folder), and the extraction software can also work out the structure from the file names. For example, if an entry has the name “folder/filename.txt”, the software recognizes the / character as a folder separator and knows it needs to create the folder “folder” if it does not exist, then put “filename.txt” inside that folder.
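To make the "create the folder from the name" step concrete, here is a minimal Python sketch using the standard `zipfile` module. The helper name `extract_by_names` is made up for illustration, and the sketch assumes a trusted archive: it does no sanitizing of entry names.

```python
import io
import os
import tempfile
import zipfile

def extract_by_names(zf, dest):
    """Extract every entry, creating parent folders inferred from the entry names."""
    written = []
    for info in zf.infolist():
        # ZIP entry names use "/" as the separator on every platform.
        target = os.path.join(dest, *info.filename.split("/"))
        if info.is_dir():
            # Explicit folder entry (its name ends with "/").
            os.makedirs(target, exist_ok=True)
            continue
        # No folder entry is required: the parent is implied by the file name.
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with zf.open(info) as src, open(target, "wb") as out:
            out.write(src.read())
        written.append(target)
    return written

# Demo with an in-memory archive holding one nested entry.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("folder/filename.txt", "hello")
with zipfile.ZipFile(buf) as zf, tempfile.TemporaryDirectory() as dest:
    for path in extract_by_names(zf, dest):
        print(os.path.relpath(path, dest))
```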

Fun fact: zip files allow paths like “../../../folder1/folder2/file.extension”, which would allow writing files outside the folder the user specifies. Most decompression programs refuse to write outside the specified destination folder, and either skip the file or “flatten” the path (pretend the “go up one level” notation “..” doesn’t exist).
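One way an extractor could implement the "refuse to write outside the destination" check is sketched below in Python. The function name `safe_target` is hypothetical, and this is only one possible policy (reject the entry); as far as I know, Python's own `ZipFile.extract` takes the other route and strips ".." components instead.

```python
import os
from typing import Optional

def safe_target(dest: str, entry_name: str) -> Optional[str]:
    """Return the resolved output path for entry_name, or None if it escapes dest."""
    dest_root = os.path.realpath(dest)
    # A leading "/" yields an empty first component, which os.path.join ignores,
    # so such names land under dest instead of at the filesystem root.
    candidate = os.path.realpath(os.path.join(dest_root, *entry_name.split("/")))
    try:
        inside = os.path.commonpath([dest_root, candidate]) == dest_root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        inside = False
    return candidate if inside else None
```

An extractor would call this for each entry and skip (or report) the ones that come back as `None`.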

The decompression program will read the records at the end of the file and will usually extract the files sequentially, as the records are read. Some programs may detect when two or more records point to the same file name and skip writing the data for the earlier records to disk, as it would be overwritten later anyway. It’s also possible to have empty spaces in zip archives, as a result of deleting entries from the zip file.
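Both ideas can be sketched in Python. `read_eocd` locates the End of Central Directory record by its signature and unpacks the fixed 22-byte part; `last_record_per_name` keeps only the final record for each name. Both function names are invented here, and the sketch assumes a single-disk, non-ZIP64 archive whose comment does not itself contain the signature bytes.

```python
import struct
import zipfile

# Fixed part of the End of Central Directory record (little-endian, 22 bytes):
# signature, disk no., disk with CD, entries on this disk, total entries,
# CD size, CD offset, comment length.
EOCD = struct.Struct("<4sHHHHIIH")

def read_eocd(data: bytes) -> dict:
    """Find the last EOCD signature in the archive bytes and unpack the record."""
    pos = data.rfind(b"PK\x05\x06")
    if pos < 0:
        raise ValueError("no End of Central Directory record found")
    _sig, _disk, _cd_disk, _n_disk, n_total, cd_size, cd_offset, _clen = \
        EOCD.unpack_from(data, pos)
    return {"entries": n_total, "cd_size": cd_size, "cd_offset": cd_offset}

def last_record_per_name(zf: zipfile.ZipFile) -> list:
    """Keep only the final record for each name, since it would overwrite the rest."""
    latest = {}
    for info in zf.infolist():
        latest[info.filename] = info  # a later record replaces an earlier one
    return list(latest.values())
```

Passing the returned `ZipInfo` objects (rather than names) to `zf.open` or `zf.read` selects those exact records.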

Anonymous 0 Comments

There is no required or even recommended order. The decoder program can decompress files in any order it sees fit. On a multicore CPU, it can even decompress several files at the same time. The ZIP format documentation also doesn’t say whether a file should be written as a single unit or can be written in chunks as decompression progresses; this decision is also left to the decoder.
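As a rough illustration of decompressing several entries at once, here is a Python sketch that hands each entry to a worker thread. The function names are made up, each worker opens its own `ZipFile` over the same bytes to avoid sharing a file position, and whether this is actually faster depends on the interpreter and the archive.

```python
import io
import zipfile
from concurrent.futures import ThreadPoolExecutor

def _read_one(data: bytes, name: str):
    # Each task gets a private ZipFile so threads never share a file position.
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return name, zf.read(name)

def read_all_parallel(data: bytes, workers: int = 4) -> dict:
    """Decompress every file entry of an in-memory archive using a thread pool."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = [i.filename for i in zf.infolist() if not i.is_dir()]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Entries may finish in any order; the result is keyed by name anyway.
        return dict(pool.map(lambda n: _read_one(data, n), names))
```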

However, if the decoder program doesn’t do anything fancy, it is likely to use one of two orders: either it will decompress files in the order they appear in the main area of the ZIP, or (most likely) in the order they appear in the Central Directory section at the end of the ZIP.
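The two candidate orders can be compared in Python. This sketch assumes that `infolist()` follows the Central Directory order (which is how CPython's `zipfile` builds it, as far as I know) and uses each record's `header_offset` to recover the order of the data area. The function names are illustrative only.

```python
import zipfile

def central_directory_order(zf: zipfile.ZipFile) -> list:
    """Entry names in the order the Central Directory lists them."""
    return [info.filename for info in zf.infolist()]

def data_area_order(zf: zipfile.ZipFile) -> list:
    """Entry names sorted by where their local headers sit in the file."""
    return [info.filename
            for info in sorted(zf.infolist(), key=lambda i: i.header_offset)]
```

For archives written in one pass by an ordinary tool the two lists normally match; they can differ after an archive has been edited in place.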

The files are most likely written in chunks of a few kB, which means you can observe partially written files while decompression is in progress.
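Chunked writing looks roughly like this in Python. The 8 kB chunk size and the name `extract_in_chunks` are arbitrary choices for the sketch, not something the ZIP format prescribes.

```python
import zipfile

def extract_in_chunks(zf: zipfile.ZipFile, name: str, target: str,
                      chunk_size: int = 8192) -> int:
    """Stream one entry to disk chunk by chunk; return the number of bytes written."""
    total = 0
    with zf.open(name) as src, open(target, "wb") as out:
        while True:
            chunk = src.read(chunk_size)  # decompresses only what is requested
            if not chunk:
                break
            # Anything looking at the target file now sees a partial file.
            out.write(chunk)
            total += len(chunk)
    return total
```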

Anonymous 0 Comments

Zip files don’t have a tree structure, just a list of entries. Each entry has a filename, which may or may not include subdirectories. For example, if you have a folder inside the zip named “A” which contains two files “b.exe” and “c.txt”, then you’ll simply have two entries named “A/b.exe” and “A/c.txt”. The order of the entries within the zip file is not important: you can have one file from directory A, followed by a file from directory B, then another file from A, and so on.
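To show how a tree falls out of a flat list regardless of order, here is a small Python sketch that folds entry names into nested dictionaries. `build_tree` is a made-up helper; files map to `None` and folders map to a dict.

```python
def build_tree(names):
    """Turn a flat list of "/"-separated entry names into nested dicts."""
    root = {}
    for name in names:
        parts = [p for p in name.split("/") if p]
        if not parts:
            continue  # ignore empty names
        node = root
        for part in parts[:-1]:
            # Create intermediate folders on first sight, reuse them afterwards.
            node = node.setdefault(part, {})
        if name.endswith("/"):
            node.setdefault(parts[-1], {})  # explicit folder entry
        else:
            node[parts[-1]] = None          # file entry
    return root

# Entries from A and B interleaved, as the format allows.
tree = build_tree(["A/b.exe", "B/x.txt", "A/c.txt"])
print(tree)
```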

Most zip programs would just extract the entries sequentially, creating the necessary folders and files as needed.