How does verifying the checksum confirm the integrity of a downloaded file, when it’s posted on the same website the file came from?




It’s there to verify that the file was downloaded correctly. That is, you download the file, calculate the checksum of the downloaded copy, and compare it against the checksum posted on the site. If they match, the file was very likely downloaded correctly. If they don’t, the file was corrupted somewhere along the way.
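To make that concrete, here is a minimal sketch of the check in Python. It assumes the site posts a SHA-256 digest in hex; the function names `sha256_of_file` and `matches_posted_checksum` are made up for this example, and a site that uses a different algorithm would need a different `hashlib` constructor.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at `path`.

    The file is read in chunks so a large download does not have to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_posted_checksum(path, posted_hex):
    """True if the file's digest equals the checksum copied from the site."""
    # Ignore surrounding whitespace and letter case in the pasted value.
    return sha256_of_file(path) == posted_hex.strip().lower()
```

You would call `matches_posted_checksum("installer.iso", "<hex string from the page>")`, where both arguments are placeholders; `False` means the download should be fetched again.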

Do you mean a checksum as /u/nmxt describes, or a cryptographic hash such as SHA-1 or MD5 (the kind of thing md5sum prints)? Are you worried that someone might put up a fake website with a fake download and the hash of that fake download?

It’s not a secret, or an encryption technique that needs to be hidden. It’s just there to help verify you completely downloaded the correct file. It’s hard to spoof, because the checksum isn’t assigned; it is generated from the file itself. So even if someone somehow did manage to replace the file you really wanted, the replacement almost certainly wouldn’t generate the same checksum (I mean, *technically* it could be done, but with a modern cryptographic hash it would be way too damned much work). The catch is that someone who can replace the file on the site can usually replace the posted checksum too.

Checksum functions go through every bit of a file (a bit being one eighth of a byte) and include it in a calculation. With a cryptographic hash, if a single bit is off, the final number will be hugely different. So if your checksum calculation is different from the posted one, something happened to one (or more) of the bits while the file was being transferred and the file has been “corrupted”, which can cause errors of varying severity depending on just what the corruption is.
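A quick way to see the "single bit" effect for yourself is the sketch below. It assumes SHA-256 as the hash; `flip_bit` and `differing_hex_digits` are helper names invented for this example, and the byte string just stands in for a real file.

```python
import hashlib


def flip_bit(data, bit_index):
    """Return a copy of `data` with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def differing_hex_digits(a, b):
    """Count the positions at which two equal-length hex strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Stand-in for a downloaded file, with one bit damaged in transit.
original = b"pretend this is a large installer" * 1000
corrupted = flip_bit(original, 12345)

good = hashlib.sha256(original).hexdigest()
bad = hashlib.sha256(corrupted).hexdigest()

print(good)
print(bad)
print(differing_hex_digits(good, bad), "of", len(good), "hex digits differ")
```

The two digests should look unrelated even though the inputs differ in one bit out of more than a quarter of a million.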

Checksums are posted next to where a file can be downloaded because they are extremely small in comparison to the file. So if somebody is unsure about a file’s integrity, they can just run a local checksum function and compare it to a single number (which will be smaller than even the page hosting it) instead of downloading the entire file again (which can be arbitrarily large).
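As a rough illustration of the size difference, the sketch below hashes inputs of very different sizes and prints the digest length each time. It assumes SHA-256, whose digest is always 32 bytes; the runs of zero bytes are stand-ins for real file contents.

```python
import hashlib

for size in (1, 1_000, 10_000_000):
    data = bytes(size)  # `size` zero bytes standing in for a file
    digest = hashlib.sha256(data).digest()
    print(f"{size:>10} bytes of input -> {len(digest)} byte digest")
```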

Edit: checksums are also useful for confirming that a file is actually what it says it is; renaming a file to make it look like something desired while it is actually a virus is a classic malware attack. It is extremely difficult to change an average file without changing its checksum, so checksum verification is also used to guard against trojan-horse-style file swaps.

(Checksums are also used locally by programs to make sure files haven’t been tampered with. As an example, if a game doesn’t want its users to cheat by editing save files, it can store a checksum and only load save files that verify.)
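Here is a minimal sketch of that save-file idea. The format (digest on the first line, JSON on the rest) and the names `pack_save` and `load_save` are assumptions made up for this example, not how any particular game does it. A plain hash like this only catches casual edits, since anyone who knows the scheme can recompute the digest.

```python
import hashlib
import json


def pack_save(state):
    """Serialize game state and put its SHA-256 digest on the first line."""
    payload = json.dumps(state, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return digest + "\n" + payload


def load_save(text):
    """Return the game state, or raise ValueError if the digest does not match."""
    stored_digest, _, payload = text.partition("\n")
    actual = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    if actual != stored_digest:
        raise ValueError("save file failed checksum verification")
    return json.loads(payload)
```

For instance, `load_save(pack_save({"gold": 10}))` gives back the original state, while changing `10` to `9999` in the saved text makes `load_save` raise.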

A lot of people aren’t getting the crux of your question. As you say, the checksum and the file often come from the same site, and in that case the checksum does nothing to increase trust. However, there are situations where you might retrieve the checksum from a trusted source and the file from elsewhere. If the file is big, you could get it via BitTorrent and then download just the checksum from the trusted site to check that the file wasn’t tampered with. But yes, if you don’t trust the source of the checksum, it means little.