How does verifying the checksum confirm the integrity of a downloaded file, when it’s posted on the same website the file came from?


20 Answers

Anonymous 0 Comments

There are a few situations where the checksum can help:

1. Sometimes to save on hosting/bandwidth costs, the download link will actually pull the file from a different website not under direct control of the website author. For example, a number of universities will host the files for Linux distributions. Then the website for the Linux distribution will randomly select from the list of known hosts.
2. It allows you to verify that the file was downloaded correctly. Whenever you have an application that downloads a file in parts and stitches them back together, there’s a chance that something goes wrong and some data is out of order at the stitching points. Thus the checksums are very helpful if someone is using a download manager that resumes a download when the connection is interrupted. BitTorrent clients are supposed to automatically verify the checksums. But I’ve occasionally had a Linux distro fail its checksum after download.
3. In a few cases, the checksums are digitally signed. This has the added advantage that even if a hacker takes over the website, they won’t be able to produce a validly signed checksum file without the publisher’s private key. So anyone who goes through the effort of verifying the signature will realize that the download has been tampered with.
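
As a rough sketch of point 2, checking a download against a published checksum could look like this in Python. The choice of SHA-256, the function names `sha256_of_file` and `matches_published`, and the 64 KiB chunk size are assumptions for illustration; a given site may publish a different algorithm.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large ISO never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare the computed digest with the published one.

    Surrounding whitespace and letter case in the published value are ignored.
    """
    return sha256_of_file(path) == published_hex.strip().lower()
```

If `matches_published` returns `False`, the file on disk is not the file the checksum was computed from, whether the cause is a bad mirror, a broken resume, or tampering.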

Anonymous 0 Comments

If you are concerned about accidental corruption, it is HIGHLY unlikely that a large downloaded file will be corrupted in a way that happens to match a checksum on a webpage or in a checksum file that was also corrupted.

Alternatively, a checksum on a trusted site allows you to validate the authenticity of a file procured from an untrusted source (e.g., pulling down a CentOS 7 ISO via BitTorrent and ensuring it is unmodified).

If you are protecting against something like a site hack where someone could modify the file AND the checksum at the same time, you get into the zone of digital signatures / code signing, where the workflow keeps the private key needed to create the signature off the web server. Someone hacking the site could replace the file and its signature, but the new signature would not be trusted since it was not made with the correct key.

Anonymous 0 Comments

A checksum is a fingerprint. Comparing the checksum published on the website with the checksum of the file you downloaded is a decent and very simple form of file authentication.
If both are the same, you got the original file; if they differ, you got a modified file.
But again, the reliability of the checksum is only as great as the reliability of the website.

Anonymous 0 Comments

It’s just to protect against someone swapping the file in transit between the website and your PC (a so-called man-in-the-middle attack).

Anonymous 0 Comments

It only verifies that the contents of the file you end up with on your hard drive are the same as those of the file the publisher used to create the checksum. You still can’t trust the file any more than you trust the publisher.

Also there is such a thing as hash collisions. Basically this means that multiple different files can lead to the same checksum. In principle, you can add a nefarious payload to a file and then modify or pad the remainder of the file so that the resulting checksum is the same. It’s pretty difficult to do, and how feasible it is depends on the algorithm: collisions have been demonstrated for older ones such as MD5 and SHA-1, while no practical attack is known for SHA-256.
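
To see why the choice of algorithm matters, here is a small sketch using a deliberately weak checksum. The `additive_checksum` function and the two messages are made up for illustration and are not what any real site publishes.

```python
import hashlib


def additive_checksum(data):
    """Toy checksum: sum of all byte values, modulo 256."""
    return sum(data) % 256


original = b"pay alice 10"
tampered = b"pay alice 01"  # same bytes, last two digits swapped

# The toy checksum cannot tell the two apart: addition ignores byte order.
same_toy = additive_checksum(original) == additive_checksum(tampered)

# A cryptographic hash gives different digests for the two inputs.
same_sha = hashlib.sha256(original).digest() == hashlib.sha256(tampered).digest()

print(same_toy, same_sha)  # True False
```

With a weak algorithm, an attacker only needs to rearrange or pad bytes to keep the checksum unchanged; a cryptographic hash is designed to make that impractical.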

Anonymous 0 Comments

The checksum is the result of a mathematical calculation performed on 1’s and 0’s of a file. Wherever that math is performed on that file, it should give you the same checksum value. Typically, this is used to make sure that the file you copied from Point A to Point B made it there intact.

If the question is about the integrity of the downloaded file: they tell you that the checksum will be X, and if you get X after downloading, you got the file without any corruption.

If the question is about whether the file has malicious content…that’s a different story. On a pirate site, you could find game.exe that contains a pirated version of the game you want. There can be hundreds of copies of it available for download. You see the 5 most downloaded all have different checksums. Two of them might be clean, but were “packaged” differently, and therefore with slightly different compression, they get a different checksum. One of them might be different because they only included the important stuff, and left out the nonsense to make the file smaller. The last two contain two different forms of malware.

They all can be verified using the checksums. When you download, you’ll get what you ordered if the checksums are correct…malware and all.

When I…considered pirating a long time ago with LimeWire but never would have actually done it…checksums were a way to determine the “most likely to be clean” version. The more seeds that contained the same .exe with the same checksum…the more people got the software, used it, and seeded to others. If it contained malware, most folks would delete it before (or after) opening.

TLDR: checksums are only to tell you exactly what a file will/should contain as a result of that math problem. Their value is less about verifying the download, and more about validating that you’re getting ONLY that file, and not some unwanted tagalong software in that same file. If the checksum for the software is the same in all locations, you can be certain that it hasn’t been tampered with. If the checksum doesn’t match what it is supposed to be, then you don’t want to download/open it.

TLDRTLDR: A checksum from website A should be the same checksum for software from website B. If they are different, then one of them has likely been naughty. The checksum that is the same from the most locations, is likely to be the “clean” one.
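
The “same checksum from the most locations” idea can be sketched as a simple majority count. The mirror names and shortened checksum values below are invented for illustration, and a majority is only a heuristic, not proof that a file is clean.

```python
from collections import Counter


def most_common_checksum(checksums_by_source):
    """Return (checksum, count) for the value reported by the most sources."""
    counts = Counter(checksums_by_source.values())
    return counts.most_common(1)[0]


# Hypothetical mirrors and shortened, made-up checksum values.
reported = {
    "mirror-a": "9f2c",
    "mirror-b": "9f2c",
    "mirror-c": "41d0",
    "mirror-d": "9f2c",
}

checksum, count = most_common_checksum(reported)
# Sources that disagree with the majority deserve suspicion.
outliers = sorted(name for name, value in reported.items() if value != checksum)
print(checksum, count, outliers)  # 9f2c 3 ['mirror-c']
```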

Anonymous 0 Comments

Supply chain attacks can happen. A hacker may not be able to get to the website, but a malicious FTP redirect can cause someone to download a file they think is correct when it was actually downloaded from the wrong site.

This has been done with code repositories. A hacker may not be able to duplicate the website, but they may be able to redirect the upload/download to steal code. I think it was the case in the Kaseya (or was it Java, I forget) hack. Malicious actors were able to redirect downloads to push a false update. If someone took the time to look at the webpage, the hash there would be different from the hash of the code they downloaded. Of course, that would require a manual update rather than an automatic one.

Anonymous 0 Comments

Checksums don’t protect you from a malicious website host; they protect you from people who hack the website host, or from transmission errors. I have a file that I want to share. I want to make sure everyone who downloads it can verify that they got the same file I put up there: no corruption during the download, and not replaced with some virus-laden copy by a bad actor.

So I take the file, and run a program that reads through and creates a number – the checksum – from the bytes of my file. I post the checksum next to the download, so that someone can download the file, run it through the same program, and make sure they get the same number out. If their program gives the same checksum as what I posted, they can feel confident that they have the same file that they’re supposed to have.

For a super simple example of how the checksum part works, let’s use a very short text file and a very simple checksum algorithm. We’ll add all the bytes and look at the last digit of that sum.

File: `74 68 69 73 20 69 73 20 61 20 74 65 73 74`, Checksum: 3

If you sum all those numbers (reading each one as an ordinary decimal number), you get 833. So the Checksum is 3. If someone downloads the file from my site and their program adds the bytes and it ends in 4, then _something_ went wrong.

Checksums get more complicated, of course; that math is far too simple to use for important stuff. But the idea is basically the same regardless of exactly how you calculate them.
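
Here is the same arithmetic as a short Python sketch. The values look like the hex bytes of the text "this is a test", but the total of 833 comes from adding them as ordinary decimal numbers, so the sketch does the same. The function name `last_digit_checksum` is made up for this example.

```python
# The numbers exactly as written in the example, read as plain decimal values.
file_values = [74, 68, 69, 73, 20, 69, 73, 20, 61, 20, 74, 65, 73, 74]


def last_digit_checksum(values):
    """Toy checksum from the example: add everything up, keep the last digit."""
    return sum(values) % 10


total = sum(file_values)                     # 833
checksum = last_digit_checksum(file_values)  # 3

# Simulate one value being damaged on the receiving side.
damaged = list(file_values)
damaged[0] += 1
print(total, checksum, last_digit_checksum(damaged))  # 833 3 4
```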

Anonymous 0 Comments

You’re asking how to secure trust in a zero trust environment, and I don’t know how to ELI5 it.

You trust someone. And that someone provides you with the checksum of the file. So now you’re able to use that checksum to verify your trust in a file shared by anyone else.

If you don’t trust the person offering the checksum, then all it does is confirm that you got the same version as they had. No download glitch. No technical errors when sending the file to you. But still not a file you can clearly trust.

Anonymous 0 Comments

A checksum checks for errors during file transfer. Unfortunately, every once in a while there is a bit flip during a packet transfer. Now, each packet has its own checksum, but with a teeny tiny probability an error slips past it undetected. This is why, especially for big files, you also get a checksum for the whole file.

Finally, you can use checksums to ensure that you got the right file. For example, some install scripts have parts that work like “first download this file, and before we try to do anything with it, we check its checksum against the value that we got from the file that worked during my own test”. In that case the checksum is embedded in the install script.
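
To illustrate how a per-packet checksum can miss an error, here is a sketch of a 16-bit ones' complement checksum in the style of RFC 1071, the kind used by TCP and UDP. The sample packet contents are made up, and real stacks compute this over headers as well as data.

```python
def internet_checksum(data):
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


packet = b"ABCDEFGH"
swapped = b"CDABEFGH"                        # first two 16-bit words exchanged
flipped = bytes([packet[0] ^ 1]) + packet[1:]  # a single bit flipped

# Reordered words go unnoticed, because addition does not care about order.
print(internet_checksum(packet) == internet_checksum(swapped))  # True
# A single flipped bit is caught.
print(internet_checksum(packet) == internet_checksum(flipped))  # False
```

A whole-file cryptographic hash catches both kinds of damage, which is one reason a separate file checksum is still useful.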
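
A minimal sketch of that pattern, assuming SHA-256 and an in-memory payload: the names `PINNED_SHA256`, `ChecksumMismatch`, and `verify_before_use` are hypothetical, and a real script would hard-code the hex string recorded during the author's own test.

```python
import hashlib

# Hypothetical payload standing in for the file the author tested.
KNOWN_GOOD_PAYLOAD = b"#!/bin/sh\necho installing\n"

# In a real script this would be a literal hex string; it is computed here
# only so that the sketch stays runnable on its own.
PINNED_SHA256 = hashlib.sha256(KNOWN_GOOD_PAYLOAD).hexdigest()


class ChecksumMismatch(Exception):
    """Raised when downloaded bytes do not match the pinned checksum."""


def verify_before_use(downloaded, pinned_hex=PINNED_SHA256):
    """Return the bytes unchanged if they match the pinned digest, else raise."""
    actual = hashlib.sha256(downloaded).hexdigest()
    if actual != pinned_hex:
        raise ChecksumMismatch(f"expected {pinned_hex}, got {actual}")
    return downloaded
```

The script would call `verify_before_use` on the downloaded bytes and stop at the exception instead of running a file it has never seen before.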