The actual algorithm is detailed here: [https://www.simplilearn.com/tutorials/cyber-security-tutorial/sha-256-algorithm#what_is_the_sha256_algorithm](https://www.simplilearn.com/tutorials/cyber-security-tutorial/sha-256-algorithm#what_is_the_sha256_algorithm)
SHA256 is a hash algorithm. It takes in a string of bits (1’s and 0’s) of any length, which we’ll call the Input, and outputs a string of bits that’s always 256 bits long, which we’ll call the Hash. It has the following properties:
-Any change to Input (even a single bit) is virtually guaranteed to change the Hash in an unpredictable way
-Given just Hash, there’s no practical way to recreate Input
-Given Input, it’s “easy” to calculate Hash
This generates a digital “fingerprint” of Input that’s much smaller than Input but is (to a fairly high level of statistical certainty) “unique” to Input, so if Input changes then Hash changes too. This is handy for verifying if Input got changed somewhere along the way, or for verifying that a person actually knows Input without having to actually share Input.
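If you want to see those properties for yourself, here’s a quick sketch using Python’s standard `hashlib` module. The helper names (`sha256_hex`, `differing_bits`) and the sample text are just made up for this example. It flips a single bit of the input and counts how many of the 256 output bits change.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of data as 64 hex characters (256 bits)."""
    return hashlib.sha256(data).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count how many bits differ between two hex-encoded hashes."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

original = b"The quick brown fox"
# Flip the lowest bit of the first byte; everything else stays the same.
tampered = bytes([original[0] ^ 0x01]) + original[1:]

hash_original = sha256_hex(original)
hash_tampered = sha256_hex(tampered)

print(hash_original)
print(hash_tampered)
print(len(hash_original) * 4, "bits")  # 64 hex characters * 4 bits each
print(differing_bits(hash_original, hash_tampered), "of 256 bits differ")
```

On a typical run roughly half of the 256 bits come out different, which is what “unpredictable” looks like in practice.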
The actual algorithm can be explored at the link above, but in short it “pads” the Input to make it a convenient length, chops it up into blocks, then does math between each block and a set of pre-defined constants (fixed numbers that look “random”), with the output of each block feeding into the next. Mathematically, this “mixes” all the bits from Input together and squashes them down to an end result of 256 bits that all depend on all the original bits in Input.
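To make the “pad and chop” part concrete, here’s a rough sketch of just that step, assuming the standard SHA-256 padding rule (append a single 1 bit, then zeros, then the original length as a 64-bit number, so the total is a multiple of 512 bits). The function names `pad_message` and `split_blocks` are mine, not from any library.

```python
BLOCK_BYTES = 64  # 512 bits

def pad_message(message: bytes) -> bytes:
    """Pad message so its length is a multiple of 64 bytes (512 bits)."""
    bit_length = len(message) * 8
    padded = message + b"\x80"  # a single 1 bit followed by seven 0 bits
    # Add zero bytes until 8 bytes remain free at the end of the last block.
    padded += b"\x00" * ((56 - len(padded)) % BLOCK_BYTES)
    # The last 8 bytes hold the original length in bits, big-endian.
    return padded + bit_length.to_bytes(8, "big")

def split_blocks(padded: bytes) -> list:
    """Chop the padded message into 64-byte blocks."""
    return [padded[i:i + BLOCK_BYTES] for i in range(0, len(padded), BLOCK_BYTES)]

for size in (3, 55, 56, 200):
    blocks = split_blocks(pad_message(b"a" * size))
    print(size, "bytes of input ->", len(blocks), "block(s)")
```

Notice that 55 bytes still fits in one block but 56 bytes spills into a second one, because the padding always needs room for the 1 bit and the 8-byte length.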
Quick and dirty…
It creates a 256-bit hash value for a given input of any size.
This value is effectively “unique” for a given input, and the algorithm is complex enough to make collisions (duplicate hash values for different input data) practically impossible to find. This makes it useful for creating fingerprints of digital files.
The hash is a one-way function, in that it is mathematically impractical to work backwards from the hash value to the original data.
The steps are too involved to get into here, but this site has an overview of how the hash value is created:
https://qvault.io/cryptography/how-sha-2-works-step-by-step-sha-256/
The algorithm itself is maybe ELI18, unless you’re a childhood prodigy. There’s a whole lot of complex math going on. But maybe ELI5 leaving quite a bit out:
Treat the input as binary and pad it until its length is divisible by 512 bits, because we are going to be working with 512-bit chunks of data.
Create some values based on prime numbers.
Copy a chunk into an array and do some math on that using those values to scramble it a bit. Do this 64 times, with each iteration using results from the previous round for its own computations.
Then we do this all again for the next chunk.
The hash is what we get when we have no more chunks to process.
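For anyone who wants the not-quite-ELI5 version, the steps above can be sketched in pure Python. This is a learning sketch, not something to use in real code (use `hashlib` for that). It assumes the standard SHA-256 definition, where the “values based on prime numbers” are the fractional parts of the square roots and cube roots of the first primes. All the function names here are my own, and the last lines check the result against `hashlib`.

```python
import hashlib
import math

MASK = 0xFFFFFFFF  # keep every value inside 32 bits

def first_primes(n):
    """Return the first n prime numbers by trial division."""
    primes, candidate = [], 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def icbrt(n):
    """Integer cube root (floor) by binary search, so no float rounding."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

PRIMES = first_primes(64)
# First 32 fractional bits of the square roots of the first 8 primes.
H0 = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]
# First 32 fractional bits of the cube roots of the first 64 primes.
K = [icbrt(p << 96) & MASK for p in PRIMES]

def rotr(x, n):
    """Rotate a 32-bit value right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK

def sha256(message: bytes) -> str:
    # Step 1: pad until the length is divisible by 512 bits (64 bytes).
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += (len(message) * 8).to_bytes(8, "big")

    state = list(H0)
    for start in range(0, len(padded), 64):
        chunk = padded[start:start + 64]
        # Copy the chunk into an array of 16 words, then stretch it to 64.
        w = [int.from_bytes(chunk[i:i + 4], "big") for i in range(0, 64, 4)]
        for i in range(16, 64):
            s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
            s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
            w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)

        # 64 rounds of scrambling; each round uses the previous round's result.
        a, b, c, d, e, f, g, h = state
        for i in range(64):
            big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
            ch = (e & f) ^ (~e & g)
            t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
            big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            t2 = (big_s0 + maj) & MASK
            a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g

        # Fold this chunk's result into the running state for the next chunk.
        state = [(s + v) & MASK for s, v in zip(state, (a, b, c, d, e, f, g, h))]

    # No more chunks: the state is the hash.
    return b"".join(s.to_bytes(4, "big") for s in state).hex()

print(sha256(b"hello world"))
print(sha256(b"hello world") == hashlib.sha256(b"hello world").hexdigest())
```

The constants are computed with integer roots rather than floating point so there’s no chance of a rounding error sneaking into the last bit.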
The important take home is that every step has a basis in previous steps, so changing one bit anywhere along the line will completely screw up the result as errors compound. And this is what we’re counting on for a hash.
I have hashed video files several hundred megabytes long. Because of this constant iteration, one change to one bit of that video file produces a completely different hash (yes, I tested that just for fun).
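If you want to repeat that experiment, here’s roughly how it could look in Python. It’s a sketch with a few assumptions: a temporary file of random bytes stands in for the video, the helper names `hash_file` and `flip_bit` are made up, and the file is read in 1 MB pieces so a huge file never has to fit in memory.

```python
import hashlib
import os
import tempfile

def hash_file(path, chunk_size=1024 * 1024):
    """Hash a file piece by piece so large files don't need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def flip_bit(path, byte_offset, bit=0):
    """Flip one bit of the file in place at the given byte offset."""
    with open(path, "r+b") as f:
        f.seek(byte_offset)
        value = f.read(1)[0]
        f.seek(byte_offset)
        f.write(bytes([value ^ (1 << bit)]))

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for a video file: 2 MB of random bytes.
    path = os.path.join(tmp, "fake_video.bin")
    with open(path, "wb") as f:
        f.write(os.urandom(2 * 1024 * 1024))

    before = hash_file(path)
    flip_bit(path, byte_offset=1_000_000)
    after = hash_file(path)

print(before)
print(after)
print("hashes match:", before == after)
```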