What is hashing and when do you use it?



It’s different than regular encryption, right?

In: Technology

Hashing is a bit like encryption, except it’s irreversible: there is no key that undoes it. Once you hash something, you can’t go back. In comparison, you can encrypt a message, send it to someone, and (assuming they have the key) they can decrypt it to read the original message.

The most obvious use-case for hashing is for password storage. Reddit doesn’t actually know your password; it knows a *hash* of your password. When you enter your password, Reddit will hash it and check if it matches their stored hash. The benefit here is that Reddit has none of the liability associated with knowing your password, and if there’s a database breach, you (should) still be safe. The hackers would know your hash but not your password (remember: you can’t go backwards!).
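As a rough sketch of that login check in Python: the names `stored_hashes`, `register`, and `login` are made up for illustration, and plain SHA-256 is only an assumption to keep things short. Real sites use a slow, salted password hash instead.

```python
import hashlib

# Hypothetical in-memory "database": username -> hash of the password.
stored_hashes = {}

def hash_password(password: str) -> str:
    # SHA-256 is assumed here for illustration; it is too fast for real password storage.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    # Only the hash is stored; the password itself is never kept.
    stored_hashes[username] = hash_password(password)

def login(username: str, password: str) -> bool:
    # Hash the attempt and compare it with the stored hash.
    return stored_hashes.get(username) == hash_password(password)

register("alice", "correct horse")
print(login("alice", "correct horse"))  # True
print(login("alice", "wrong guess"))    # False
```

Note that `stored_hashes` never contains the text "correct horse", only its 64-character hex digest.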

Another use-case is verifying data integrity. If I have a file, and you want to know if it’s changed since the last time you looked at it, we can both hash the file and compare. That way I don’t have to send you the entire file over a slow internet connection.
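Here’s a minimal sketch of that file comparison in Python. The helper name `file_digest` and the choice of SHA-256 are assumptions for the example; the demo writes a throwaway file so it can run anywhere.

```python
import hashlib
import os
import tempfile

def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read 64 KiB at a time so large files do not have to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo: hash a file, change it, hash it again.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.txt")
    with open(path, "wb") as f:
        f.write(b"hello world\n")
    before = file_digest(path)
    with open(path, "ab") as f:
        f.write(b"one more line\n")
    after = file_digest(path)

print(before == after)  # False: the file changed, so the hashes differ
```

Each of us only has to send the 64-character digest, not the file.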

Use cases for hashing go on and on, but I’ll leave it there.

Hashing is the process of turning an arbitrary input into a structured output, usually of fixed length. For example, a function that takes some text (of any length) and maps it to a 3-digit number is a hash function. Many such hashes exist; some are better and some are worse.
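To make the 3-digit example concrete, here’s a toy sketch in Python. The function `three_digit_hash` and its multiplier 31 are invented for illustration; this is one of the "worse" hashes and not something to use for security.

```python
def three_digit_hash(text: str) -> int:
    """Toy hash: map text of any length to a number from 0 to 999."""
    total = 0
    for ch in text:
        # Mix in each character; 31 is an arbitrary small multiplier.
        total = (total * 31 + ord(ch)) % 1000
    return total

print(three_digit_hash("hello"))
print(three_digit_hash("a much longer piece of text, still at most three digits"))
```

Whatever the input length, the output always lands in the same fixed range, and the same text always gives the same number.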

Hashing is not the same thing as encryption, but the two are related. Many parts of cryptography use hashes (and the properties of good hashes) to ensure security.

You grab some taters and start hashin’, then fry it up in a pan and you’ve got yourself some nice hot hash browns, my lad! 😂

Easiest example of hashing is keeping track of these 5 numbers:




By just remembering which digit they end in. 5, 3, 4. Those hashes are easier to keep track of. If you get a hash of a new number that’s 6, you know it’s not a match of anything else you have. It loses data and you can’t get it back so it’s really not encryption.

That hash algo of “what’s the last digit?” is lame and collisions are abundant. A good algo makes collisions improbable and makes it very difficult to construct an input that produces a chosen target hash.
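A quick sketch of the last-digit hash in Python. The numbers in `known` are made up for this example (they are not the ones from the comment above); they were picked only so that they end in 5, 3, and 4.

```python
def last_digit_hash(n: int) -> int:
    # The "lame" hash: keep only the last decimal digit.
    return abs(n) % 10

known = [1235, 873, 94]  # hypothetical numbers, chosen for illustration
known_hashes = {last_digit_hash(n) for n in known}  # {5, 3, 4}

# A hash of 6 matches nothing we have, so 416 is definitely new.
print(last_digit_hash(416) in known_hashes)   # False

# 2025 is not in the list either, but it hashes to 5 just like 1235: a collision.
print(last_digit_hash(2025) in known_hashes)  # True
print(2025 in known)                          # False
```

A mismatch proves the number is new, but a match only means "maybe", which is why a hash with abundant collisions is weak.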

They’re used to verify nothing changed in big files, to identify duplicates, and in databases.

A hash function just takes an input and sorts it into one of several boxes. It picks the box arbitrarily, but it should always pick the same box for the same input. Ideally you want at most one input per box; two or more inputs landing in the same box is called a “hash collision”.

So to minimize the chance of hash collisions without making the hash function complex and/or predictable, you just increase the number of boxes. A 256-bit hash has 2²⁵⁶ boxes. Since 2¹⁰ = 1024 ≈ 10³, and multiplying powers of the same base means adding their exponents, we can write that as 2²⁵⁰ × 2⁶ = (2¹⁰)²⁵ × 2⁶ ≈ (10³)²⁵ × 2⁶ = 10⁷⁵ × 2⁶ = 64 × 10⁷⁵ = 6.4 × 10⁷⁶. Rounding 1024 down to 1000 underestimates a little; the exact value is about 1.16 × 10⁷⁷.
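Python’s integers have no size limit, so the back-of-the-envelope estimate is easy to check. This is just a sanity check of the arithmetic, with variable names chosen for the example:

```python
exact = 2 ** 256          # number of possible 256-bit values
estimate = 64 * 10 ** 75  # the rough estimate, 6.4 * 10^76

print(exact)
print(len(str(exact)))    # 78 decimal digits, so the value is on the order of 10^77
print(exact / estimate)   # about 1.8: the estimate is low by less than a factor of 2
```

The gap comes from treating 1024 as 1000 twenty-five times over, since 1.024²⁵ is roughly 1.8.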

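So that’s an astronomically large number of options, only a few orders of magnitude short of common estimates for the number of atoms in the observable universe (around 10⁸⁰), and it stays enormous even if you divide it by the ~65,000 symbols in Unicode’s basic plane.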

Though while 256 bits give you a lot of options, they don’t hold a lot of information: it’s the answers to 256 yes-or-no questions, and the questions themselves have to be sent separately. That’s not much. So in terms of encryption it’s not all that useful, as you’d maybe write a novel as input and receive only a short string of 256 ones and zeros as output. The other way around would be more useful.

Provide something like a password or phrase and get access to a whole novel. And there you have symmetric and asymmetric encryption methods. Without going into too much detail, symmetric encryption uses the same key for encryption and decryption. Think of a safe: you open it, deposit stuff, and close it. The key to open (decrypt) and to close (encrypt) is the same, so it’s symmetric. Asymmetric encryption, on the other hand, is more like a locked mailbox, where you have a public key (your mailbox) and a private key (the key to your mailbox). Encryption is people throwing their mail into your mailbox, after which neither they nor anybody else can reach it without opening the box. Decryption is using your key to open the mailbox and take out the mail.

That way encryption and decryption use different keys, hence asymmetric. This is useful if you have to send keys over insecure channels, because here the public key can very much be public: it doesn’t matter if people know where your mailbox is as long as they don’t have the key to open it (assuming it’s otherwise unbreakable).
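To make the safe and the mailbox a bit more concrete, here’s a toy sketch in Python. Both halves are illustrations only and not secure: the symmetric part is a homemade XOR cipher (`xor_cipher` is a made-up helper), and the asymmetric part is textbook RSA with tiny primes chosen for the example.

```python
import hashlib

# --- Symmetric (the safe): the same key locks and unlocks ---
def xor_cipher(data: bytes, key: bytes) -> bytes:
    # Stretch the key into a keystream by hashing key + counter (toy construction).
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    # XOR each data byte with a keystream byte; doing it twice restores the input.
    return bytes(d ^ s for d, s in zip(data, stream))

locked = xor_cipher(b"a whole novel", b"my passphrase")
print(xor_cipher(locked, b"my passphrase"))  # b'a whole novel'

# --- Asymmetric (the mailbox): textbook RSA with tiny, insecure primes ---
p, q = 61, 53
n = p * q                  # shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent: the mail slot anyone can use
d = pow(e, -1, phi)        # private exponent: the mailbox key (needs Python 3.8+)

message = 65                        # must be smaller than n
ciphertext = pow(message, e, n)     # anyone can encrypt with the public key (e, n)
recovered = pow(ciphertext, d, n)   # only the holder of d can decrypt
print(recovered == message)         # True
```

In the first half one passphrase does both jobs; in the second, publishing `e` and `n` lets anyone drop mail in, while only `d` opens the box.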

But back to hashes. They’re not much use for encrypting information, since you can’t get the input back, but they protect passwords quite neatly: a site can check your password without ever storing the password itself. You just run the password through a hash function and store the hash. If you enter the same password again, its hash will match the one in the database; if it doesn’t match, the password was wrong. So if the password database is stolen, that information will be next to useless. And if you want to go a step further you can “salt” the hash, that is, add some random noise to the password before hashing, so that two people with the same password end up with different hashes and a precomputed table of common-password hashes no longer helps an attacker.
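A minimal sketch of salted password hashing using Python’s standard library. The helper names `make_record` and `check`, the 16-byte salt, and the iteration count are assumptions for the example, not a recommendation for any particular site.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; higher means slower guessing

def make_record(password: str):
    # A fresh random salt per user; it is stored next to the hash, not kept secret.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def check(password: str, salt: bytes, digest: bytes) -> bool:
    # Re-hash the attempt with the stored salt and compare in constant time.
    attempt = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(attempt, digest)

salt_a, hash_a = make_record("hunter2")
salt_b, hash_b = make_record("hunter2")

print(hash_a == hash_b)                  # False: same password, different salts
print(check("hunter2", salt_a, hash_a))  # True
print(check("hunter3", salt_a, hash_a))  # False
```

Because every user gets a different salt, an attacker has to attack each stored hash separately instead of reusing one table for everyone.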