What is a Hash?

… simply explained

Anthony Albertorio
Published in Coinmonks
2 min read · May 3, 2018

A hash is the result of taking an arbitrary set of data/bits (be it a single character, a video, or the entire Library of Congress) and using an algorithm to transform the input into a fixed-size output of data/bits, usually written as a hexadecimal number.

So, regardless of the amount of information that is fed into the hashing algorithm, you will always get a hash of the same length.

Anything goes in and a fixed-length hash comes out.
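As a rough illustration of the fixed-length property, here is a minimal sketch using SHA-256 from Python's standard `hashlib` module. SHA-256 is just one example algorithm chosen for this sketch, and the helper name `sha256_hex` and the sample inputs are made up for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

# Inputs of very different sizes: one character, a sentence,
# and one million zero bytes (about 1 MB).
inputs = [b"a", b"The quick brown fox jumps over the lazy dog", bytes(1_000_000)]

for data in inputs:
    digest = sha256_hex(data)
    # SHA-256 always yields 256 bits, which is 64 hexadecimal digits.
    print(f"{len(data):>9} bytes in -> {len(digest)} hex digits out: {digest[:16]}...")
```

Whatever goes in, the digest printed here is always 64 hexadecimal digits long, and running the script again gives the same digests for the same inputs.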

In more depth, this algorithm is a one-way function: data goes in on one side, and a seemingly random string of hexadecimal digits comes out on the other. Changing even a single character of the input results in a wildly different output string.
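To see how sensitive the output is to the input, the sketch below hashes two strings that differ in only their last character and counts how many output bits changed. It again assumes SHA-256 as the example algorithm; `differing_bits` is a hypothetical helper written for this illustration.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

first = hashlib.sha256(b"hello world").digest()
second = hashlib.sha256(b"hello worle").digest()  # only the last character changed

changed = differing_bits(first, second)
print(first.hex())
print(second.hex())
print(f"{changed} of {len(first) * 8} output bits differ")
```

For a well-designed hash, a tiny change like this typically flips somewhere around half of the output bits, so the two hex strings look unrelated.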

It’s very unlikely that you can guess the output of a hash from its input without actually running the algorithm. It’s even less likely that you can work out the input from the fixed-size output.

The same data will always produce the same hash. However, different data can also produce the same hash: when two different inputs have the same hash, this is called a collision. Because there are far more possible inputs than possible outputs, collisions are unavoidable.

Not enough room for all the pigeons.

Why do collisions happen? This is explained by the pigeonhole principle. Simply put, the pigeonhole principle describes what happens when you try to place n + 1 items (or pigeons) into n spaces (pigeonholes): there will be at least one space that holds 2 or more items. Try doing this with 3 crumpled-up pieces of paper (n + 1) and 2 wastebaskets (n). You will find that at least one wastebasket ends up with two or more pieces of paper.
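The pigeonhole argument can be tried out in code with a deliberately tiny hash. The sketch below assumes a toy 8-bit hash (the first byte of SHA-256), which has only 256 possible outputs, so feeding it 257 distinct inputs must produce a collision. The names `tiny_hash` and `first_collision` are invented for this example.

```python
import hashlib

def tiny_hash(data: bytes) -> int:
    """Toy 8-bit hash: the first byte of SHA-256, so only 256 'pigeonholes' exist."""
    return hashlib.sha256(data).digest()[0]

def first_collision(limit: int = 257):
    """Hash the inputs b'0', b'1', ... and return the first pair sharing a tiny hash."""
    seen = {}  # tiny hash value -> the input that produced it
    for i in range(limit):
        data = str(i).encode()
        h = tiny_hash(data)
        if h in seen:
            return seen[h], data, h
        seen[h] = data
    return None  # cannot happen when limit is greater than 256

a, b, h = first_collision()
print(f"{a!r} and {b!r} both map to {h:#04x}")
```

In practice the first collision shows up long before all 256 holes are filled; the principle only guarantees that it cannot be avoided once the inputs outnumber the holes.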

So, every hash function has collisions. However, good hashing algorithms make such collisions hard to find, a property known as collision resistance. Collision resistance does not mean the absence of collisions, just that they are hard to find.

Some people try to break hashing algorithms with brute-force attacks, in which a computer makes guesses very quickly. An attack that searches for two inputs with the same hash is called a hash collision attack. As computing gets faster and cheaper, older hashing algorithms become more likely to be attacked successfully, as happened in 2017 when researchers at Google and CWI Amsterdam published a collision for the SHA-1 hashing algorithm.
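A brute-force collision search can be sketched on a deliberately weakened hash. The example below assumes SHA-256 truncated to a small number of bits (a stand-in for a weak hash, not a real-world algorithm) and simply keeps guessing until two inputs collide. The function names and the bit sizes are choices made for this illustration.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, bits: int) -> int:
    """Keep only the top `bits` bits of SHA-256 to simulate a weaker hash."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - bits)

def brute_force_collision(bits: int):
    """Try inputs b'1', b'2', ... until two of them share a truncated hash.

    Returns the colliding pair and the number of guesses made.
    """
    seen = {}  # truncated hash value -> the input that produced it
    for guesses in count(1):
        data = str(guesses).encode()
        h = truncated_hash(data, bits)
        if h in seen:
            return seen[h], data, guesses
        seen[h] = data

for bits in (8, 16, 24):
    a, b, guesses = brute_force_collision(bits)
    print(f"{bits:>2}-bit hash: collision after {guesses} guesses ({a!r} vs {b!r})")
```

The number of guesses needed grows roughly with the square root of the number of possible outputs, so each extra bit of output makes the search noticeably more expensive. At the full 256 bits of SHA-256, this kind of guessing is far out of reach for today's computers.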

Anthony Albertorio
Coinmonks

Community Builder at ConsenSys. Blockchain Dev + Organizer of meetup.com/EthBuilders ♢Albertorio.eth