What is a Cryptographic Hash Function?

A cryptographic hash function is an algorithm with two main properties: it is a one-way function and it is collision-resistant. Like any function, it maps each input to exactly one output. By one-way function, we mean that it is computationally infeasible to find an input that produces a given output. For a well-designed function, the best known attack is brute force, which is infeasible because of the number of candidate inputs to test. By collision-resistant (often loosely called collision-free), we mean that it is computationally infeasible to find two different inputs that generate the same output. Collisions must exist, because the input can be of any size while the output has a fixed size, but finding one should be impractical.

The main use of cryptographic hash functions is to determine if certain data is original (unmodified) or not.

Keep reading to find out more applications and how you can easily use them in the Python programming language and the terminal/console/command line.

How does a hash function work?
Security Requirements for Cryptographic Hash Functions
Types of cryptographic hash functions
Cryptographic Hash function examples
Applications of cryptographic hash functions
- How does it work?
- Cryptography Hash function application in Bitcoin
Cryptographic Hash Functions in Python
Creating message digests from the console/terminal
Summary

How does a hash function work?

In general terms, a hash function takes a message as input and produces a hash value (also called a message digest) as output.

Notice that the input can be of any size. However, the output will be of a fixed size, no matter what the size of the input is.

The output size depends on the algorithm used. For instance, SHA-1 produces a 160-bit output and SHA-256 produces a 256-bit output.
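The fixed output size can be checked with a short sketch like the one below. It uses Python's standard hashlib module; the sample messages are arbitrary choices made for illustration.

```python
import hashlib

# Inputs of very different lengths (arbitrary sample data).
messages = [b"", b"a", b"a" * 1_000_000]

for name in ("sha1", "sha256"):
    for msg in messages:
        digest = hashlib.new(name, msg).digest()
        # len(digest) is in bytes, so multiply by 8 to get bits.
        print(name, len(msg), "bytes in ->", len(digest) * 8, "bits out")
```

Whatever the input length, each algorithm should report the same number of output bits on every line.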

Security Requirements for Cryptographic Hash Functions

Before discussing some examples, we first need to know the security requirements for cryptographic hash functions. Notice that this type of hash function has more requirements than other types of hash functions.

The security requirements for a cryptographic hash function HF are:

Variable input size: You can apply HF to an input of any size.
Fixed output size: HF will always have the same output size.
Efficient: HF is easy to calculate.
Pre-image resistant: for any given output y, it is computationally infeasible to find an input x such that HF(x) = y.
Second pre-image resistant: for a given input x, it is computationally infeasible to find another input y (different from x) such that HF(x) = HF(y).
Collision resistant: it is computationally infeasible to find any two different inputs x and y such that HF(x) = HF(y).
Pseudorandomness: the output of HF passes standard tests for pseudorandomness.

The goal of these requirements is to make it computationally infeasible to break the function, or to tamper with hashed data without being detected.
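To see why "computationally infeasible" depends on the output size, here is a minimal sketch that deliberately weakens SHA-256 by keeping only its first two bytes. The function truncated_hash is a hypothetical construction for this illustration only, not a real algorithm. With just 16 bits of output, a brute-force search finds a collision almost immediately.

```python
import hashlib

def truncated_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """First n_bytes of SHA-256: a deliberately weak, hypothetical hash."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 2):
    """Brute-force two different inputs with the same truncated hash."""
    seen = {}  # maps truncated hash -> the input that produced it
    counter = 0
    while True:
        msg = str(counter).encode()
        h = truncated_hash(msg, n_bytes)
        if h in seen:
            return seen[h], msg
        seen[h] = msg
        counter += 1

x, y = find_collision()
print(x, y, truncated_hash(x).hex())
```

With the full 256-bit output, the same search would need on the order of 2^128 attempts, which is why it is considered infeasible.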

Types of cryptographic hash functions

There are two types of cryptographic hash functions:

keyed: This type of function uses a secret key to create the hash value. An example is HMAC, which is used as a message authentication code (MAC).
unkeyed: Hash functions that take only the data as input and return the hash value. Examples are the SHA-2 and SHA-3 families.
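The difference between the two types can be sketched with Python's standard hashlib and hmac modules. The message and both keys below are made-up placeholders.

```python
import hashlib
import hmac

message = b"transfer 100 coins to Alice"   # placeholder message
key = b"hypothetical-shared-secret"        # placeholder key, for illustration only

# Unkeyed: anyone who has the message can compute this value.
unkeyed = hashlib.sha256(message).hexdigest()

# Keyed: only someone who also knows the key can compute this value.
keyed = hmac.new(key, message, hashlib.sha256).hexdigest()
other_key = hmac.new(b"another-key", message, hashlib.sha256).hexdigest()

print("unkeyed SHA-256 :", unkeyed)
print("HMAC-SHA256     :", keyed)
print("HMAC, other key :", other_key)
```

The same message gives a different HMAC value for each key, while the unkeyed digest depends only on the message.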

Cryptographic Hash function examples

MD5 and SHA (also known as SHA-0) are no longer considered cryptographic hash functions because both have been broken. Neither complies with the collision-resistance security requirement.

SHA-1 is similar in design to MD5 and SHA-0. A practical collision for SHA-1 was demonstrated in 2017, and the algorithm has been phased out.

The SHA-2 family of algorithms was released as part of a Federal Information Processing Standard (FIPS). These algorithms are approved hash algorithms and are safe to use.

Currently, the SHA-3 family of algorithms is the newest standard released by NIST. This family comprises four approved hash algorithms (SHA3-224, SHA3-256, SHA3-384, and SHA3-512) and two "extendable-output" functions (XOFs): SHAKE128 and SHAKE256. According to NIST, the XOFs are recommended subject to additional security considerations.

Notice that there are many other hash functions. Some of them, like MD5 and SHA-0, are no longer considered cryptographic hash functions because they do not meet all the security requirements.

Applications of cryptographic hash functions

Cryptographic hash functions have several applications. The most popular ones are message authentication and digital signatures.

We can use hash functions to determine whether a message has been altered, in other words, to check the message integrity. This mechanism is called message authentication in cryptography. It checks whether the message (data, file, etc.) received is exactly what the sender sent. With this type of check in place, we can detect tampering such as a man-in-the-middle attack, provided the attacker cannot also replace the hash value (for example, because a keyed hash such as HMAC is used). When a cryptographic hash function is used for message authentication, the hash value is also called a message digest.

How does it work?

The sender creates a message and calculates its hash value (message digest). After that, the sender sends both the message and the message digest. Then, the receiver calculates the hash of the received message and compares it with the received digest. If they differ, then the message or the message digest was modified during transmission.
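The steps above can be sketched in a few lines of Python. The function names make_digest and is_unmodified are hypothetical helpers chosen for this example, and SHA-256 is an assumed choice of algorithm.

```python
import hashlib
import hmac

def make_digest(message: bytes) -> str:
    """Sender side: compute the message digest (SHA-256 assumed)."""
    return hashlib.sha256(message).hexdigest()

def is_unmodified(message: bytes, received_digest: str) -> bool:
    """Receiver side: recompute the digest and compare it with the one received."""
    # compare_digest performs the comparison in constant time.
    return hmac.compare_digest(make_digest(message), received_digest)

sent_message = b"Meet at noon"
sent_digest = make_digest(sent_message)

print(is_unmodified(sent_message, sent_digest))      # message arrived intact
print(is_unmodified(b"Meet at nine", sent_digest))   # message was altered in transit
```

The first check should print True and the second False. Note that this sketch uses an unkeyed hash, so it only helps when the digest itself reaches the receiver through a channel the attacker cannot modify.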

In the case of digital signature, cryptographic hash functions are used to verify if the signature is valid or not.

A sender calculates a hash of a message, encrypts the hash with his or her private key, and sends it along with the message. The receiver calculates the hash of the message on one side, then decrypts the encrypted hash using the sender's public key and compares the two hashes. If they are equal, the signature is valid; otherwise, it is not.
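The sign-and-verify flow can be illustrated with a toy sketch based on textbook RSA. Everything here is an assumption made for illustration: the key is built from two small, publicly known Mersenne primes, there is no padding, and the hash is simply reduced modulo N. It is far too weak for real use, where a vetted cryptography library should be used instead.

```python
import hashlib

# Toy textbook-RSA key built from two known Mersenne primes (NOT secure).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                             # public modulus
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent (needs Python 3.8+)

def message_hash(message: bytes) -> int:
    """SHA-256 of the message as an integer, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Sender: 'encrypt' the hash with the private key."""
    return pow(message_hash(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Receiver: 'decrypt' with the public key and compare the two hashes."""
    return pow(signature, E, N) == message_hash(message)

signature = sign(b"I owe you 10 coins")
print(verify(b"I owe you 10 coins", signature))
print(verify(b"I owe you 99 coins", signature))
```

The first verification should succeed and the second should fail, because the modified message hashes to a different value.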

There are also other examples where the use of a hash value is relevant.

One example is the one-way password file. Usually, operating systems store the hash of each user's password in a one-way password file. When the user logs in, the operating system calculates the hash of the password entered and compares it with the stored one. If they match, the user is authenticated. The important thing here is that even an administrator who can access that file cannot learn what your password is.

This approach is also used in databases that store users' credentials. If you store the hash of the user's password, the database admin cannot see what the password is.

However, if you just store the plain-text password in the database, everyone with access to the database can see it. This is a serious security weakness in your system.
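Below is a minimal sketch of storing and checking a password hash. It goes one step beyond the plain hash described above by adding a random salt and using PBKDF2 from Python's hashlib, which is common practice for password storage. The function names and the iteration count are assumptions chosen for this example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value, not a recommendation

def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest); only these two values are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def check_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Recompute the digest at login time and compare it with the stored one."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)

salt, stored = hash_password("correct horse")
print(check_password("correct horse", salt, stored))
print(check_password("wrong guess", salt, stored))
```

Someone who reads the stored salt and digest still has to guess the password and run the hash for every guess.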

Cryptography Hash function application in Bitcoin

Bitcoin is a cryptocurrency. Yes, that crypto comes from cryptography.

A simplistic way to describe the technology behind Bitcoin is the following: Bitcoin is powered by a distributed technology called the blockchain. The blockchain is a sequence of blocks that store the transactions (buy, sell) made with the digital coin.

One of the mechanisms that make the Bitcoin network secure and reliable is the immutability of the transactions. This means that once a transaction is recorded in the blockchain, it cannot be modified.

To achieve this immutability, every time a new block is created, it contains a hash of the previous block.

Therefore, if someone tries to change a transaction in a block that is already published, the hash of that block changes and no longer matches the hash stored in the subsequent block. When the network sees this type of change, it rejects the modified block. So all the blocks, once published in the network, are effectively immutable.
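The chaining idea can be sketched as follows. This is a heavily simplified model and not Bitcoin's actual block format: the block layout, the JSON encoding, and the function names are all assumptions made for illustration.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block (simplified)."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def build_chain(transaction_lists):
    """Each new block stores the hash of the previous block."""
    chain, prev = [], "0" * 64  # placeholder hash for the first block
    for txs in transaction_lists:
        block = {"prev_hash": prev, "transactions": txs}
        chain.append(block)
        prev = block_hash(block)
    return chain

def is_valid(chain) -> bool:
    """Check that every block points at the real hash of its predecessor."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain = build_chain([["A pays B 1"], ["B pays C 2"], ["C pays A 3"]])
print(is_valid(chain))                        # untouched chain
chain[0]["transactions"][0] = "A pays B 100"  # tamper with a published block
print(is_valid(chain))                        # tampering is detected
```

The first check should print True and the second False, because the tampered block no longer hashes to the value recorded in the next block.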

Cryptographic Hash Functions in Python

In Python, we have the hashlib module. It implements the algorithms SHA1, SHA224, SHA256, SHA384, and SHA512, among others. HMAC, a keyed cryptographic hash function, is available in the separate hmac module.

See the example below.

import hashlib

if __name__ == '__main__':
    # hashlib works on bytes in Python 3, so the messages are bytes literals.
    print('sha256 for the message Hello World :', hashlib.sha256(b'Hello World').hexdigest())
    print('sha256 for the message Hello world :', hashlib.sha256(b'Hello world').hexdigest())

If you execute the example above, it prints the two message digests, each as a 64-character hexadecimal string.

Notice that a small change in the text (the 'w' is uppercase in one message and lowercase in the other) creates a huge change in the output, or message digest.
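To put a number on that change, the sketch below counts how many of the 256 output bits differ between the two digests. The helper differing_bits is a hypothetical name introduced for this example.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where the SHA-256 digests of a and b differ."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")  # XOR leaves a 1 wherever the bits differ

print(differing_bits(b"Hello World", b"Hello world"), "of 256 bits differ")
```

For a good hash function, roughly half of the bits are expected to flip even when a single character changes.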

Creating message digests from the console/terminal

On macOS, we have the console application shasum. Its help output (shasum --help) lists the algorithms that are implemented.

[Figure: help output of the macOS shasum command, used to create message digests with the SHA-2 family]

If you are running Linux, you can use the command sha256sum followed by the name of the file.

You can find a complete explanation of this command in this link.

If you are using Windows, you can use the built-in certutil utility, for example: certutil -hashfile <file> SHA256.
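If none of these tools is at hand, the same kind of file digest can be computed in Python on any platform. The sketch below reads the file in chunks so that large files do not have to fit in memory. The function name file_sha256 and the chunk size are assumptions made for this example, and the temporary file only exists to make the demo self-contained.

```python
import hashlib
import os
import tempfile

def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 digest of a file as a hexadecimal string."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"Hello World")
print(file_sha256(tmp.name))
os.remove(tmp.name)
```

The printed value should match what sha256sum or shasum -a 256 reports for a file with the same contents.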

Summary

Cryptographic hash functions are mostly used to determine whether a certain message was modified. In other words, they are used as an integrity verification mechanism.

They are also used in other areas, such as digital signatures and one-way password files.

Cryptographic hash functions differ from ordinary hash functions. The difference is in the security requirements that a cryptographic hash function must meet.