The Secure Hash Algorithm 1 (SHA-1) is a hash function that creates a message digest, which can be used to detect whether a message or file has been modified. It is no longer considered secure: it was broken some years ago.
Table of Contents
- Processing a 512-bit block (HSHA-1)
- Security of SHA-1
- Similarities and differences between MD5 and SHA-1
- Python code to generate the SHA-1 message digest
- SHA-1 message digest from the command line
The SHA-1 algorithm is based on the Merkle-Damgård construction. Find below an illustration of this construction.
The original message is split into blocks, which are processed one after another. Starting from an initial vector (IV), a function f combines the current value of the vector with the next message block, and its output becomes the vector used for the following block. After the last block, the value of the vector is the hash value.
In the case of the SHA-1 algorithm, the output will be 160 bits.
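The chaining described above can be sketched in a few lines of Python. This is only an illustration of the construction: `toy_compress` is a hypothetical stand-in for the function f (it is not the real SHA-1 compression function), and the all-zero IV is an arbitrary choice made for the example.

```python
import hashlib

BLOCK_SIZE = 64  # bytes per block (512 bits), the block size SHA-1 uses


def toy_compress(state: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for f: mixes the chaining value with one block.

    hashlib.sha1 is reused here only to get a fixed-size output; this is
    NOT the real SHA-1 compression function.
    """
    return hashlib.sha1(state + block).digest()


def merkle_damgard(message: bytes, iv: bytes) -> bytes:
    # Padding: a 0x80 byte, zero bytes, then the message length in bits (8 bytes)
    padded = message + b"\x80"
    padded += b"\x00" * ((-len(padded) - 8) % BLOCK_SIZE)
    padded += (8 * len(message)).to_bytes(8, "big")

    # Chain the blocks: the output of f becomes the vector for the next block
    state = iv
    for i in range(0, len(padded), BLOCK_SIZE):
        state = toy_compress(state, padded[i:i + BLOCK_SIZE])
    return state


digest = merkle_damgard(b"This is my message", iv=bytes(20))
print(len(digest) * 8)  # size of the final chaining value in bits
```

The point of the sketch is the loop: whatever f is, the final hash is simply the last value of the chained vector.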
Find below the general scheme for SHA-1.
As per the Merkle-Damgård construction, padding is added at the end of the message and the message is processed in blocks.
The block size is 512 bits.
The initial vector has 5 words of 32 bits each. After all the operations have been applied to the initial vector IV, we get a message digest of 5 × 32 = 160 bits.
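As a rough sketch of these numbers, the snippet below pads a message and checks the resulting sizes. The padding rule (a single 1 bit, zero bits, then the 64-bit message length) and the IV values are taken from the SHA-1 specification (FIPS 180-4), not from this post; the helper name `sha1_pad` is made up for the example.

```python
import struct

# Initial vector: five 32-bit words (values from the SHA-1 specification)
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def sha1_pad(message: bytes) -> bytes:
    """Append a 1 bit, zero bits, and the 64-bit message length in bits."""
    bit_length = 8 * len(message)
    padded = message + b"\x80"                      # 1 bit followed by seven 0 bits
    padded += b"\x00" * ((56 - len(padded)) % 64)   # zeros up to 56 bytes mod 64
    padded += struct.pack(">Q", bit_length)         # big-endian 64-bit length
    return padded


padded = sha1_pad(b"This is my message")
print(len(padded) * 8)  # 512: the padded message fills exactly one block
print(len(IV) * 32)     # 160: size of the vector, and of the digest, in bits
```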
Let’s examine what happens when we process each 512-bit block (HSHA-1).
Processing a 512-bit block (HSHA-1)
In the figure above you can see the details of how the Initial Vector is modified. Notice that all the additions are performed modulo 2^32.
Each HSHA-1 applies 80 operations (for each operation we will use a different word W) to the Initial Vector. The 80 operations are divided into four rounds of 20 operations each.
As the input block is 512 bits, it contains only 16 words of 32 bits, which are used for the first 16 operations. The remaining 64 words are calculated as w[t] = (w[t-3] ⊕ w[t-8] ⊕ w[t-14] ⊕ w[t-16]) <<< 1, for 16 ≤ t ≤ 79, where ⊕ is the bitwise XOR and <<< 1 is a left rotation by one bit.
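The word expansion can be written directly from the formula. The sketch below assumes the 16 input words are read from the block in big-endian order, as the SHA-1 specification states; the function names are chosen for this example.

```python
import struct


def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def message_schedule(block: bytes) -> list:
    """Expand one 64-byte block into the 80 words w[0..79]."""
    if len(block) != 64:
        raise ValueError("a SHA-1 block is exactly 64 bytes (512 bits)")
    # The first 16 words come straight from the block
    w = list(struct.unpack(">16I", block))
    # The remaining 64 words follow the recurrence given in the text
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    return w


words = message_schedule(bytes(range(64)))
print(len(words))  # 80 words, one per operation
```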
After the fourth round, we get a new vector A’B’C’D’E’. Its words are added (mod 2^32) to the corresponding words of the vector the block started with, and the result is the Initial Vector we use to process the next block of 512 bits.
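Putting the pieces together, here is a minimal, unoptimized sketch of the whole algorithm, checked against `hashlib`. The round functions, the constants K and the IV are taken from the SHA-1 specification (FIPS 180-4) rather than from this post. As the specification describes, the result of the 80 operations is added word by word (mod 2^32) to the incoming vector before it is passed on to the next block. It is meant for study only, not for real use.

```python
import hashlib
import struct

IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
K = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)  # one constant per round
MASK = 0xFFFFFFFF


def rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def f(t, b, c, d):
    """Round-dependent function applied to the words B, C and D."""
    if t < 20:
        return (b & c) | (~b & d)
    if t < 40 or t >= 60:
        return b ^ c ^ d
    return (b & c) | (b & d) | (c & d)


def compress(state, block):
    """Process one 512-bit block: 80 operations in four rounds of 20."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = state
    for t in range(80):
        temp = (rotl(a, 5) + f(t, b, c, d) + e + K[t // 20] + w[t]) & MASK
        a, b, c, d, e = temp, a, rotl(b, 30), c, d
    # Add the result to the incoming vector, word by word, mod 2^32
    return tuple((x + y) & MASK for x, y in zip(state, (a, b, c, d, e)))


def sha1(message):
    """Return the SHA-1 digest of message as a hexadecimal string."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", 8 * len(message))
    state = IV
    for i in range(0, len(padded), 64):
        state = compress(state, padded[i:i + 64])
    return struct.pack(">5I", *state).hex()


msg = b"This is my message"
print(sha1(msg) == hashlib.sha1(msg).hexdigest())  # compare with the library
```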
Security of SHA-1
In February 2017, Google announced a practical attack that creates a collision for the SHA-1 algorithm. Source.
Since this result, the SHA-1 algorithm is no longer considered cryptographically secure.
Similarities and differences between MD5 and SHA-1
If you want to learn more about MD5, you can read this post.
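- Both MD5 and SHA-1 are based on the Merkle-Damgård construction.
- Both process 512-bit blocks with round-dependent nonlinear functions to produce a message digest: MD5 uses four (F, G, H, I), while SHA-1 uses three, one of which is applied in two of its rounds.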
- Both compress the message to a predetermined size.
- Both use 4 rounds to perform all the operations in one block of 512 bits.
- MD5 performs 64 operations and SHA-1 performs 80 operations.
- The constant K changes within a round in MD5, whereas it is fixed for the whole round in SHA-1.
- The rotation amounts are fixed in SHA-1 (5 and 30 bits), whereas they change from one operation to another in MD5.
- In one operation, MD5 changes only one word of the Initial Vector, whereas SHA-1 changes two of them.
- MD5 produces a 128-bit digest using an Initial Vector of four 32-bit words, whereas SHA-1 produces a 160-bit digest using an Initial Vector of five 32-bit words.
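The digest and block sizes listed above can be checked with Python's `hashlib`. This is a small sketch; the helper name `sizes` is made up for the example, and it assumes MD5 is enabled in your Python build (some restricted builds disable it).

```python
import hashlib


def sizes(name):
    """Return (digest size, block size) in bits for a hashlib algorithm."""
    h = hashlib.new(name)
    return h.digest_size * 8, h.block_size * 8


for name in ("md5", "sha1"):
    digest_bits, block_bits = sizes(name)
    print(name, "digest:", digest_bits, "bits, block:", block_bits, "bits")
```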
Python code to generate the SHA-1 message digest
import hashlib

message_digest = hashlib.sha1(b'This is my message')
print(message_digest.hexdigest())

message_digest = hashlib.sha1(b'This is my message.')
print(message_digest.hexdigest())
After executing the code above, you will get the following result:
rafel@Rafaels-iMac crytography % python3 sha-1.py
4618ea8e820ddf616f19b9d48b8a7eb2c1c0a107
2e9181e8bb0f0ae82d182ead3100c16f9ce99ad3
rafel@Rafaels-iMac crytography %
Notice how the message digest changes completely when we just add ‘.’ at the end of the message.
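To put a number on that change, the following sketch counts how many of the 160 digest bits differ between the two messages. The helper name `differing_bits` is made up for the example.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which the SHA-1 digests of a and b differ."""
    da = int.from_bytes(hashlib.sha1(a).digest(), "big")
    db = int.from_bytes(hashlib.sha1(b).digest(), "big")
    return bin(da ^ db).count("1")


count = differing_bits(b"This is my message", b"This is my message.")
print(count, "of 160 bits differ")
```

For a well-behaved hash function we would expect roughly half of the bits to flip, even for a one-character change.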
SHA-1 message digest from the command line
To obtain the SHA-1 message digest from the command line, we can use shasum on macOS or sha1sum on Linux-based distributions.
Find below an example of how to obtain the same message digest as in the Python example above.
rafel@Rafaels-iMac ~ % echo -n This is my message | shasum
4618ea8e820ddf616f19b9d48b8a7eb2c1c0a107
rafel@Rafaels-iMac ~ %
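The -n option matters here: it tells echo not to append a newline, which would otherwise be hashed together with the message and produce a different digest. The short Python sketch below illustrates the effect by hashing the message with and without a trailing newline.

```python
import hashlib

# What `echo -n` sends to the pipe: the message only
without_newline = hashlib.sha1(b"This is my message").hexdigest()
# What a plain `echo` would send: the message plus a trailing newline
with_newline = hashlib.sha1(b"This is my message\n").hexdigest()

print(without_newline)
print(with_newline)
print(without_newline == with_newline)  # False: the newline changes the digest
```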