The birthday paradox is the result of solving the birthday problem. The birthday problem is as follows: What is the probability that in a random set of n people, two people share the same birthday? Another way we can state this problem is: What is the minimum k number of people such that the probability that two of them share the same birthday exceeds 50%?
The answer to the birthday problem is that the minimum k is 23.
Hence, the birthday paradox states that in a group of 23 people there is more than a 50% chance that two of them share the same birthday. It is called a paradox because it seems counterintuitive, but it is actually true.
Let’s say we have 23 birthday dates. Now we can calculate how many comparisons we can make.
Let’s number the 23 birthdays from 1 to 23.
If we choose birthday 1, we can compare it with the other 22 (from 2 to 23). So, we can make 22 comparisons.
Let’s choose birthday 2, we can compare it with the other 22 (1 and 3 to 23), but one will be repeated (1 compared with 2 will be the same as 2 compared to 1). So here we have 21 new comparisons.
When we choose birthday 3, we will have 20 new comparisons (22 minus 2, 1 compared with 3, and 2 compared with 3 will be repeated).
In total, we will have 22+21+20+…+1 = 253 comparisons.
Now, we calculate the probability that two people do not have the same birthday, which is 0.492703.
Therefore, the probability that two people have the same birthday is 1- 0.492703 = 0.507297.
A derived result is that in general, you need √n choices to get a probability greater than 50% of a match.
Application of the birthday paradox in cryptography
The application of the birthday paradox in cryptography is known as the birthday attack.
This attack is made to break the collision-resistant property that is desirable in cryptographic hash functions.
A collision-resistant attack intends to find two messages that will have the same message digest or hash value.
In other words, the task is to find two messages x and y, x≠y, such that H(x)=H(y) for a given hash function H.
Let’s now use the birthday paradox general result to find out how many values we have to calculate to find a collision for a hash function.
Let’s assume that the hash function H has 2m possible outputs.
The number that we want to know is k=√2m=2m/2
This result states that the strength of a cryptographic hash function, regarding the collision resistance property, is 2m/2, where m is the number of bits of the hash value.
See below a table that summarizes this result below.
|Hash function||Size of the hash value||Collision resistance strength|
As you can see, the higher the size of the hash value, the more secure the function is to the birthday attack (collision-resistant attack).
Notice that the hash function MD5, which produces 128 bits hash values is not considered secure anymore. Also, SHA-1 which produces 160 bits is not secure and it was proved by Google. You can read more about MD5 and SHA-1 in this post and this one.