The birthday paradox is the result of solving the birthday problem. The birthday problem is as follows: What is the probability that in a random set of n people, two people share the same birthday? Another way we can state this problem is: What is the minimum k number of people such that the probability that two of them share the same birthday exceeds 50%?

The answer to the birthday problem is that the minimum k is 23.

Hence, the birthday paradox states that in a group of 23 people there is more than a 50% chance that two of them share the same birthday. It is called a paradox because it seems counterintuitive, but it is actually true.

## General explanation

Let’s say we have 23 birthday dates. Now we can calculate how many comparisons we can make.

Let’s number the 23 birthdays from 1 to 23.

If we choose birthday 1, we can compare it with the other 22 (from 2 to 23). So, we can make 22 comparisons.

Let’s choose birthday 2, we can compare it with the other 22 (1 and 3 to 23), but one will be repeated (1 compared with 2 will be the same as 2 compared to 1). So here we have 21 new comparisons.

When we choose birthday 3, we will have 20 new comparisons (22 minus 2, 1 compared with 3, and 2 compared with 3 will be repeated).

In total, we will have 22+21+20+…+1 = 253 comparisons.

Now, we calculate the probability that two people do not have the same birthday, which is 0.492703.

Therefore, the probability that two people have the same birthday is 1- 0.492703 = 0.507297.

A derived result is that in general, you need √n choices to get a probability greater than 50% of a match.

## Application of the birthday paradox in cryptography

The application of the birthday paradox in cryptography is known as the birthday attack.

This attack is made to break the collision-resistant property that is desirable in cryptographic hash functions.

A collision-resistant attack intends to find two messages that will have the same message digest or hash value.

In other words, the task is to find two messages x and y, x≠y, such that H(x)=H(y) for a given hash function H.

Let’s now use the birthday paradox general result to find out how many values we have to calculate to find a collision for a hash function.

Let’s assume that the hash function H has 2^{m} possible outputs.

The number that we want to know is k=√2^{m}=2^{m/2}

This result states that the strength of a cryptographic hash function, regarding the collision resistance property, is 2^{m/2}, where m is the number of bits of the hash value.

See below a table that summarizes this result below.

Hash function | Size of the hash value | Collision resistance strength |

MD5 | 128 | 2^{64} |

SHA-1 | 160 | 2^{80} |

SHA-256 | 256 | 2^{128} |

SHA-512 | 512 | 2^{256} |

As you can see, the higher the size of the hash value, the more secure the function is to the birthday attack (collision-resistant attack).

Notice that the hash function MD5, which produces 128 bits hash values is not considered secure anymore. Also, SHA-1 which produces 160 bits is not secure and it was proved by Google. You can read more about MD5 and SHA-1 in this post and this one.

**Related posts:**