Home > Reverse Engineering > Generalized birthday paradox — keygenme3 by Dcoder

Generalized birthday paradox — keygenme3 by Dcoder

The birthday problem [0] asks what’s the probability that among n people at least two of them have the same birthday. The “paradox” is that the answer is counterintuitive — in a group of 23, the probability is close to 50%.



To see that, let q(n)=1-p(n), where p(n) is the probability of encountering at least one pair of equal birthdays in a group of n people. q(n) is the probability that in a group of n, birthdays are pairwise different. Assuming there are 365 days in a year, q(2)=\frac{365}{365}*\frac{364}{365} — first person is free to have any birthday, the second one can pick any day from the remaining 364. By generalizing this reasoning, we arrive at: q(n)=1*(1-\frac{1}{365})*...*(1-\frac{n-1}{365}), for n \leq 365 (for n > 365, q(n)=0).


Above graph shows how quickly p(n) rises.

If we replace birthdays with values of a hash function H, birthday paradox provides a way to find collisions for H. Algorithm is quite simple:


  1. S \gets \emptyset, i=0,
  2. pick random x (without repetition) and compute y=H(x),
  3. if y \in S, return YES,
  4. S \gets S \cup \{y\},
  5. i \gets i+1,
  6. if i<k goto 1, else return PROBABLY NOT.

Algorithm runs in O(k), but this isn’t really useful — we want complexity to be parametrized by probability of finding a collision. Formula for p(n) isn’t easily invertible, so we need to approximate it by something simpler. One easy to derive approximation is q(n)=e^{-n(n-1)/2*365} \approx e^{-n^2/2*365} (see [0]), so p(n)=1-e^{-n^2/2*365}.

We can extend this reasoning and conclude that if there are m “days” and n persons, then the probability that there are at least two of them with the same birthday is p(n,m)=1-e^{-n^2/2m}. Now, if we assume that m=H, where H is equal to the size of image of our hash function, then by inverting the formula for p(n,m) (see [1]), we get n(p,H)=\sqrt{2Hln(\frac{1}{1-p})}, where p is the desired probability of collision. For p=0.5, n(p,H) \approx 1.17\sqrt{H}. This approximation shows that for (for example) MD5, CHECK-IF-COLLISION-EXISTS would loop around 2^{64} times (MD5 hashes are 128 bits long) and return YES with probability 0.5.

Generalized birthday paradox


There are multiple generalizations of this problem, but we are interested in a result of David Wagner from 2002 [2]. Citing from the paper’s abstract:

“We study a k-dimensional generalization of the birthday problem: given k lists of n-bit values, find some way to choose one element from each list so that the resulting k values xor to zero. For k = 2, this is just the extremely well-known birthday problem, which has a square-root time algorithm (…). In this paper, we show new algorithms for the case k > 2: we show a cube-root time algorithm for the case of k = 4 lists (…).”

For details, see the paper [2]. Key observations from our point of view are:

  • algorithm for k=4 runs in cube-root time,
  • xor can be replaced by addition modulo a power of 2.

Armed with this knowledge, we can take a look at the problem from KeygenMe3 by Dcoder [5].


Keygenme3 by Dcoder


Here’s a high-level pseudocode of the protection scheme:

sum = 0
name_hash = hash(name)
for i in 0..15:
    x = (i<<8) | serial[i]
    sn_hash = hash(x)
    sum += sn_hash

if(sum == name_hash)

Hash function used is SipHash [3], which is a lightweight PRF (pseudorandom function).

You might recognize the problem stated by Dcoder as the subset sum problem (SUM) [4]. SUM is NP-complete and it’s unknown if there is a polynomial time algorithm that solves it exactly. The best exact general-case algorithm has complexity O(2^{n/2}), where n is the size of the set we can pick numbers from. In our case n=16*256=2^{12}, which makes 2^{n/2} prohibitively large.

There’s a probabilistic algorithm (described in [2]) that solves SUM in O(2^{b/3}) time, where b is equal to the bit size of numbers used. In our case b=64 bits, so 2^{b/3} \approx 2^{21}. The algorithm takes 4 lists of numbers: l_1, l_2, l_3, l_4 and returns a set of tuples \{(x_1,x_2,x_3,x_4): x_1+x_2+x_3+x_4=0\}, where x_i \in l_i. The only requirement is that we can ‘extend’ all lists freely. All details are clearly explained in [2], so I won’t repeat them here.

At first sight, there are few problems:

  1. we have 16 lists of numbers, instead of 4 (there are 16 bytes in a serial),
  2. each of our 16 lists contains exactly 256 numbers (each one of 16 positions can take one of 256 possible values),
  3. the target sum is (almost always) nonzero.

We can deal with 1 and 2 by observing that instead of working with 16 lists, we can work with 4 lists containing random 4-element sums, so L_0 will contain 4-sums of elements from l_0, l_1, l_2, l_3, L_1 4-sums of elements from l_4, l_5, l_6, l_7, etc. Solving SUM for L_0, .., L_3, solves it for l_0, .., l_{15}, because if y_0+y_1+y_2+y_3=0 and y_i \in L_i, then y_i=x_{4i}+x_{4i+1}+x_{4i+2}+x_{4i+3}, where x_j \in l_j.

Problem 3 is easily dealt with by observing that subtracting a constant from all elements of a single list is equivalent to solving SUM for nonzero sum. Indeed: x_1+x_2+x_3+(x_4-c)=0 implies x_1+x_2+x_3+x_4=c. Here, c is the target sum subtracted from list number 4.

Keygen sources are available here. 4sum.py is an easy to read implementation of the original algorithm from [2]. sum4.c is the actual keygen source. Should compile without any warnings with ‘make’.

0 – http://en.wikipedia.org/wiki/Birthday_problem
1 – http://en.wikipedia.org/wiki/Birthday_attack
2 – http://www.cs.berkeley.edu/~daw/papers/genbday-long.ps
3 – https://131002.net/siphash/
4 – http://en.wikipedia.org/wiki/Subset_sum_problem
5 – http://crackmes.us/read.py?id=611

  1. comdex
    14/01/2013 at 08:37

    hi, what tool do you use in order to draw a graph??

    • p_k
      14/01/2013 at 08:49

      This is a graph from wikipedia page, so I’m not shure, but it looks like a gnuplot graph.

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: