Home > Reverse Engineering > Solving Pimp crackme by j00ru and Gynvael Coldwind

## Solving Pimp crackme by j00ru and Gynvael Coldwind

I figured a nice tutorial would be more interesting than yet another “hello world” post, so here it goes — solution for Pimp crackme, a winning entry for Pimp my crackme contest (polish) by j00ru and Gynvael. You can download entires (all three of them) from the contest’s page. AFAIK I’m the only person who submitted a solution.

Let’s see what we are up against.

Looks nice :).

There are four columns consisting of four symbols. Each column has it own color, so that makes 16 unique symbols, easy math there :). Since serial’s length is also 16 characters, we can’t bruteforce a solution — total number of combinations is $16^{16}=2^{64}$, 64 bits is outside of our reach, unless you have resources comparable to those of disitributed.net at your disposal.

First thing we need to do is to identify a procedure that checks if the serial we provided is correct. Normally, we would look at cross-references (in IDA) for the message informing about success / failure, but in this case we can’t — crackme stores these messages as images, not strings, so we need to try something different.

Let’s try to follow execution from the beginning, to see how the main window is created, that should give us information about the window’s procedure and how it processes messages.

.text:004015BC                 push    offset stru_40D000 ; WNDCLASSEXA *
.text:004015C1                 call    RegisterClassExA
(...)
.text:00401625                 push    0               ; lpParam
.text:00401627                 push    ebx             ; hInstance
.text:0040162A                 push    0               ; hWndParent
.text:0040162C                 push    190h            ; nHeight
.text:00401631                 push    190h            ; nWidth
.text:00401636                 push    esi             ; Y
.text:00401637                 push    eax             ; X
.text:00401638                 push    90000000h       ; dwStyle
.text:0040163D                 push    offset WindowName ; "Pimp CrackMe"
.text:00401642                 push    offset ClassName ; "i"
.text:00401647                 push    0               ; dwExStyle
.text:00401649                 call    CreateWindowExA

Structure WNDCLASSEXA in this case is:

.data:0040D000 stru_40D000     WNDCLASSEXA <30h, 0, offset windows_proc, 0, 0,
0, 0, 0, 5, 0, \
.data:0040D000                              offset ClassName, 0> ; "i"

windows_proc is, as name suggests, the message handling procedure of the main window. Let’s look inside.

  if ( Msg == WM_PAINT )
{
if ( (unsigned __int8)thread_spawned ^ 1 )
{
}
hdc = BeginPaint(hWnd, &Paint);
StretchDIBits(hdc, 0, 0, 400, 400, 0, 0, 400, 400, lpBits, &bmi, 0, 0xCC0020u);
EndPaint(hWnd, &Paint);
return 0;
}

.text:00402A18                 mov     eax, [ebp+var_4]
.text:00402A1B                 mov     cl, byte_40D040[eax]
.text:00402A21                 movzx   eax, cl
.text:00402A24                 mov     edx, 0
.text:00402A29                 mov     ecx, 0Fh
.text:00402A2E                 sub     ecx, [ebp+var_4]
.text:00402A31                 shl     ecx, 2
.text:00402A34                 shld    edx, eax, cl
.text:00402A37                 shl     eax, cl
.text:00402A39                 test    cl, 20h
.text:00402A3C                 jz      short loc_402A42
.text:00402A3E                 mov     edx, eax
.text:00402A40                 xor     eax, eax
.text:00402A42
.text:00402A42 loc_402A42:
.text:00402A42                 or      [ebp+var_10], eax
.text:00402A45                 or      [ebp+var_C], edx
.text:00402A48                 inc     [ebp+var_4]
.text:00402A4B
.text:00402A4B loc_402A4B:
.text:00402A4B                 cmp     [ebp+var_4], 0Fh
.text:00402A4F                 jle     short loc_402A18
.text:00402A51                 mov     ds:serial_length, 0
.text:00402A5B                 sub     esp, 4
.text:00402A5E                 push    10h             ; Size
.text:00402A60                 push    0FFh            ; Val
.text:00402A65                 push    offset byte_40D040 ; Dst
.text:00402A6A                 call    memset
.text:00402A72                 sub     esp, 8
.text:00402A75                 push    [ebp+var_C]
.text:00402A78                 push    [ebp+var_10]
.text:00402A7B                 call    check_serial
.text:00402A83                 test    al, al
.text:00402A85                 jz      short loc_402AA4
.text:00402A87                 mov     ds:goodboy_flag, 1
.text:00402A8E                 mov     eax, 41F00000h
.text:00402A93                 mov     flt_40D034, eax
.text:00402A98                 push    0
.text:00402A9A                 call    sub_4013B0
.text:00402AA2                 jmp     short loc_402ABF

How did I knew we should focus on this part? That code just looks suspicious :). Part at the beginning is a loop with 16 (number of characters in serial) iterations, converting something from a buffer (byte_40D040) to a 64-bit number. Stepping through that code in a debugger reveals, that byte_40D040 holds characters of the serial number, “encoded” as follows:

0  4  8  c
1  5  9  d
2  6  a  e
3  7  b  f

So the green triangle from the first column is 0, and the purple triangle from last column is 0xC. Forcing path down to 00402A87 sets the “goodboy” flag and makes the crackme display congratulations. We are not interested in patching, as that would be too easy. Besides, authors explicitly forbid it.

We need to take a look at check_serial function. I’ll skip the part about how to actually understand the disassembly, since it’s a topic in itself, and move straight to the semantics. Here is an approximate decompilation of the serial-checking function.

//.text:004039AE
bool __cdecl check_serial2(int sn1, int sn2)
{
int param_ptr; // ST14_4@4
int v3; // ecx@19
signed int v4; // eax@19
bool v6; // [sp+0h] [bp-28h]@2
int v7; // [sp+8h] [bp-20h]@1
int opcode_ptr; // [sp+Ch] [bp-1Ch]@3
unsigned __int8 vm_opcode; // [sp+13h] [bp-15h]@4
int vm_argument; // [sp+14h] [bp-14h]@4
int vm_proc; // [sp+18h] [bp-10h]@4
int vm_result; // [sp+1Ch] [bp-Ch]@12
int vm_resulta; // [sp+1Ch] [bp-Ch]@23
int i; // [sp+20h] [bp-8h]@4

v7 = 0;
if ( sub_403950() == 0 )
{
v6 = 0;
}
else
{
sn1_copy = sn1;
sn_nibble = sn1 & 0xF;
*(_DWORD *)vm_reg4 = sn2;
opcode_ptr = 0;
while ( 1 )
{
if ( vm_code_size = 0 )
{
if ( vm_result > 0 )
{
if ( opcode_ptr + 5 * vm_result >= (unsigned int)vm_code_size )
return 0;
opcode_ptr += 5 * vm_result;
}
}
else
{
vm_resulta = -vm_result;
if ( opcode_ptr + -5 * vm_resulta < 0 )
return 0;
opcode_ptr += -5 * vm_resulta;
}
}
}
v6 = v7 == 8;
}
return v6;
}

This might be confusing at first glance, so let’s simplify it further.

// sn1 - first dword of our serial
// sn2 - second dword
// op_ptr is a pointer to current VM opcode
// vm_code is a table containing opcodes and arguments: [op1,arg1,op2,arg2,...]
// vm_reg4 - virtual register 4

op_ptr = 0;
vm_reg4 = sn2;
while(1){
vm_op = fetch_opcode(op_ptr, vm_code);
vm_arg = fetch_arg(op_ptr+1, vm_code);
vm_proc = fetch_proc(vm_op, handlers);

op_ptr += 5;

result = vm_call(vm_proc, vm_arg);

break;

if(vm_reg1 != constants[i])
return 0;

i++;
sn_part = trim(sn1, i);
}
else{
if(result >= 0){
if(result > 0)
if(op_ptr + result >= vm_code_size)
return 0;
}
else{
if(op_ptr + result < 0)
return 0;
}

op_ptr += result;
}
}

return (i==8);

Much better :). We see that there are two values with special meaning. One, 0xBADC0FFE is a signal to terminate the VM. Second one 0xDEADBEEF is a signal to compare the contents of one of the virtual registers (vm_reg1) against a constant from a table. If the comparision fails, serial is rejected (return 0).

If the value returned from vm_call isn’t one of the above magic constant, VM treats it as a jump displacement, by updating op_ptr (op_ptr += result). Before update VM checks if displacement is in correct range, as jumping after or before the vm_code buffer would result in executing trash 🙂

Notice that our 64-bit serial is split into two 32-bit parts: sn1 and sn2. sn2 is copied to VM_REG4, so it seems that only this part in mangled by the VM, as sn1 is used to set an additional variable sn_part, using function “trim”:

unsigned int trim(n, i){
int x, y;
x = 4*(i+1);
y = n<<x;
if(x & 0x20)
y = 0;
return (n & (y-1));
}

This will play a role later on.

Finally, last return statement (return (i==8)) tells us, that all 8 comparisions must succeed, otherwise crackme will reject our serial.

That doesn’t seem so bad, we just need to decompile the embedded VM code and reverse/bruteforce whatever computation it performs on sn2. We should expect some kind of trickery, since it’s unlikely that sn1 is simply ignored.

Let’s take a look at the VM handlers.

.data:0091FE20 handlers_procs  dd offset handlers_proc
.data:0091FE24 ; char handlers_op[]
.data:0091FE24 handlers_op     dd 1
.data:0091FE24
.data:0091FE28 off_91FE28      dd offset vm_mix_handlers
.data:0091FE28
.data:0091FE2C ; int handlers[]
.data:0091FE2C handlers        dd 0
.data:0091FE2C
.data:0091FE30 ; int dword_91FE30[]
.data:0091FE30 dword_91FE30    dd 0
.data:0091FE30
.data:0091FE34                 dd offset handlers_proc
.data:0091FE38                 dd 2
.data:0091FE3C                 dd offset vm_DIE
.data:0091FE40                 dd 0
.data:0091FE44                 dd 0
.data:0091FE48                 dd offset handlers_proc
.data:0091FE4C                 dd 3
.data:0091FE50                 dd offset vm_JMP
.data:0091FE54                 dd 0
.data:0091FE58                 dd 0
.data:0091FE5C                 dd offset handlers_proc
.data:0091FE60                 dd 4
.data:0091FE64                 dd offset vm_JZ
.data:0091FE68                 dd 0
.data:0091FE6C                 dd 0
.data:0091FE70                 dd offset handlers_proc
.data:0091FE74                 dd 5
.data:0091FE78                 dd offset vm_JNZ
.data:0091FE7C                 dd 0
.data:0091FE80                 dd 0
.data:0091FE84                 dd offset handlers_proc
.data:0091FE88                 dd 6
.data:0091FE8C                 dd offset vm_mov_hostReg1_reg1
.data:0091FE90                 dd 0
.data:0091FE94                 dd 0
.data:0091FE98                 dd offset handlers_proc
.data:0091FE9C                 dd 7
.data:0091FEA0                 dd offset vm_mov_reg1_hostReg1
.data:0091FEA4                 dd 0
.data:0091FEA8                 dd 0
.data:0091FEAC                 dd offset handlers_proc
.data:0091FEB0                 dd 8
.data:0091FEB4                 dd offset vm_fancy_call
.data:0091FEB8                 dd 0
.data:0091FEBC                 dd 0
.data:0091FEC0                 dd offset handlers_proc
.data:0091FEC4                 dd 9
.data:0091FEC8                 dd 0
.data:0091FECC                 dd offset vm_mov_mem_reg1
.data:0091FED0                 dd 3Ah
.data:0091FED4                 dd offset handlers_proc
.data:0091FED8                 dd 0Ah
.data:0091FEDC                 dd 0

Handlers aren’t obfuscated in any way, so it’s easy to identify what they do by just looking at them. For example:

.text:0040301E vm_DIE          proc far
.text:0040301E
.text:0040301E var_4           = dword ptr -4
.text:0040301E
.text:0040301E                 push    ebp
.text:0040301F                 mov     ebp, esp
.text:00403021                 sub     esp, 10h
.text:0040302B                 mov     eax, [ebp+var_4]
.text:0040302E                 leave
.text:0040302F                 retf
.text:0040302F vm_DIE          endp

.data:0091F740 vm_or_reg1_reg2 proc far
.data:0091F740                 mov     esi, 0FFF0h
.data:0091F745                 cmp     esi, 0FFFDh
.data:0091F74B                 jb      short loc_91F754
.data:0091F74D                 xor     eax, eax
.data:0091F74F                 xor     edx, edx
.data:0091F751                 retf    4
.data:0091F754 ; ---------------------------------------------------------------------------
.data:0091F754
.data:0091F754 loc_91F754:
.data:0091F754                 mov     ebx, [edi+0FFF0h]
.data:0091F75A                 mov     esi, 0FFF4h
.data:0091F75F                 cmp     esi, 0FFFDh
.data:0091F765                 jb      short loc_91F76E
.data:0091F767                 xor     eax, eax
.data:0091F769                 xor     edx, edx
.data:0091F76B                 retf    4
.data:0091F76E ; ---------------------------------------------------------------------------
.data:0091F76E
.data:0091F76E loc_91F76E:
.data:0091F76E                 mov     ecx, [edi+0FFF4h]
.data:0091F774                 or      ebx, ecx
.data:0091F776                 mov     [edi+0FFF0h], ebx
.data:0091F77C                 xor     eax, eax
.data:0091F77E                 xor     edx, edx
.data:0091F780                 retf    4
.data:0091F780 vm_or_reg1_reg2 endp

.data:0091F520
.data:0091F520 arg_0           = dword ptr  8
.data:0091F520
.data:0091F520                 mov     esi, 0FFF0h
.data:0091F525                 cmp     esi, 0FFFDh
.data:0091F52B                 jb      short loc_91F534
.data:0091F52D                 xor     eax, eax
.data:0091F52F                 xor     edx, edx
.data:0091F531                 retf    4
.data:0091F534 ; ---------------------------------------------------------------------------
.data:0091F534
.data:0091F534 loc_91F534:
.data:0091F534                 mov     ebx, [edi+0FFF0h]
.data:0091F53A                 mov     ecx, [esp+arg_0]
.data:0091F540                 mov     [edi+0FFF0h], ebx
.data:0091F546                 xor     eax, eax
.data:0091F548                 xor     edx, edx
.data:0091F54A                 retf    4

.text:004031B5 vm_JZ           proc far
.text:004031B5
.text:004031B5 arg_0           = dword ptr  0Ch
.text:004031B5
.text:004031B5                 push    ebp
.text:004031B6                 mov     ebp, esp
.text:004031B8                 mov     eax, ds:vm_reg3
.text:004031BD                 mov     eax, [eax]
.text:004031BF                 test    eax, eax
.text:004031C1                 jz      short loc_4031C8
.text:004031C3                 mov     eax, [ebp+arg_0]
.text:004031C6                 leave
.text:004031C7                 retf
.text:004031C8 ; ---------------------------------------------------------------------------
.text:004031C8
.text:004031C8 loc_4031C8:
.text:004031C8                 xor     eax, eax
.text:004031CA                 leave
.text:004031CB                 retf
.text:004031CB vm_JZ           endp

With all handlers identified, it’s easy to write a script for IDA that disassembles the VM code. Let’s take a look at the output (scroll down to get the script).

0:	[0a]	vm_mov_reg1_mem 0xfffc
1:	[06]	vm_mov_hostReg1_reg1 0x0
2:	[12]	vm_and_reg1_param 0xf
3:	[19]	vm_shl_reg1_param 0x6
4:	[09]	vm_mov_mem_reg1 0x10
5:	[07]	vm_mov_reg1_hostReg1 0x0
6:	[06]	vm_mov_hostReg1_reg1 0x0
7:	[0a]	vm_mov_reg1_mem 0x14
8:	[0b]	vm_mov_reg2_mem 0x10
9:	[0f]	vm_cmp_reg1_reg2 0x0
11:	[09]	vm_mov_mem_reg1 0x14
12:	[04]	vm_JZ 11
13:	[07]	vm_mov_reg1_hostReg1 0x0
14:	[08]	vm_fancy_call 0x1
15:	[14]	vm_not_reg1 0x0
16:	[08]	vm_fancy_call 0x3
17:	[0b]	vm_mov_reg2_mem 0x14
18:	[10]	vm_xor_reg1_reg2 0x0
19:	[08]	vm_fancy_call 0x2
21:	[17]	vm_rol_reg1_param 0x7
22:	[08]	vm_fancy_call 0x0
23:	[03]	vm_JMP -18
24:	[07]	vm_mov_reg1_hostReg1 0x0
25:	[01]	vm_mix_handlers 0x0
##########
0:	[1b]	vm_mul_reg1_reg2 0x0
1:	[02]	vm_DIE 0x14
2:	[14]	vm_not_reg1 0xfffc
3:	[12]	vm_and_reg1_param 0x0
4:	[04]	vm_JZ 4
5:	[17]	vm_rol_reg1_param 0xf
6:	[0e]	vm_xchg_reg1_reg2 0x6
7:	[02]	vm_DIE 0x10
9:	[12]	vm_and_reg1_param 0x0
10:	[14]	vm_not_reg1 0x14
11:	[09]	vm_mov_mem_reg1 0x10
12:	[07]	vm_mov_reg1_hostReg1 0x0
13:	[08]	vm_fancy_call 0x1
14:	[02]	vm_DIE 0x14
15:	[11]	vm_and_reg1_reg2 0xb
17:	[06]	vm_mov_hostReg1_reg1 0x5
19:	[05]	vm_JNZ 0
20:	[06]	vm_mov_hostReg1_reg1 0x7
21:	[13]	vm_or_reg1_reg2 0x0
22:	[06]	vm_mov_hostReg1_reg1 0x6
23:	[0a]	vm_mov_reg1_mem 0x0
24:	[06]	vm_mov_hostReg1_reg1 0x4

Second column is the instruction’s opcode. Numeric constant after each instruction is its parameter. Some instructions don’t take parameters, in which case the constant is 0.

First part of the decompilation looks nice — there is a loop with some arithmetic mangling, the usual thing you’d expect from a serial checking procedure in a crackme.

The second part looks wrong. There are premature VM_DIE instructions (we expect only one at the very end of code), jumps with nonsensical displacements, like vm_JNZ 0 (jmp \$) and no-parameter instructions with parameters, like vm_NOT reg1, 0xfffc.

Answer to this mystery resides in implementation of vm_mix_handlers instruction. Try to take a look at it yourself and guess what it does, the address is 00403039. You won’t be able to debug it (or any other handler), due to some clever trickery, already described in detail by j00ru here.

Here’s a decompilation:

def mutate(handlers):
for j in range(2, len(handlers)):
k = prng()
k = (k % 0x1B)+2
t = handlers[j]
handlers[j] = handlers[k]
handlers[k] = t
return handlers

def prng():
global seed #=sn_part = trim(sn1, i)

v0 = ((10009 * seed + 31337) % 2**32) % 100000007
seed = 5 * seed + 1337;
seed = seed % 2**32
return v0

As we can see, handlers are mixed using a pseudo random number generator (PRNG), seeded with sn_part (sn_part = trim(sn1,i)) and that’s why we got garbage instead of clean code during decompilation, apparently seed wasn’t correct. The nice thing about this scheme is how the seed is computed. Each time the PRNG is used, there are only 16 possible disassemblies, since after each vm_mix_handlers, only one nibble from sn1 is copied to sn_part (scroll up for implementation of trim).

It’s clear what we need to do. We should generate all 16 possible disassemblies and recognize the correct one automatically. I used only two heuristics: presence of premature VM_DIE instructions and presence of no-parameter instructions with params, like OP reg1, reg2, param.

I won’t paste the disassembled code, as it’s rather repetitive — there are few loops with arithmetic operations on sn2, nothing extreme :). The important thing is, there are three possible values of sn1, resulting in correct VM code being executed: 0x39845c67, 0xc9845c67, 0xd9845c67.

With first part out of the way, let’s work on the second one. After porting the decompiled code to C, we notice that reversing parts of the algorithm may be tricky:

inline int  vm_fancy0(int a1)
{
signed int v2; // [sp+4h] [bp-Ch]@1
int v3; // [sp+8h] [bp-8h]@1
unsigned int i; // [sp+Ch] [bp-4h]@1

v2 = 63689;
v3 = 0;
for ( i = 0; i <= 0x1F; ++i )
{
v3 = v2 * v3 + i * a1;
v2 *= 378551;
}
return v3;
}
inline unsigned int round0(unsigned int sn2, unsigned char *mem){
unsigned int r1, r2, hr1,tmp1,tmp2;

tmp1 = tmp2 = 0;
r1 = sn2;
hr1 = r1;
r1 = r1 & 0xF;
r1 = r1 << 6;
tmp1 = r1;
r1 = hr1;
while(1){
hr1 = r1;
r1 = tmp2;
r2 = tmp1;
r1 += 1;
tmp2 = r1;
if(r1-1 == r2)
break;
r1 = hr1;
r1 = vm_fancy1(r1);
r1 = ~r1;
r1 = vm_fancy3(r1);
r2 = tmp2;
r1 = r1 ^ r2;
r1 = vm_fancy2(r1);
r1 = rol(r1, 7);
r1 = vm_fancy0(r1);
}
r1 = hr1;
// mix_handlers 0

// cleanup
tmp1 = 0;
tmp2 = 0;

return r1;
}

vm_fancy* functions are similiar — they are all hard to reverse and loop the same number of times (32). There are eight rounds (8 functions similiar to round0), but porting them all to C isn’t really necessary. Since all of them take the same dword as an input (sn2), we can bruteforce only first two. Chance that a particular value passes first two checks, but not all eight is negligibly small, assuming all rounds are indepented and random bijections.

Bruteforce isn’t out of reach. To check all 32-bit values, we need to perform about $2^{5}*2^{6}*2^{32} = 2^{43}$ operations ($2^6$ is the counter of round0 function). That’s a lot, but fortunately I had access to a 23-processor machine, so bruteforcing took less than a day :).

## Solution

First dword is one of: 0x39845c67, 0xc9845c67, 0xd9845c67.
Second dword is: 0xaed6334b.

Fun facts

1. There are few 32-bit values for sn2 that pass more than one, but less than eight of crackme’s checks. That seems surprising, because assuming that all rounds are random and independent bijections, probability of finding a value that passes two checks is equal to $1\over{2^{32}}$. Observing so many “collisions” suggests, that sizes of images of these hash-like functions are significantly smaller than their domains.

2. With some dedication it might be possible to put the crackme in a inifinite loop. Since bad sn1 results in garbage code being executed, it is probable that one of the  disassemblies include an undonditional vm_JMP instruction with negative parameter. With some luck it could hang the VM (assuming there would be no premature vm_DIE instructions before vm_JMP and no jumps over it). I talked with j00ru about this, and this problem was anticipated, but protecting against it was considered to be an overkill, after all, it’s just a crackme, and bruteforcing for such a magic looping dword seems pointless :).

You can get the decompiler script and the bruteforcer from my github account here.

Categories: Reverse Engineering
1. 24/06/2011 at 23:51

What can I say… great work! 🙂

2. 24/06/2011 at 23:58

Nice work! Another approach to the 2nd part is to try ro generate SMT equations for the vm_fancy* functions and round0() and then use a solver like Z3 to find input values that generate the correct solution.

• 25/06/2011 at 00:08

@Mark:
SAT solvers are not so smart when faced with such problems, they are much slower than an optimized bruteforcer actually. Try using Z3 to find a “collision” for CRC32.

But yeah, solvers are under-utilized in RE, I think they may have many interesting applications 🙂

3. 30/06/2011 at 21:39

nice 😀
what about remaining two from the same contest? have you tried?

• 30/06/2011 at 21:41

Not really. Did you?

4. 01/07/2011 at 10:42

no, honestly I am a very beginner in RE. i just downloaded the tasks and now i am trying to do them, by following footsteps. that’s why I am asking, because I am curious to see how someone who is better than me did it. i will like to see the solution to the crackme with dll making, so if you know where to find the solution (if anyone did it) i would be thankful

• 01/07/2011 at 12:50

Bartek (secnews.pl) is going to publish solutions (soon?).
Which crackme are you talking about? How did you try to solve it, where are you stuck?

5. 02/07/2011 at 17:30

Just for the record, Bartek has put the source code of all submissions to public: http://www.secnews.pl/2011/07/01/rozwiazania-konkursu-pimp-my-crackme/

1. 01/07/2011 at 23:58