Leaking information using timing attacks on hash tables, part 2
This is the second part of Leaking information using timing attacks on hash tables, discussing implementation details of the leak. Read the first part for a high level overview.
Implementation details
Consider a hash table using double hashing [1]. Checking if a key belongs to a hash table T can be expressed as:
def lookup(T, k): h1 = hash1(k) h2 = hash2(k) idx = h1 while 1: if is_free(T[idx]): return False #structure holding (k,v) s = T[idx] if s.k == k: return True idx += h2 idx = idx % size(T)
Note this can’t loop if always returns something relatively prime to
and
is never 100% full (FF keeps both of these conditions true).
Firefox is nice enough to store certain kind of pointers and user supplied integers in the same hash table. All of the magic happens in this function:
file: js\src\jsscope.cpp function: Shape ** PropertyTable::search(jsid id, bool adding)
is an abbreviation of T.search(k, false).
In order to layout keys in as described in part 1:
we set properties of a javascript object. JS code:
obj[x] = 1;
translates to and if
is an integer, we have full control on where exactly in
it’s going to be stored. The hash functions used by FF are (all operations modulo
):
def h(x): return x*GOLDEN def hash1(x, shift): return h(x)>>shift def hash2(x, log2, shift): return (h(x)<<log2)>>shift
where is an odd constant (golden ratio),
,
=
.
In order to set , we need to find
s.t.
=
. Since
is relatively prime to
, it has a multiplicative inverse [2]
. Solving for
yields
. Now the
factor 😉 would be
if integers passed from JS were not mangled, but unfortunately they are — instead of
, the key being added to the table is
(this is related to how FF tags [3] pointers). To accomodate for this, set
and let
Now obj[x’]=1; will result in setting at position
, but that equals
The equality holds, because if
,
must be odd (since both
and
are odd).
Using these formulas, we fill (mixing python and js pseudocode):
obj = {} for i in range(N): x' = calc_x(i) obj[x'] = 1 #this results in T[i] being set
How to use this crafted table to leak information?
Invoking obj.hasOwnProperty(str) is equivalent to , so if we make enough JS objects (strings in this POC) we will eventually find one that takes considerably longer to check than others. Let’s call this object
. Recall that running time of
is proportional to the length of
‘s chain, so longer checking means longer chain.
Using the technique described in part 1, we can learn elements belonging to ‘s chain. Let’s express them as
(
,
).
Part 1 describes how to recover with high probability, by computing the
of a specific set. We will prove that the “exponentially low probability for failure” claim is correct and discuss a case where the described algorithm can fail.
Let and let
.
is a set of differences of
randomly chosen elements from
. Let
(note that
). What’s the probability that
?
An equivalent question is to ask about probability of for
. To see that, consider what happens if we divide all elements of
by
, or multiply all elements of
by
.
Nymann asked himself the same exact question in 1972 🙂 and proved [5], that the probability of random integers being relatively prime is
, where
is the Riemann’s zeta function [6]. I think it’s pretty cool that zeta made its way to a blog post about hashtables in Firefox :D. Yet another grave reason to solve the related millenium problem [8] ;).
It’s easy to see that exponenets work in our favor. Since , then
, so
and
. Probability that
approaches
exponentially fast in terms of
.
The above reasoning works when the sequence of elements of ‘s chain is monotonic, meaning that
was small enough to never “wrap”
around
. There’s a second case, which looks like this, when plotted on a graph:
Elements are scattered around both ends of the filled region. This happens when is big enough to “jump over” the free region at the end of the hashtable.
The in that case is most likely going to be equal to 1, but with
. There’s a fast way to deal with this problem.
Let , so that we can express differences of elements of
‘s chain as:
mod
.
.
mod
Notice that the range of is
. Low end with
,
and high end with
,
.
Knowing that, we can iterate over all possible values of , solve the equation
for
(using the linear congruence theorem [7]) and then compute:
mod
.
.
mod
If we found was correct, then
. Probability that this condition is true, but
is incorrect is (again) exponentially low.
Part 1 discusses how to find the chain’s starting point , knowing a single element of the chain and its period
.
Having and
, we reconstruct ptr(Mstr) exactly:
Defining info leaks
As we can see, leaking information is not the same thing as reading memory. My proposition is to define “leaking” as any procedure leading to diminishing the number of possible states the attacked program might be in. Let’s say a program has 2 bits of memory. Let’s assume we learned (by any means) that
xor
(
is the memory). This condition allows only
,
or
,
, so the number of possible states is 2, instead of 4. “Memory disclosure” definition fails to account for these “partial” leaks and side channel leaks, while the “entropy” one doesn’t.
References
1. http://en.wikipedia.org/wiki/Double_hashing
2. http://en.wikipedia.org/wiki/Modular_multiplicative_inverse
3. http://en.wikipedia.org/wiki/Tagged_pointer
4. http://en.wikipedia.org/wiki/Linear_regression
5. J. E. Nymann. “On the Probability that k Positive Integers Are Relatively Prime.” J. Number
Theory 4 (1972):469-73.
6. http://en.wikipedia.org/wiki/Riemann_zeta_function
7. http://en.wikipedia.org/wiki/Linear_congruence_theorem
8. http://en.wikipedia.org/wiki/Millennium_Prize_Problems#The_Riemann_hypothesis
This is really really cool. Leaking information about memory locations without *reading* memory values directly is mostly unexplored outside of cache timing side channel crypto literature (at least when I did my search it was). At Blackhat DC 2010, presented an info leak attack using the ordering of elements in a heterogeneous hashtable (mixed heap addresses and integers) exposed via a hashtable iterator. I also tried to get the academic community interested in this at WOOT 2010. I don’t think this is the last we’ll see of this kind of attack.
Anyway, VERY cool! Well done.
I remember your trick — I actually tried to use it against FF, but foreach() loops in JS use “time of addition” for ordering, instead of ptr values 🙂