Bugfix: ferite hamt did not handle hash collision #5

Open
cention-nazri wants to merge 9 commits into darkrock:master from cention-nazri:bugfix-ferite-hamt-collision

@cention-nazri

The bug is reproducible as follows:

$ cat collision.fe
namespace Collision {
    function newsprint() {}
    function swig() {}
}

$ cache_ferite collision.fe   # CPU will be pegged at 100% while RAM usage increases to OOM

The hamt code tries to cache the two functions using the following keys (note the colliding uint32 hashes):

key                    uint64 hash             uint32 hash
Collision.newsprint_   5862143705074859073     189234241
Collision.swig_        17934375439221357633    189234241
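
The uint32 column is consistent with keeping only the low 32 bits of each uint64 hash. A minimal check in C, assuming plain truncation (the description does not say how the 32-bit hash is derived, and `low32` is a hypothetical helper, not a ferite function):

```c
#include <stdint.h>

/* Hypothetical helper: keep the low 32 bits of a 64-bit hash.
 * Assumption: the uint32 hash is a plain truncation of the uint64 one. */
static uint32_t low32(uint64_t hash)
{
    return (uint32_t)(hash & 0xFFFFFFFFu);
}
```

Under that assumption, `low32(5862143705074859073ULL)` and `low32(17934375439221357633ULL)` both evaluate to 189234241, matching the table.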

Nazri Ramliy added 9 commits August 13, 2014 10:45
The goal is to be able to do something like this:

	$ make check

If something breaks:

	$ prove test

If the need arises, we should choose a more mature TAP producer [1].

[1] http://en.wikipedia.org/wiki/Test_Anything_Protocol#List_of_TAP_producers

Buggy behavior is shown by ferite_amt-hash-index-from-hash.c

The buggy code was:

    return (((index << (shiftAmount - AMT_SHIFT_AMOUNT)) >> 27) & 0x1F);

where shiftAmount is unsigned int and AMT_SHIFT_AMOUNT is 5.

The function was called successively with shiftAmount set to 32, 27, 22,
17, 12, 7, 2 (decreasing by AMT_SHIFT_AMOUNT each time).

When shiftAmount is less than AMT_SHIFT_AMOUNT,

    shiftAmount - AMT_SHIFT_AMOUNT

becomes:

    (unsigned int)(2) - 5

which, in unsigned arithmetic, wraps around to a very large
unsigned integer, hence the buggy behavior.

Note that this bug has no effect on the behavior of the AMT as the value
is expected to be any value within 0-31 - the only visible effect is in
the actual placement of the hash items in memory.
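
The wraparound can be reproduced in isolation. The sketch below is not the code from this pull request: `old_shift_count`, `old_index` and `index_sketch` are hypothetical names, and the hash is assumed to be a 32-bit unsigned value. It shows the wrapped shift count, the original expression (well defined only while shiftAmount is at least AMT_SHIFT_AMOUNT), and one possible way to select the same bits without the subtraction.

```c
#include <limits.h>
#include <stdint.h>

#define AMT_SHIFT_AMOUNT 5u

/* The shift count computed by the old expression. For shiftAmount = 2
 * the unsigned subtraction wraps around to UINT_MAX - 2. */
static unsigned int old_shift_count(unsigned int shiftAmount)
{
    return shiftAmount - AMT_SHIFT_AMOUNT;
}

/* The original expression. Only meaningful for 5 <= shiftAmount <= 32;
 * outside that range the shift count is out of range (undefined in C). */
static unsigned int old_index(uint32_t hash, unsigned int shiftAmount)
{
    return ((uint32_t)(hash << (shiftAmount - AMT_SHIFT_AMOUNT)) >> 27) & 0x1Fu;
}

/* Hypothetical alternative (an assumption, not the PR's fix): select the
 * 5 bits starting at bit (32 - shiftAmount). When fewer than 5 hash bits
 * remain, only those remaining top bits contribute. */
static unsigned int index_sketch(uint32_t hash, unsigned int shiftAmount)
{
    if (shiftAmount == 0 || shiftAmount > 32)
        return 0; /* no hash bits left, or invalid input */
    return (hash >> (32u - shiftAmount)) & 0x1Fu;
}
```

For shiftAmount values of 5 and above, `index_sketch` selects the same bits as `old_index`; for shiftAmount = 2 it returns the top two bits of the hash instead of relying on a wrapped shift count.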

This function is for when the hash bits are exhausted, i.e. when the
hashes of two different keys collide.
Upon hash bit exhaustion we use the bits from the key.

While this is not an optimal source of bits, because the non-random nature
of the ASCII characters in the keys results in non-optimal use of
RAM, it should be good enough for use here under the assumption that
hash collisions are rare.
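
As an illustration of drawing index bits from the key, here is a minimal sketch. The commit message does not say how the bits are extracted, so the chunking scheme (5-bit chunks, least significant bit first within each byte, zero past the end of the key) and the name `key_bits_index` are assumptions made for this example only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the chunk-th 5-bit group of the key's bytes.
 * Bits are read least significant first within each byte; bits past the
 * end of the key read as 0. */
static unsigned int key_bits_index(const char *key, size_t chunk)
{
    size_t len = strlen(key);
    size_t bit = chunk * 5;
    unsigned int index = 0;
    unsigned int i;

    for (i = 0; i < 5; i++, bit++) {
        size_t byte = bit / 8;
        if (byte >= len)
            break; /* past the end of the key: remaining bits stay 0 */
        index |= (((unsigned char)key[byte] >> (bit % 8)) & 1u) << i;
    }
    return index;
}
```

With this scheme the two colliding keys share every chunk covering the common prefix "Collision." and first differ at chunk 16, where `Collision.newsprint_` yields 14 and `Collision.swig_` yields 19, so they end up in different slots.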