Proposal for a new ghash table

From: "Emmanuel DELOGET" <logout free fr>
To: <gtk-devel-list redhat com>
Subject: Proposal for a new ghash table
Date: Fri, 26 Nov 1999 12:54:12 +0100

Hi everibody.

I've just looked at the code of the current ghash

(i was wondering what was the underlying code

for the hash table resizing). Here is another

approach for a dynamic hash table, based upon

an article or Per-Ake (Paul) Larson (can't find the

reference of this article). I will code it in the

next days, and will post results here.

The Larson's algorithm does not introduce any

freeze/thaw statements. It increases the size

of the hash table in order to limit the size of the

hash node list. Basically, it starts with a 2^n

size. When the number of hash node is bigger

than its current size, the size of the hash table

grows to 2^(n+1). This happens on the insertion

of a new hash node. then it computes the new

hash key and insert the node at the right place.

The key problem is the hash table lookup.

Since the size of the table grows and the

since the nodes are not moved in the table

(well... you can move the nodes too, see later),

it is possible to not find the node on the very

first look up. Consider a table with a base size

of 4 that has gown to have have a size of 32.

The hash val of the first element you have added

was 123456. In a table of 4 nodes, the resulting

hash val is 0. In a table of 32 nodes, the resulting

hash val is 8. In order to find the correct key for

a particular node, you'll have to

1) compute the current hash value V from the

key K (store it as V1)

2) if the node is found, return it.

2) if not, find the greater n value defined by

2^n < V

4) compute the new K value

5) go to step 2

Before returning, a good thing should be to move

the node to position V1 if V != V1. The next time

you'll want to access to the node, the look up

will be faster.

Of course, an extra thaw function can be provided

to reorganize the entire hash table for faster

look up when all the elements have been added

to the table...

The table is not collapsed when the number of

node decreases.

[the following assumes that the base size is 1 -

this is only assumed to show the algorithmes]

In order to minimize the allocations, the table is

not entirely reallocated when its size changes.

you simply use an array of 32 tables (looks like)

g_hash_node **hash_tables[32];

The insertion code looks like (pseudo-code):

node = create node(key, value);

if (nbnodes > current size) {

current_array++;

hash_tables[current_array] =

allocate new table(current_size);

current_size *= 2;

}

hash_val = compute_val(node, key) % current_size;

cur_size = current_size / 2;

cur_array = current_array;

while (cur_size > hash_val) {

cur_array--;

cur_size /= 2;

}

add_node_to_array(node,

hash_tables[cur_array],

hash_val - cur_size);

The look up code looks like :

node = NULL;

cur_loop = 0;

cur_size = curent_size;

V = V1 = compute_val(node, key) % current_size;

while (cur_loop <= current_array) {

if (node is found at pos V) break;

n = (int)log2(V);

cur_size = 2^n;

V = compute_val(node, key) % cur_size;

cur_loop++;

}

if (node) {

if (V1 != V) move_node_from_to(node, V, V1);

}

return (node);

To be found at position V in the whole table, a

node should be at pos (V - cur_size) in table

hash_tables[log2(cur_size)].

Hope this is of interest.

Yours,

Emmanuel

Follow-Ups:
- Re: Proposal for a new ghash table
  - From: Emmanuel DELOGET
- Re: Proposal for a new ghash table
  - From: Kaz Kylheku

[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]