Re: glib hash memory footprint

From: Davi de Castro Reis <davicastro terra com br>
To: Keith Packard <keithp keithp com>
Cc: gtk-devel-list gnome org, Sander Vesik <sander_traveling yahoo co uk>, "Alan M. Evans" <ame1 extratech com>
Subject: Re: glib hash memory footprint
Date: Tue, 31 Aug 2004 23:49:10 -0300

Hi all,

First of all, thank you for all the feedback. I think most of you can goon for a long time about the advantages and disadvantages of differenthash implementations. But probably everybody agrees that there is not asingle approach that is good for every case.

Let me discuss some of the options and give my opinion on how we couldget this done. There are several options:

1. Current linked list implementation: this is simple, readable andgives good performance, but has a huge memory penalty

2. Dynamic arrays: probably better than linked list, but I don't thinkit is worth it. First, you still have a memory penalty for the size ofthe arrays of for a array terminator. Furthermore, allocating memoryblocks at will might result in a very fragmented and slow to free memory.

3. Linear probing: this is what I want. It is simpler than doublehashing and could be easily changed for quadric probing. Memory overheadis very small, and if you can tune your hash function and expose theg_hash_resize() call to "close" the hash, you can get very good results.By "close" I mean doing mostly (exclusively) lookup operations.

4. Double hashing: another good option that someone might want to give atry.

Since there are too many options and too many tradeoffs/scenarios, mysuggestion would be to go with a polymorphic implementation using aunion. So you create your hash using g_hash_table_new (linked list),g_hash_table_linear_probing_new(), g_hash_double_hashing_new(),g_hash_array_new() and so on. Then you simply feed the returned pointersto the common methods (ex: g_hash_lookup()) and let the first byte ofthe union demultiplex the correct call.

What I want to implement is the linear probing approach, but I can leavethe hooks for those that want others approches. The advantage is thatrunning code will not be affected at all. Do you like this idea?


Thanks,
Davi de Castro Reis

ps. I think that the problem of reserving UNUSED/UNITIALIZED values forthe linear probing approach, letting NULL->NULL entries can easily besolved using and if for those reserved values.

Follow-Ups:
- Re: glib hash memory footprint
  - From: Sander Vesik
- Re: glib hash memory footprint
  - From: Owen Taylor

[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]