Re: glib hash memory footprint



On Tue, 2004-08-31 at 12:05, Sander Vesik wrote:
>  --- "Alan M. Evans" <ame1 extratech com> wrote: 
> > 
> > Also, linear probing has a hideous worst-case. The nominal-case
> > collision can be mitigated somewhat by using instead quadratic probing
> > or double hashing to avoid clustering, but the worst-case remains very
> > bad.
> 
> The worst case for chaining is actually worse than for linear probing in 
> practice considering that you are doing pointer chasing instead of array 
> access with linear probing. 

True if a linked list is used for separate chaining, as it is
implemented now. But replacing the linked list with an array allows the
implementation to avoid all the "next" pointers. For a bonus, if the
array is kept sorted, then chained lookups can be binary searches. (Even
with the linked list, it is possible to reduce the cost of lookups if
the list is kept sorted.) This type of optimization is not possible with
linear probing.

> > 
> > If table size is really a problem (and I agree that it can be, as for
> > example, in your case), wouldn't switching from linked lists to dynamic
> > arrays be a better solution in the general case?
> > 
> 
> yes, that would be nice but also a non-trivial problem and can give you 
> memory fragmentation quite easily

I don't know about it being non-trivial. The array solution seems pretty
trivial to me. You may be right about fragmentation, but I'm not certain
that it's an insurmountable problem.

I suppose that this is just my personal bias, but I don't think that
closed hashing is a good general purpose solution, given the requirement
to allocate a table larger than your data. (In fact, the table should be
at least twice the size of the data, or performance with linear probing
really takes a hit.) I'm certainly not saying that a closed hash table
is never good, but for the general case, I just don't always know how
big my data set is before I start.

-- 
Alan M. Evans <ame1 extratech com>




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]