Re: g_str_hash



On 11 Feb 2000, Havoc Pennington wrote:

> Of course this function can just be:
> 
> {
>  const gchar* s = (char*)v;
>  guint result = 0;
>  while (*s)
>   {
>     result += (result << 3) + *s;
>     ++s;
>   }
>  return result;
> }
> 
> And it's even microscopically but measurably faster that way. Don't
> ask me what's up with the structure of the original version's loop.

the tcl hash function is not so good as a general purpose string hash
function in that it multiplies the hash value by 9 instead of an odd
number marginally bigger than 26, e.g. 33 like the perl one does.
i discussed this stuff to some extend with owen back in 1998, basically
he was arguing the perl function to be faster because it onlly did
a shift and two additions, while i was claiming that you get less lossage
with the ASU one (which is in glib) because the higher bits are constantly
fed back into the lower bits of the hash.

it of course depends on what kind of string you want to hash, for
strings where you mostly care about the tails, you're certainly willing
to trade some of the higher bits and thus allow hash collisions for
strings that only differ in the beginning characters.

for a general purpose string hasher, i'm not so sure, maybe we should
just leave the decision to the hash table user by providing:

guint
g_str_hash (gconstpointer string)
{
  register const gchar *p = string;
  register guint h = *p;

  for (p = string + 1; *p; p++)
    h += (h << 4) + *p;

  return h;
}

and for high-bit feedback:

guint
g_str_hash_long (gconstpointer string)
{
  register const gchar *p = string;
  register guint h = *p;

  for (p = string + 1; *p; p++)
    {
      register guint g;

      h = (h << 4) + *p;
      g = h & 0xf0000000;
      if (g)
        h ^= g >> 25;
    }

  return h;
}

which will give hash speed ups for all the places where
g_str_hash() is currently used, and hopefully also improve
hashtable performance, based on the assumption that for
most cases it'll be the string tail that matters.

> 
> Havoc
> 

---
ciaoTJ



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]