Re: g_str_hash
- From: Owen Taylor <otaylor redhat com>
- To: gtk-devel-list redhat com
- Cc: kenelson sequoia ece ucdavis edu
- Subject: Re: g_str_hash
- Date: 11 Feb 2000 15:05:08 -0500
Karl Nelson <kenelson@ece.ucdavis.edu> writes:
> >
> > Hi,
> >
> > g_str_hash is really slow. The Tcl string hash function is twice as
> > fast, literally. Here's a test program, and it also shows you the Tcl
> > function. Any objections to the Tcl version?
>
> Instead of switching how about just optimizing the one we have a
> bit?
>
> guint
> copy_of_g_str_hash (gconstpointer v)
> {
> const char *s = (char*)v;
> const char *p;
> guint h=0, g;
>
> for(p = s; *p != '\0'; p += 1) {
> h = ( h << 4 ) + *p;
> if ( ( g = h & 0xf0000000 ) ) {
> h = h ^ (g >> 24);
> h = h ^ g;
> }
> }
>
> return h /* % M */;
> }
>
> =>
>
> guint
> copy_of_g_str_hash (gconstpointer v)
> {
> const char *s = (char*)v;
> const char *p;
> guint32 h=0;
>
> for(p = s; *p != '\0' && !(h&0xf0000000); p += 1) {
> h = ( h << 4 ) + *p;
> }
> for(; *p != '\0'; p += 1) {
> h = h ^ ((h >> 24)&0xf0);
> h = ( h << 4 ) + *p;
> }
>
> return h /* % M */;
> }
>
> This is the exact same hash except operations have been moved to
> enhance speed. The difference being the new one keeps the upper bits
> and thus has less collisions and operates somewhere between the TCL and
> GLib on.
I suspect the appeal of the ASU one is solely that it is nice and
easily analyzable. I won't believe that it creates less collisions
until somebody shows me a set of (plausible) strings where that is the case.
> Also I suggest the << 4 go to << 3 so that you don't create the case
> annomally where Aaaaaaaaaaaaa and aaaaaaAaaaaaa are equivent. (The
> shift should be prime to the representation system.
>
> -O0 -O3
> Original system: 0.54 0.31
> Proposed system: 0.48 0.22
> TCL system: 0.31 0.12
>
> Is that enough improvement?
Well, if the Tcl (or the Perl one with the << 5) are as good, why
waste the time? And your optimization doesn't help for long strings,
which take the most time.
Regards,
Owen
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]