Re: garbage collection experiment



Hi,

Observations after further testing and discussion with Owen:

 - seems to work great with gtk-demo, even if I add a "malloc bomb"
   idle function that uses 100% CPU while constantly leaking memory;
   memory usage of the process doesn't go up more than in the non-GC
   case.

 - --enable-gc-friendly may help a lot with GC. If people were going
   to use a thing like libggc as an application add-on with any
   frequency, GLib should probably default to gc-friendly, since apps
   can't depend on specific glib build options. --disable-mem-pools
   can also be a good idea in a GC environment, but is potentially
   more costly in the non-GC environment.

 - At the moment, refcounted objects don't get collected
   until refcount == 0, because someone still has a pointer to the
   object in order to call unref(). If you actually want to take
   advantage of GC, of course you wouldn't keep that pointer,
   so stuff would get collected with refcount > 0. In essence this
   means that finalizers don't get called, so whenever a finalizer
   does more than call free()/unref() on sub-objects, things will
   break.

   Anyhow, doing more than free/unref in a finalizer is conceptually
   broken even now; with a collector it will most likely be broken
   in practice as well, since finalizers will just get skipped.

   Boehm has hooks to register a finalizer for a block, but these
   can't really be used for GObject. One issue is that the collector
   can't collect cycles involving objects that have finalizers:
   normally if A points to B, it finalizes A then B, but if A and B
   are in a cycle, there's no safe finalization order. Another issue
   is that finalizers slow stuff down and add overhead. Finally, most
   objects won't expect to be inside g_malloc() when finalized, which
   is where the collector will do the finalization.

   It might be possible to have a Boehm-registered finalizer that
   simply adds an object to a list of objects, then you call
   GObject-level finalizers in some kind of idle handler, which keeps
   finalizers from being called from inside a g_malloc() and avoids
   some reentrancy problems; but, there are other issues that arise
   with that.

   Basically, to make GC useful with refcounted objects, finalizers
   should only free/unref, and things should still work if the
   finalizer never gets called. That is the simplest solution all
   around. It does imply that a GObject can't hold a pointer to
   memory not from g_malloc().

 - Boehm GC has a malloc_block_that_doesnt_contain_pointers()
   function (GC_malloc_atomic()); the collector can then avoid
   scanning that block for references to other blocks. It may be
   worth modifying GMemVTable to support this, and using it in
   strategic places (string functions such as strdup, and pixbuf
   buffers, are two easy ones that would give you a performance
   boost).

 - That said, there are no noticeable performance problems on my
   machine when running gtk-demo with the malloc()-bomb idle function
   eating 100% CPU and continuously allocating stuff. The GUI doesn't
   hang for GC (even though the GC is not incremental). I do have
   a Duron 850, but minus the idle function I'm sure an average app
   would have no problems even on lame machines.

 - I can't get the incremental GC feature of the Boehm collector to
   make any difference; it always does full collections anyway for
   some reason. I guess it's good the full collector is fast enough,
   since the incremental collector has some weird issues (it has to
   be able to detect the dirty bit on pages, which is done by various
   platform-specific hacks that don't always work).

 - There are certainly portability issues with the Boehm collector. In
   particular, right now it just punts on shared libraries and only
   builds a static lib, and I'm using an unportable hack to get it
   into a shared lib. The gcj team has it autotoolized and Tom says
   Boehm may merge that upstream sometime. Of course the collector
   implementation itself is a huge nest of platform-dependent hacks,
   unavoidably, but it's ported to most platforms.
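
The pointer-free allocation idea from the list above could look
roughly like the following. This is only a sketch, not code from the
experiment: SketchMemVTable, its malloc_atomic member, sketch_strdup()
and Widget are hypothetical names, and plain malloc() stands in for
the collector so the example has no dependencies. With the Boehm
collector the two allocators would presumably wrap GC_malloc() and
GC_malloc_atomic().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical vtable, loosely modelled on GMemVTable. The
 * malloc_atomic member is the proposed addition; it is an assumption
 * of this sketch and does not exist in GLib. */
typedef struct {
    void *(*malloc)(size_t n_bytes);        /* block may hold pointers */
    void *(*malloc_atomic)(size_t n_bytes); /* block never holds pointers */
    void  (*free)(void *mem);
} SketchMemVTable;

/* Counters so a caller can see which path each allocation took. */
static unsigned long n_scanned_allocs = 0;
static unsigned long n_atomic_allocs = 0;

/* Stand-ins for collector entry points; here they just call malloc(). */
static void *scanned_alloc(size_t n_bytes)
{
    n_scanned_allocs++;
    return malloc(n_bytes);
}

static void *atomic_alloc(size_t n_bytes)
{
    n_atomic_allocs++;
    return malloc(n_bytes);
}

static SketchMemVTable sketch_vtable = { scanned_alloc, atomic_alloc, free };

/* strdup-like helper: character data never contains pointers, so the
 * copy is requested through the pointer-free allocator. */
static char *sketch_strdup(const char *str)
{
    size_t len;
    char *copy;

    assert(str != NULL);
    len = strlen(str) + 1;
    copy = sketch_vtable.malloc_atomic(len);
    if (copy != NULL)
        memcpy(copy, str, len);
    return copy;
}

/* Hypothetical object that does hold a pointer, so the struct itself
 * must come from the scanned allocator. */
typedef struct {
    char *name;
} Widget;

static Widget *widget_new(const char *name)
{
    Widget *w = sketch_vtable.malloc(sizeof *w);

    if (w != NULL)
        w->name = sketch_strdup(name);
    return w;
}
```

Creating one Widget here makes one scanned allocation (the struct)
and one pointer-free allocation (the name), so a collector would only
need to scan the struct.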

The only real showstopper I see for using this in real apps is the
issue of finalizers. You could use it for real apps now if you do
manual refcounting on refcounted objects, and only use the GC for
strings and such.
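
For reference, the "queue it, finalize later" idea mentioned in the
finalizer discussion could be sketched as below. This is an
illustration under assumptions, not a tested design: SketchObject and
all function names are hypothetical, the collector is not actually
linked in, and the cycle-ordering problem is not addressed. The
callback has the (void *obj, void *client_data) shape that Boehm
finalization callbacks use, and the queue link lives inside the
object so that enqueueing never allocates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object. next_pending is an intrusive link,
 * so putting the object on the queue needs no allocation. */
typedef struct SketchObject SketchObject;
struct SketchObject {
    int ref_count;
    void (*finalize)(SketchObject *self); /* object-level finalizer */
    SketchObject *next_pending;
    int finalized;
};

static SketchObject *pending_head = NULL;
static SketchObject *pending_tail = NULL;

/* Callback the collector would invoke. It may run inside the
 * allocator, so it only links the object onto the pending queue. */
static void collector_finalizer(void *obj, void *client_data)
{
    SketchObject *object = obj;

    (void) client_data;
    assert(object != NULL);

    object->next_pending = NULL;
    if (pending_tail != NULL)
        pending_tail->next_pending = object;
    else
        pending_head = object;
    pending_tail = object;
}

/* Meant to be called from the main loop, e.g. from an idle handler.
 * Each object is unlinked before its finalizer runs, so a finalizer
 * that causes more objects to be queued is safe.
 * Returns the number of objects processed. */
static int run_pending_finalizers(void)
{
    int n_run = 0;

    while (pending_head != NULL) {
        SketchObject *object = pending_head;

        pending_head = object->next_pending;
        if (pending_head == NULL)
            pending_tail = NULL;
        object->next_pending = NULL;

        if (object->finalize != NULL)
            object->finalize(object);
        n_run++;
    }
    return n_run;
}

/* Example object-level finalizer: just records that it ran. */
static void mark_finalized(SketchObject *self)
{
    self->finalized = 1;
}
```

Nothing is finalized at the moment the collector calls back; the
object-level finalizers only run when run_pending_finalizers() is
called from ordinary application code, outside the allocator.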

Havoc





