Re: g_slice_



On Sat, 3 Dec 2005, Morten Welinder wrote:


I might be dense, but what exactly does g_slice_ buy us over
plain, old malloc?

* Don't say speed.  That's a bogus argument because you cannot
 possibly have tested all mallocs.

speed. simply because for some (glibc) tested mallocs, g_slice is
faster, and on many systems malloc is known to be a lot slower than
the glibc malloc.

* Don't say overhead for the same reason.

overhead. per page (4096 bytes on ia32), we're currently using 24
bytes for administrative purposes (8 of those are left for the
systems malloc). the rest of the page is divided into chunks of
the required size, and the reminder is used for cache coloring.
so taking only the administrative strcutures into account, you're
better off than malloc which needs an extra 4byte size tag for
each block, taking the reminder into account, you may end up worse
if comparing only to mallocs size tag. but then, glibc malloc
provides cache coloring as well and i don#t know the overhead of
that, though i'd expect it to be somewhat around the figures used
by g_slice.

What do we pay?


one thing you should take into account is the wastage and speed issues
the old mem-chunk code had. most importantly, g_slice is about replacing
memchunks as described in
  http://bugzilla.gnome.org/show_bug.cgi?id=118439
  (and the dups linked from there)

* We have probably killed tools like Purify that know about malloc
 specifically.

there is nothing worse in using g_slice over using memchunks here.

* We have more overhead now since both malloc and g_slice_ will
 keep their own free pools.

we have a lot less overhead now, because each memchunk was keeping
its own free pool. with gslice, we only have one global depot left.

* More code (now malloc and g_slice_), both likely to be kept
 hot by the cpu cache.

trade this against the old memchunk code, and you end up +-0.


I just don't see it.  Trying to outdo an unknown malloc is hubris.

If you had wanted something marginally fancy, you could have taken
goffice's pool allocator.  It comes with:

* Free leak detection support.  You can iterate over leaked pieces.
* You can free a whole pool without freeing the pieces (but then
 you cannot detect leaks, of course).
* O(1) ability to give back memory to the system when chunks become
 free.

if you really want to understand the issues at hand, you'll have to read
up the bugreports i pointed you at and take a look at what the old memchunk
code does.
and if you seriously want to argue one allocator over the other,
you should read the slab allocator papers first:
  [Bonwick94] Jeff Bonwick, The slab allocator: An object-caching kernel
  memory allocator. USENIX 1994, http://citeseer.ist.psu.edu/bonwick94slab.html
  [Bonwick01] Bonwick and Jonathan Adams, Magazines and vmem: Extending the
  slab allocator to many cpu's and arbitrary resources.

maybe that'll already answer some of the questions you have.

For better [speed] or worse, it isn't currently thread safe.  That
could be trivial to add, though.

making an allocator thread safe is definitely non-trivial, the second paper
[Bonwick01] has the details on that. just to mention a few issues involved:
- you don't want to pay lock+unlock per every single allocation
- if you use n CPUs (n>1), you want to be able to allocate n times the
  amount of memory of a single CPU per time interval, and not just be safe
  against concurrent allocations
- if you consider one allocator per thread, take into account that chunks
  may be allocated in one thread and freed in another
- you want to avoid false sharing, where different CPUs keep accessing
  different chunks which lie in the same cache line, this paper has
  more information on that:
  [Berger00] Emery D. Berger, Kathryn S. McKinley, Robert D. Blumofe,
  and Paul R. Wilson, Hoard: A Scalable Memory Allocator for Multithreaded
  Applications. November 2000, http://www.cs.umass.edu/~emery/hoard/


Morten


Unrelatedly,

why drop type-safety for the non-gcc case?

getting type safe arguments right in macros is non trivial, especially
across multiple compilers. i already had to adapt the initially comitted
logic simply due to differences in gcc versions.

Something along these
lines ought to work:

#define g_slice_free(type, mem) \
 do { \
   if (1) g_slice_free (type, mem); else (void)((type*)0 == mem); \
 } while (0);

thanks, i'll look into using that for g_slice.


In the gcc case you need to supply a non-macro definition of g_slice_free.
Otherewise the resulting program is not valid, regardless of that
"while (0)".

sorry, i don't understand you here. what do you mean by "not valid" and
what version of the gcc case are you referring to (it'd help if you pasted
the relevant lines)?

---
ciaoTJ



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]