[Gc] Re: Understanding the performance of a `libgc'-based application
ludovic.courtes at laas.fr
Sun Dec 3 12:42:19 PST 2006
"Boehm, Hans" <hans.boehm at hp.com> writes:
>> > A partial solution is probably to use the thread-local allocation
>> > facility. For 6.8:
>> > 1. Make sure the collector is built with THREAD_LOCAL_ALLOC
>> > and make sure that GC_MALLOC and GC_MALLOC_ATOMIC (all caps) are
>> > used to allocate. (I'm afraid the collector you used for the profile is
>> > not built with THREAD_LOCAL_ALLOC. IIRC, that would cause the
>> > collector to switch to pthread locks, and would probably cause your
>> > test to run even slower.)
>> > 2. Define GC_REDIRECT_TO_LOCAL and then include gc_local_alloc.h
>> > before making any of the above calls.
>> I did so, but the resulting code always segfaults when trying
>> to access thread-specific storage (see below).
> I neglected to mention that you probably need to explicitly call
> GC_init() (in GC7, it should be GC_INIT()).
Indeed, this fixes the problem (so I end up calling `GC_INIT' then
`GC_init', which looks a bit weird ;-)).
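For reference, the setup discussed above can be sketched roughly as
follows. This is only an illustration, not the actual Guile code: the
`demo_*' names are hypothetical, the header paths and the need for
`GC_THREADS' are assumptions about a 6.8 install, and the malloc-based
stand-ins exist only so the sketch compiles without libgc (build with
`-DUSE_LIBGC' and link with `-lgc' to use the real collector).

```c
/* Sketch only.  With -DUSE_LIBGC this targets libgc 6.8 built with
   THREAD_LOCAL_ALLOC; otherwise hypothetical stand-ins are used.  */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifdef USE_LIBGC
# define GC_THREADS              /* assumption: exposes thread-aware decls */
# define GC_REDIRECT_TO_LOCAL    /* step 2: must precede the include below */
# include <gc.h>                 /* assumption: headers on the include path */
# include <gc_local_alloc.h>     /* redirects GC_MALLOC and GC_MALLOC_ATOMIC */
#else
/* Stand-ins, NOT libgc: same names, plain libc allocation underneath.  */
# define GC_INIT()            ((void) 0)
# define GC_init()            ((void) 0)
# define GC_MALLOC(n)         calloc (1, (n))
# define GC_MALLOC_ATOMIC(n)  malloc (n)
#endif

/* Hypothetical list cell: contains pointers, so it needs GC_MALLOC.  */
struct demo_pair
{
  char *name;
  struct demo_pair *next;
};

/* One-time setup, meant to be called from the main thread before any
   allocation.  The explicit GC_init () is what cured the segfault.  */
void
demo_gc_setup (void)
{
  GC_INIT ();
  GC_init ();
}

/* Copy a string into pointer-free memory (all-caps macro, as advised).  */
char *
demo_strdup (const char *s)
{
  size_t n = strlen (s) + 1;
  char *copy = (char *) GC_MALLOC_ATOMIC (n);
  assert (copy != NULL);
  memcpy (copy, s, n);
  return copy;
}

/* Allocate a cell that the collector must scan for pointers.  */
struct demo_pair *
demo_cons (const char *name, struct demo_pair *next)
{
  struct demo_pair *p = (struct demo_pair *) GC_MALLOC (sizeof *p);
  assert (p != NULL);
  p->name = demo_strdup (name);
  p->next = next;
  return p;
}
```

A caller would invoke `demo_gc_setup ()' once at startup and then build
data with `demo_cons ("a", NULL)' and the like. The point is that all
allocation goes through the all-caps macros, so that the redirection to
the thread-local allocator takes effect.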
Performance-wise, `THREAD_LOCAL_ALLOC' does yield a noticeable
improvement. This brings the libgc-based Guile close to Guile (or even
slightly faster) when running GCBench. The best results are obtained by
compiling libgc with `--enable-threads=posix' (thus
`THREAD_LOCAL_ALLOC'), _without_ `--enable-parallel-mark', and with
`USE_COMPILER_TLS=1' (although this one doesn't seem to make a big
difference).
I'll make further measurements on applications other than GCBench now.
Thanks for your help!