[Gc] Understanding the performance of a `libgc'-based application
hans.boehm at hp.com
Mon Nov 27 12:29:32 PST 2006
Can you generate an execution profile of the libgc-based Guile, perhaps running GCBench?
It sounds to me like you're basically doing things correctly. But the timing results are not what I would expect.
If you run GCBench with the standard Guile, presumably there are essentially no malloc calls involved, since essentially all the objects are small?
I think the normal GCBench parameters are large enough that 5000 disappearing links shouldn't make a huge difference. It should be apparent from a profile whether I'm right. They are relatively slow, but the time per disappearing link and collection is still constant.
Are you using a pre-built GC package, or did you build it? If you are using a standard thread-safe build, without thread-local allocation, the GC allocator will end up spending a lot of its time acquiring and releasing the allocation lock, especially on a Pentium4-based machine. If you are comparing it to a purely single-threaded Guile collector, that could easily account for the difference. (GC7 may largely fix that, if that's indeed what you're seeing.)
> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com
> [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Ludovic Courtès
> Sent: Monday, November 27, 2006 1:26 AM
> To: gc at napali.hpl.hp.com
> Subject: [Gc] Understanding the performance of a
> `libgc'-based application
> I started porting the Guile interpreter to `libgc' some time
> ago. The port is now complete feature-wise, but it is almost
> twice as slow as the original Guile (e.g., when running
> `GCBench.scheme'). I investigated various tweaks without
> much success so far. So I'm looking for feedback to see if
> I'm missing some obvious optimization opportunity, or at
> least to understand those performance characteristics.
> In Guile, all objects are implemented as a cell (two words)
> or double cell (four words) which may contain pointers to
> additional data (e.g., string buffers, bignums, etc.). With
> the current GC, cells are allocated from the "cell heap"
> while other data (string buffers, etc.) are allocated using
> libc's `malloc ()'. Guile's GC uses a mark and sweep
> algorithm. It scans the stack and the cell heap for live
> pointers. During the mark phase, the GC also invokes the
> various type-specific mark procedures. When an unmarked cell
> is encountered during the sweep phase, it marks it as free
> and calls `free ()' or a type-specific finalizer on its
> associated data, if need be.
> In the `libgc'-based Guile, `malloc ()' is no longer used.
> Instead, all data is allocated using either `GC_MALLOC_ATOMIC
> ()' or `GC_MALLOC ()'
> (the former is used for string buffers and file contents).
> Thus, Guile code no longer needs to provide custom mark/free
> Furthermore, "all interior pointers" is turned off and the
> few valid displacements are registered.
> Guile uses weak hash tables extensively and those are
> implemented as disappearing links in the `libgc'-based
> Guile[*]. Running Guile up to the prompt with
> `GC_DUMP_REGULARLY' shows that it uses around 5000
> disappearing links. Is this a likely source of performance loss?
> I would appreciate comments and suggestions.
> [*] Speaking of this, I support Laurynas Biveinis's proposal to allow
> for values other than NULL (see
> Gc mailing list
> Gc at linux.hpl.hp.com
More information about the Gc