[Gc] Generational collection performance disaster
hans.boehm at hp.com
Wed Apr 6 11:02:17 PDT 2005
I don't have a real answer, but I do have some relevant observations:
- Incremental/generational collection is expected to lose performance
in some cases. (Generational collection loses performance sometimes
even for copying collectors, but that's much less common.) But I
wouldn't have expected this to be an issue for
a non-lazy declarative language. It might be worth checking where
the protection faults are being generated, i.e. which updates are
- It's also probably worthwhile to look at some GC logs both ways.
It may be that generational GC has the unintended consequence of
running in a smaller heap. The sizing heuristics are a bit different,
but are intended to give similar results.
- The current incremental GC depends on signal delivery performance,
among other things. I assume this was on Linux? Its signal delivery
performance is normally pretty good.
- Stubborn allocation support was changed completely in 7.0. This is
implemented a bit more completely in my 7.0 tree than it is in
7.0alpha1, but I'm sure it needs work. Instead of stubborn allocation,
there is support for explicit write barrier calls. Objects directly
reachable from the stack are viewed as dirty, so you shouldn't need
those calls for a functional language. If someone wants to make this
work for real, I can make available a newer "experimental" version.
- Multiple generations were tried a long time ago. At the time, the
added overhead of keeping track of additional per-object bits seemed
to outweigh the benefit of avoiding premature promotions. Given that
it increasingly makes sense to reserve a whole byte for each mark bit
for concurrency reasons, there may be more room for clever
implementations in this area now.
> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com
> [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Ralph Becket
> Sent: Wednesday, April 06, 2005 1:01 AM
> To: Boehm GC List
> Subject: [Gc] Generational collection performance disaster
> I'm one of the developers of the Mercury programming
> language. Right from the beginning we've used the
> Boehm-Demers-Weiser garbage collector (hereafter referred as
> boehm_gc) in the code generated by the Mercury compiler and
> it's given us sterling service.
> Recently we tried using the incremental/generational
> collection mode and observed a *68% slow down* on a benchmark
> compiling a few large Mercury modules. We changed the
> Mercury compiler to generate calls to an altered version of
> GC_malloc_stubborn() that remembers the last stubborn heap
> cell allocated and calls GC_end_stubborn_change() on that
> before performing the new allocation. We did this to halve
> the number of GC related function calls that would otherwise
> be necessary in the code generated by the Mercury compiler.
> (Just to make everything clear, these generated programs do
> call GC_enable_incremental() at start-up and run in a single thread.)
> So I have a few questions that I'd like to pose to the
> boehm_gc experts.
> (1) What happened here with the performance?
> (2) Mercury is a purely declarative language and as such
> Mercury programs tend to allocate large numbers of small heap
> cells (I'd expect the vast majority of them to be just two
> words in size) that, once initialised, are never changed in
> any way (there are a few, rare exceptions to this rule, e.g.,
> for arrays). Is there any way boehm_gc can/could be made to
> take advantage of this characteristic?
> (3) It seems to me that boehm_gc could fairly easily be made
> to use multiple generations (rather than just the two that
> incremental/ generational collection currently provides). I
> can post the outline of the idea here if there's interest,
> but I'd like to see responses to (1) and (2) before
> potentially embarrassing myself :-)
> -- Ralph
> Gc mailing list
> Gc at linux.hpl.hp.com
More information about the Gc