[Gc] Generational collection performance disaster
rafe at cs.mu.OZ.AU
Wed Apr 6 01:01:08 PDT 2005
I'm one of the developers of the Mercury programming language. Right
from the beginning we've used the Boehm-Demers-Weiser garbage collector
(hereafter referred as boehm_gc) in the code generated by the Mercury
compiler and it's given us sterling service.
Recently we tried using the incremental/generational collection mode and
observed a *68% slow down* on a benchmark compiling a few large Mercury
modules. We changed the Mercury compiler to generate calls to an
altered version of GC_malloc_stubborn() that remembers the last stubborn
heap cell allocated and calls GC_end_stubborn_change() on that before
performing the new allocation. We did this to halve the number of GC
related function calls that would otherwise be necessary in the code
generated by the Mercury compiler. (Just to make everything clear,
these generated programs do call GC_enable_incremental() at start-up
and run in a single thread.)
So I have a few questions that I'd like to pose to the boehm_gc experts.
(1) What happened here with the performance?
(2) Mercury is a purely declarative language and as such Mercury
programs tend to allocate large numbers of small heap cells (I'd expect
the vast majority of them to be just two words in size) that, once
initialised, are never changed in any way (there are a few, rare
exceptions to this rule, e.g., for arrays). Is there any way boehm_gc
can/could be made to take advantage of this characteristic?
(3) It seems to me that boehm_gc could fairly easily be made to use
multiple generations (rather than just the two that incremental/
generational collection currently provides). I can post the outline of
the idea here if there's interest, but I'd like to see responses to (1)
and (2) before potentially embarrassing myself :-)
More information about the Gc