[Gc] RE: how to prevent mark stack overflows in GC_push_other_roots
hans.boehm at hp.com
Mon Sep 12 14:52:44 PDT 2005
The logic behind this is:
0) We have to be able to recover from running out of memory/swap space
during marking. If a process does run out of memory allocating small
objects, this is often where the problem shows up. We end up allocating
all available memory, and then forcing a collection. The collection has
to deal with a larger heap than it has seen before, and thus often needs
a larger mark stack. But it can't be grown, because we just allocated
all of the heap. (There is some argument that a Unix process can't
reliably recover from running out of memory anyway, and I think in
practice that's true. But based on experience, there seems to be a
strong preference for 98% reliability over, say, 60% reliability in
dealing with the situation.)
1) There are two methods for dealing with roots:
Eager: Scan the root section immediately, mark any objects
referenced, and push the contents of the objects onto the mark stack if
they haven't been previously marked. It is hard to bound the number of
entries we push onto the mark stack this way. But an overflow is
fundamentally OK, since it can only occur after we have marked objects,
thus ensuring forward progress.
Lazy: Push the entire root section with GC_push_all, without
marking anything. Overflowing the mark stack while doing this many
times is bad. If we have to drop these and start over due to an
overflow, we've made no progress.
2) We first perform a bounded number of lazy pushes, and then do an
unbounded number of eager pushes. The lazy pushes can't possibly
overflow the mark stack. It is OK if the eager ones do, since we can
handle that. If GC_push_all ran into a mark stack overflow, it aborts,
since something went seriously wrong, and we might otherwise end up in
an infinite mark loop.
In fact, we now do a bounded number of eager pushes early, as a result
of eagerly marking part of the GCs own stack. But I think that's still
OK. (Looking at the code, I think there is at least a theoretical issue
with the way the IA64 register backing store is handled. I'll have to
I gather that you can have an unbounded number of regions, or at least
contiguous region sections? Or you call GC_push_all_stack on each
object? The crashes occur during the GC_push_other_roots call itself?
You might try calling GC_push_all_eager() instead, and making sure that
this occurs after the stack contents are pushed. Especially if you have
a large number of pointers into the garbage-collected heap, or if these
regions are large relative to the garbage collected heap and you are
using the parallel collector, this may turn into a performance problem.
But at least it should be correct.
> -----Original Message-----
> From: Dan Bonachea [mailto:bonachea at cs.berkeley.edu]
> Sent: Sunday, September 11, 2005 3:02 AM
> To: Boehm, Hans; gc at napali.hpl.hp.com
> Cc: Imran Haque
> Subject: how to prevent mark stack overflows in GC_push_other_roots ?
> Hi Hans - we're encountering crashes with "unexpected mark
> stack overflow"
> errors in some Titanium programs (we're currently using the
> 6.5 collector),
> and I'm trying to figure out the right way to solve them.
> The basic design issue is Titanium includes an alternate
> memory allocation
> system which provides region-based explicit memory
> management. We acquire the
> space for our regions as raw pages from GC_unix_get_mem(),
> but otherwise the
> collector doesn't know anything about the contents or
> locations of our
> regions. Objects in our regions are permitted to reference
> objects in the GC
> heap, so for correctness the collector needs to scan all
> non-atomic objects in
> our regions during a collection.
> We currently accomplish this by adding the contents of our
> regions to the root
> set using the GC_push_other_roots callback, in which we push
> the entire
> contents of all our memory regions containing non-atomic
> objects using
> GC_push_all_stack. The problem is in some applications our
> regions can grow to
> be quite large (think GBs and thousands of objects), and
> GC_push_all_stack is
> not robust against mark stack overflow - so we sometimes get
> 'unexpected mark
> stack overflow' crashes in the GC_push_all_stack call.
> Is there a safe way to push large amounts of data from within
> GC_push_other_roots? I've combed the docs and sources and
> haven't found much
> other than this comment in mark_rts.c:
> if (GC_push_other_roots != 0) (*GC_push_other_roots)();
> /* In the threads case, this also pushes thread stacks. */
> /* Note that without interior pointer recognition lots */
> /* of stuff may have been pushed already, and this */
> /* should be careful about mark stack overflows. */
> but it doesn't provide any hint about how the callback should
> actually go
> about being "careful about mark stack overflows".
> Is there some analog to GC_push_all_stack I'm missing which
> can correctly
> handle mark stack overflow? Is there a better way to
> accomplish what I need? I considered trying to predict the
> possibility of overflow before calling
> GC_push_all_stack and calling GC_signal_mark_stack_overflow,
> but any heuristic
> that guesses wrong even once will crash the program.
> Similarly, tweaking
> INITIAL_MARK_STACK_SIZE to a larger value only postpones the
> crash until the
> first time we hit that size.
> The situation seems analogous to having a large amount of
> GC_malloc_uncollectable objects, but there are no mark bits
> or other block
> headers available on that uncollectable space so the same
> code is not directly
> applicable. Perhaps I should somehow plugin a callback in
> GC_push_next_marked_uncollectable and advance my own pointer
> through the
> regions? (although I'd presumably also need callbacks to
> reset the pointer on
> an overflow or next GC).
> Any hints or ideas are appreciated...
> PS- Is there a good reason why the GC_push_* functions cannot simply
> automatically grow and copy the mark stack on an overflow to
> handle overflows
> transparently? It seems like the collector currently goes to
> trouble to spread that overflow checking and growing logic
> all over the mark
> phase code, which seems a less robust design (so hopefully
> there's a good
> motivation for it?).
More information about the Gc