Re: [Gc]: Patch resubmission: Default stop_func
ivmai at mail.ru
Tue Sep 22 00:07:16 PDT 2009
"Boehm, Hans" <hans.boehm at hp.com> wrote:
> Ivan -
> I just got a chance to look at this, and I don't quite understand the logic.
The logic in brief is described in gc.h for GC_set_stop_func:
/* Set and get the default stop_func. The default stop_func is used by */
/* GC_gcollect() and by implicitly trigged collections (except for the */
/* case when handling out of memory). Must not be 0. */
That is, whenever we collect (for any reason other than running out of memory), a user-supplied stop_func is used instead of never_stop_func. Initially, this user-supplied (default) stop_func is never_stop_func, so the logic remains exactly as before unless the client calls GC_set_stop_func.
The purpose of a stop_func itself is described in the gc.h comment for GC_try_to_collect(). (It might also be good to note there that stop_func is called with the lock held and that it must not manipulate pointers to the GC heap.)
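A minimal sketch of a non-trivial default stop_func, for illustration only. The deadline variable and the helper names are hypothetical, clock() is just one possible time source, and the gc.h parts are compiled only when USE_BDWGC (a macro made up for this sketch) is defined and the program is linked with the collector:

```c
#include <time.h>

#ifdef USE_BDWGC            /* hypothetical opt-in: cc -DUSE_BDWGC ... -lgc */
#include <gc.h>
#else
#define GC_CALLBACK         /* empty stand-in so the sketch builds without gc.h */
#endif

/* Hypothetical application state: clock() value after which a running
 * collection should be abandoned. */
static clock_t gc_deadline;

/* Allow collections to run until `ms` milliseconds of CPU time from now. */
void set_gc_deadline_ms(long ms)
{
    gc_deadline = clock() + (clock_t)((double)ms * CLOCKS_PER_SEC / 1000.0);
}

/* stop_func: a nonzero result asks the collector to give up the collection.
 * It is called with the lock held, so it only reads a clock and a plain
 * variable and never touches pointers to the GC heap. */
int GC_CALLBACK deadline_stop_func(void)
{
    return clock() >= gc_deadline;
}

#ifdef USE_BDWGC
/* Install the callback as the default stop_func once, at start-up. */
void install_deadline_stop_func(void)
{
    GC_INIT();
    GC_set_stop_func(deadline_stop_func);
}
#endif
```

Until GC_set_stop_func is called, the default stays never_stop_func, so an application that does not opt in behaves as before.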
> GC_collect_or_expand() is called in cases in which we need more memory, but none is available in the existing heap. I think your patch changes the nonincremental case to collect with a timeout here.
Yes. (I've also changed some simple cases of incremental mode - the rest I've marked with FIXME).
> If it actually times out, you will end up growing the heap.
That's up to the client, who is warned by the GC_set_stop_func and GC_try_to_collect comments (sufficiently, I think). The client might use sophisticated logic (e.g., based on GC_get_heap_size_inner()), while the GC logic for it remains simple: call stop_func regularly.
> The problem is that for a long running application with fixed space requirements, this will probably happen regularly, and I suspect the heap will grow without bounds, since every time it runs out, there is some probability it will grow the heap even if there would have been plenty of space after collection.
This shouldn't be a problem because:
- there is a heap limit (which, again, could be preset by the client);
- if unmapping is enabled, the heap might shrink after several successful collections;
- the client could call GC_gcollect() (which uses the client stop_func too) in yield loops (regularly or based on some heap size condition).
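The first and last points above could be combined roughly as follows. This is a sketch under assumptions: the 64 MiB limit, the three-quarters threshold, the function names, and the USE_BDWGC opt-in macro are all invented for illustration, and only GC_INIT, GC_set_max_heap_size, GC_get_heap_size and GC_gcollect are taken from gc.h:

```c
#include <stddef.h>

#ifdef USE_BDWGC            /* hypothetical opt-in: cc -DUSE_BDWGC ... -lgc */
#include <gc.h>
#endif

/* Hypothetical policy: ask for an idle-time collection once the heap has
 * grown to three quarters of the configured limit (0 means "no limit set"). */
int idle_collect_wanted(size_t heap_size, size_t heap_limit)
{
    return heap_limit != 0 && heap_size >= heap_limit / 4 * 3;
}

#ifdef USE_BDWGC
#define APP_HEAP_LIMIT ((size_t)64 << 20)   /* 64 MiB, arbitrary */

/* Preset the heap limit so that abandoned collections cannot grow the
 * heap without bound. */
void app_gc_setup(void)
{
    GC_INIT();
    GC_set_max_heap_size(APP_HEAP_LIMIT);
}

/* Call from the application's yield/idle loop. */
void app_on_idle(void)
{
    if (idle_collect_wanted(GC_get_heap_size(), APP_HEAP_LIMIT))
        GC_gcollect();
}
#endif
```

With the patch discussed here, GC_gcollect() would itself consult the default stop_func, so an idle-time collection could still be abandoned if an urgent event arrives.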
> This isn't critical, since none of this applies if you don't use the new functionality. But I wonder what the intent was, and if this is really useful as is. At least we should document the hazards.
The intent is to support time-critical applications: suppose you need a guaranteed reaction time for an external/async signal (e.g., a user presses a button). With a client "default stop_func", it's possible to make the world resume almost immediately (I mean within some guaranteed maximum delay) after the event, and to process the event within a guaranteed time period (except, of course, when a GC is unavoidable to prevent returning NULL).
You can't use GC_disable()/GC_enable() instead of the proposed solution, because GC_dont_gc doesn't interrupt a collection already in progress and because it prevents even a collection initiated by an out-of-memory event.
Moreover, with the "default stop_func" concept, it's possible to give an existing application a time-guaranteed reaction (to external signals) without rewriting much code: just place a GC_set_stop_func call at start-up (after GC_INIT).
Of course, I agree that playing with the "default stop_func" in an application requires careful profiling.
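The event-driven use case might look roughly like this. It is only a sketch: the flag and the function names are hypothetical, how the application learns about the event is left open, and the gc.h calls are compiled only under the made-up USE_BDWGC macro:

```c
#include <signal.h>

#ifdef USE_BDWGC            /* hypothetical opt-in: cc -DUSE_BDWGC ... -lgc */
#include <gc.h>
#else
#define GC_CALLBACK         /* empty stand-in so the sketch builds without gc.h */
#endif

/* Hypothetical flag: nonzero while an urgent event awaits processing. */
static volatile sig_atomic_t urgent_event_pending = 0;

/* Called from wherever the application detects the event,
 * e.g. a signal handler. */
void note_urgent_event(void)
{
    urgent_event_pending = 1;
}

/* Called once the event has been processed. If the flag were never
 * cleared, every later collection would be abandoned and the heap
 * would keep growing. */
void urgent_event_handled(void)
{
    urgent_event_pending = 0;
}

/* stop_func: abandon the collection while an event is pending.
 * It only reads a flag, which is safe with the lock held. */
int GC_CALLBACK event_stop_func(void)
{
    return urgent_event_pending != 0;
}

#ifdef USE_BDWGC
/* The only change needed in an existing application's start-up code. */
void app_startup(void)
{
    GC_INIT();
    GC_set_stop_func(event_stop_func);
}
#endif
```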
Could anybody suggest a better solution (other than "default stop_func" concept)?
> I really don't think there's a safe way to use GC_try_to_collect that doesn't involve collecting before you're out of memory.
Why not? Even if we set the default stop_func to the corner-case "never collect" (returns 1), collections would happen only when the heap is full and has reached its limit. This should be safe (in the real world, only if GC_MAXIMUM_HEAP_SIZE is set), but the performance would be really poor...
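For concreteness, the corner case described above amounts to something like the following sketch. The function names, the 32 MiB figure and the USE_BDWGC macro are hypothetical, and the run-time limit set with GC_set_max_heap_size is used here as an assumed stand-in for a build-time GC_MAXIMUM_HEAP_SIZE:

```c
#ifdef USE_BDWGC            /* hypothetical opt-in: cc -DUSE_BDWGC ... -lgc */
#include <gc.h>
#else
#define GC_CALLBACK         /* empty stand-in so the sketch builds without gc.h */
#endif

/* Corner-case stop_func: always asks the collector to give up, so only
 * collections that bypass the default stop_func (the out-of-memory path)
 * would run to completion. */
int GC_CALLBACK always_stop_func(void)
{
    return 1;
}

#ifdef USE_BDWGC
void install_never_collect_policy(void)
{
    GC_INIT();
    /* Without a limit, the heap would simply keep growing. */
    GC_set_max_heap_size((GC_word)32 << 20);
    GC_set_stop_func(always_stop_func);
}
#endif
```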
> > From: gc-bounces at napali.hpl.hp.com
> > [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Ivan Maidanski
> > Hi!
> > This suggested patch (ivmai129.diff), superseding diff47 [Nov
> > 20], introduces the default stop_func used on implicitly
> > triggered collections (unless handling OOM, of course). For
> > more info, see (the first 2 paragraphs):
> > http://permalink.gmane.org/gmane.comp.programming.garbage-coll
> > Note: the default stop_func is also used for GC_gcollect() -
> > this is a bit arguable but I think it's more logical
> > (explicit fn is used for GC_try_to_collect(), implicit
> > default stop_func is used for collections triggered by other
> > means) and more useful (e.g., a timeout policy could be
> > introduced to an existing application by just adding
> > GC_set_stop_func(fn) instead of replacing all GC_gcollect()
> > calls with something like GC_try_to_collect(GC_get_stop_func())).
> > ...