Re[5]: [Gc]: Patch resubmission: Default stop_func

Ivan Maidanski ivmai at mail.ru
Fri Oct 9 11:13:07 PDT 2009


Hi!

Two weeks ago I wrote:
> "Boehm, Hans" <hans.boehm at hp.com> wrote:
> > > From:  Ivan Maidanski
> > > Sent: Tuesday, September 22, 2009 12:07 AM
> > > To: gc at napali.hpl.hp.com
> > > Subject: Re[2]: [Gc]: Patch resubmission: Default stop_func
> > >
> > > Hi!
> > >
> > > "Boehm, Hans" <hans.boehm at hp.com> wrote:
> > > > Ivan -
> > > >
> > > > I just got a chance to look at this, and I don't quite
> > > understand the logic.
> > >
> > > The logic in brief is described in gc.h for GC_set_stop_func:
> > > /* Set and get the default stop_func.  The default stop_func
> > > is used by */
> > > /* GC_gcollect() and by implicitly triggered collections
> > > (except for the  */
> > > /* case when handling out of memory).  Must not be 0.
> > >           */
> > >
> > > That is, whenever we are collecting (for reasons other than
> > > in case of out of memory), a user-supplied stop_func is used
> > > instead of never_stop_func. Initially, this user-supplied
> > > (default) stop_func is never_stop_func (so, the logic remains
> > > exactly as before unless the client uses GC_set_stop_func).
> > Just to be clear, if I read the code correctly, it does seem to be used when an allocation request cannot be satisfied with memory already in the heap because the heap is full of garbage.  In that case GC_should_collect() would be true, and thus it would try to collect.  It would use the timeout unless GC_bytes_allocd == 0, i.e. unless the last collection was unsuccessful.
>
> ("Timeout" is not the right word in this context: timeout is what GC_timeout_stop_func() uses, whereas the purpose of a custom "default" stop_func is, by design, to abort the collection when some async signal requires a time-guaranteed reaction.)
>
> As far as I understand your statement, yes. As stated, GC_default_stop_func is used for implicitly initiated collections, and the only place where that happens (not counting incremental ones) is GC_collect_or_expand() (which is a bit tricky in the patch).
>
>
>
> >
> > Thus for a steady state application that's running in its proper heap size, you may still get a timeout here, the GC may fail, and you may increase the heap size further.
>
> I see your point: you are asking why we are trying to increase the heap if the collection has been aborted (instead of trying to satisfy the request with the current heap size), right? I agree that the logic here could be more sophisticated, but:
> - it would require other changes, otherwise the collector would call GC_collect_or_expand() again on the next GC_malloc invocation;
> - growing the heap is a more time-linear/predictable process than collection (unless we run out of RAM).
>
> I agree that the current logic might result in the heap growing up to its limit, but I don't see how this could be easily improved right now.
>
> > >
> > > The purpose of a stop_func itself is described in the gc.h
> > > comment for GC_try_to_collect(). (It might be good, also, to
> > > note there that stop_func is called with the lock held and
> > > that it must not manipulate pointers to GC heap.)
> > Agreed.
> >
> > >
> > > >
> > > > GC_collect_or_expand() is called in cases in which we need
> > > more memory, but none is available in the existing heap.  I
> > > think your patch changes the nonincremental case to collect
> > > with a timeout here.
> > >
> > > Yes. (I've also changed some simple cases of incremental mode
> > > - the rest I've marked with FIXME).
> > >
> > > >  If it actually times out, you will end up growing the heap.
> > >
> > > That's up to the client, who is warned by the GC_set_stop_func
> > > and GC_try_to_collect comments (warned enough, I think). The
> > > client might use sophisticated logic (e.g., use
> > > GC_get_heap_size_inner()), while the GC logic for it remains
> > > simple: call stop_func regularly.
> > >
> > > >  The problem is that for a long running application with
> > > fixed space requirements, this will probably happen
> > > regularly, and I suspect the heap will grow without bounds,
> > > since every time it runs out, there is some probability it
> > > will grow the heap even if there would have been plenty of
> > > space after collection.
> > >
> > > This shouldn't be a problem due to:
> > > - there is a heap limit (which, again, could be preset by the client);
> > Yes, but I also don't think it helps here.  You'll just collect and fail before you do the real collection.
>
> Yes, it is up to the client to decide whether the current collection should be interrupted immediately.
>
> > > - in case of unmapping enabled, the heap might shrink down
> > > after several successful collections;
> > Virtual memory doesn't currently shrink.  Thus we're still exhausting a resource, though one that's usually more plentiful.  It may not be as plentiful as we would like on 32-bit platforms.
>
> Agreed.
>
> > > - the client could do GC_gcollect() (which uses client
> > > stop_func too) in yield loops (regularly or based on some heap
> > > size condition).
> > I think I would rather have the client do explicit GC_try_to_collect calls occasionally (hopefully after checking that enough allocation has occurred), and have the collector run to completion when triggered internally.
>
> I agree, but what if an async event (requiring immediate, time-guaranteed processing) arrives while an internally triggered collection is running (and we still have enough memory to satisfy the request)?
>
> > >
> > > >
> > > > This isn't critical, since none of this applies if you
> > > don't use the new functionality.  But I wonder what the
> > > intent was, and if this is really useful as is.  At least we
> > > should document the hazards.
> > >
> > > The intent is for time-critical applications: suppose you
> > > need a time-guaranteed reaction to an external/async signal
> > > (e.g., a user presses a button). With a client "default
> > > stop_func", it's possible to make the world resume nearly
> > > immediately (I mean within some guaranteed maximum delay)
> > > after the event, and to process the event within a guaranteed
> > > time period (except, of course, when a GC is unavoidable to
> > > prevent returning NULL).
> > >
> > > You can't use GC_disable()/GC_enable() instead of the proposed
> > > solution (because GC_dont_gc doesn't interrupt the
> > > collection, and because it prevents a collection even
> > > initiated by an out-of-memory event).
> > >
> > > Moreover, with the "default stop_func" concept, it's
> > > possible to make an existing application have a
> > > time-guaranteed reaction (to external signals) without
> > > rewriting much code - just place a GC_set_stop_func call at
> > > start-up (after GC_INIT).
> > >
> > > Of course, I agree that playing with the "default stop_func"
> > > in an application requires careful profiling.
> > >
> > > Could anybody suggest a better solution (other than "default
> > > stop_func" concept)?
> > >
> > > >
> > > > I really don't think there's a safe way to use
> > > GC_try_to_collect that doesn't involve collecting before
> > > you're out of memory.
> > >
> > > Why not? Even if we set the default stop_func to the corner-case
> > > "never collect" (returns 1), collections would still
> > > happen, but only when the heap is full and has reached its
> > > limit - this should be safe (in the real world it could be safe
> > > only if GC_MAXIMUM_HEAP_SIZE is set) but the performance
> > > would be really poor...
> > I think it would require a maximum heap size to be set, as you say, and then I think you may still not collect soon enough that you could avoid running a full collection even when told to stop.
>
> Agreed.
>
> >
> > A better alternative may be to try to start a collection preemptively (maybe in GC_allocobj and GC_alloc_large) when you notice that you may be running low on memory, but you still have blocks left to satisfy the allocation request.  That way if you "time out", you have a chance to try again the next time, without running out of heap or growing it.  Since that test probably isn't free, and this seems fairly special purpose, I would probably put it under yet another compile-time flag.
>
> Agreed (that's a to-do for the next release).
>
> >
> > The alternative is to require the client to do this itself, based on GC_get_bytes_since_gc() and GC_get_heap_size() or the like.  That's essentially the current solution.
> >
> > Hans

I've slightly improved the logic for the 'default' stop_func (in GC_collect_or_expand):
- a boolean 'retry' parameter is now passed to GC_collect_or_expand(); it is false only for the first iteration in GC_allocobj() and GC_alloc_large();
- if GC_dont_expand and retry are both set, never_stop_func is used (instead of the custom func);
- if the collection was aborted (by a custom stop_func) and this is not a retry, the function just returns (true) instead of expanding the heap.
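
To make the three rules above easier to follow, here is a minimal standalone sketch of the decision logic. It is a toy model, not the patch itself: model_collect_or_expand(), toy_collect(), outcome_t, the work_units parameter and the event_pending flag are all hypothetical names invented for this illustration, and the real GC_collect_or_expand() checks more conditions than shown here. The only things taken from the discussion are that a stop function returns nonzero to abort, that it must not be 0, and the three rules themselves.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Same shape as a stop function in the discussion: nonzero means "abort". */
typedef int (*stop_func_t)(void);

/* Stand-in for never_stop_func: the collection always runs to completion. */
static int never_stop(void) { return 0; }

/* The corner case "never collect" mentioned in the thread (returns 1). */
static int always_stop(void) { return 1; }

/* Hypothetical flag set by the application when an async event arrives. */
static volatile sig_atomic_t event_pending = 0;

/* Hypothetical client stop function: abort while an event is waiting. */
static int stop_on_event(void) { return event_pending != 0; }

/* Hypothetical labels for what the modelled function decides to do. */
typedef enum {
    OUTCOME_COLLECTED,       /* collection ran to completion             */
    OUTCOME_RETURNED_EARLY,  /* aborted on the first attempt, no growth  */
    OUTCOME_EXPAND_HEAP      /* aborted on a retry, so the heap grows    */
} outcome_t;

/* Toy collector: polls the stop function once per unit of work.
 * Returns true if it finished, false if it was asked to stop. */
static bool toy_collect(stop_func_t stop, int work_units)
{
    for (int i = 0; i < work_units; i++) {
        if (stop())
            return false;
    }
    return true;
}

/* Model of the three rules for the 'default' stop_func. */
static outcome_t model_collect_or_expand(stop_func_t default_stop,
                                         bool dont_expand, bool retry,
                                         int work_units)
{
    assert(default_stop != NULL);  /* the default stop_func must not be 0 */

    /* Rule 2: on a retry with expansion disabled, the collection
     * must not be interruptible, so fall back to never_stop. */
    stop_func_t stop = (dont_expand && retry) ? never_stop : default_stop;

    if (toy_collect(stop, work_units))
        return OUTCOME_COLLECTED;

    /* Rule 3: an aborted first attempt just returns, leaving the
     * caller free to try again without growing the heap. */
    if (!retry)
        return OUTCOME_RETURNED_EARLY;

    return OUTCOME_EXPAND_HEAP;
}
```

In this model, a call with always_stop and retry == false returns early without expansion, the same call with retry == true falls through to expansion, and with both dont_expand and retry set the custom function is ignored so the collection completes.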

Bye.

