[Gc] leak with finalizers

Boehm, Hans hans.boehm at hp.com
Fri Feb 25 10:21:43 PST 2005


> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com 
> [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Paolo Molaro
> Sent: Friday, February 25, 2005 7:03 AM
> To: gc at napali.hpl.hp.com
> Subject: Re: [Gc] leak with finalizers
> 
> 
> On 02/23/05 Hans Boehm wrote:
> > That suggests that a very large fraction of objects are either 
> > finalizable, or reachable from finalizable objects.  I've 
> never seen 
> > that outside of .NET. ...
> 
> Well, I'm not sure this has anything to do with .net per se, 
> and it's not that this happens regularly: most regular apps 
> have no such issue. Also note that there is no need for many 
> objects to be finalizable: it's just that a finalizable 
> object can point to big objects, as it happens with file 
> streams, for example (where big can be a few KBs). The kind 
> of apps where the issue shows up are primarily server-side
> applications: we have been testing them with hundreds of 
> thousands of requests. I'm sure the allocation pattern may be 
> different, but it may also be that mono is pushing the limits 
> of the usage patterns of libgc, by providing a large user 
> base with a diversity of applications. We've been reasonably 
> happy so far, though we're trying more and more to limit the 
> amount of memory that is inspected conservatively because 
> memory retention seems to be an issue for long running 
> processes with heavy activity.
Do file streams actually dominate in these applications?  I guess
it makes sense that they might on occasion.  I'm a little surprised
that gcj didn't run into these issues.  But it may be that Java
always has enough background short-lived allocation that this sort of
problem doesn't arise.  Perhaps C# structs are part of the
difference here?

It would be great to track down where memory retention is coming from.
The backtracing facility in the collector can be helpful, but I don't
know whether it works with Mono, or how difficult it would be to get
working there.
(It kind of works with gcj, but requires a separate libgcj build.)
> 
> > As far as the patch is concerned, does your patch also work 
> for you if 
> > you check for a geometric increase in the number of finalizable 
> > objects, e.g. that it grew by at least 10%?  I would find that a 
> > little easier to justify than the current patch.  An 
> absolute number 
> > here doesn't seem very meaningful, since everything else grows with 
> > heap size.  In a huge heap, you'd be testing for a tiny growth.
> 
> I'm testing on a 32 bit system with heap sizes up to about 1 
> GB and the number seems adequate. Allowing, say, 10000 
> objects to queue up may not be better than doing a collection, 
> because we start hitting swap. Anyway, yes, more tests are 
> needed here especially with bigger heaps and 64 bit machines.
How about a fixed constant, plus a small percentage?  It still seems
strange to me not to have this increase with heap size at all ...
> 
> > A more robust fix might be to track the number of bytes marked from 
> > finalizable objects, and basically count those as though that many 
> > bytes were allocated.  If lots of bytes are marked this 
> way, you are 
> > either making progress in running finalizers, or (in other 
> > environments) you have long finalizer chains and need frequent 
> > collections.
> > 
> > But this seems hard if you want to account for pointerfree objects 
> > correctly.  I don't see how to do it without cloning GC_mark_from().
> 
> Yes, this would be the 'correct' thing to do, but even not 
> considering the code overhead, I don't know how much of a 
> runtime overhead this would have.
So long as we still mark sequentially from finalizers, I think the
overhead isn't too bad.  I would guess it might add 10 or 20% to this
part of the mark time, which should be small.  (In the parallel case,
this kind of counter would be messier.  You really need to keep
per-thread counters, or at least keep local counters temporarily,
to avoid cache conflicts.  That's also possible, but ...)

I added it to the to do list for GC7.x.

> The next version of 
> mono/CLR has an API call, something like 
> GC.AddMemoryPressure (int numbytes). 
> This is supposed to be a rough amount of unmanaged memory 
> referenced by an object and it's supposed to be used with 
> objects that have finalizers. Maybe we can hook into this to 
> provide the GC with an estimate of how much memory could be 
> freed by running finalizers.
Could this be used to track the estimate per object?  Otherwise it
seems like too coarse a tool.

The collector already has kind of the opposite hook of what you want.
"GC_non_gc_bytes" tracks memory in the GC heap that
will be explicitly deallocated, and hence should be excluded from
GC-triggering calculations.  This is used by GC_malloc_uncollectable
and friends.  (The interface should really use a setter function,
with proper locking.)
> 
> > There are some other issues with this scenario.  In particular, the 
> > marking from finalizable objects is currently always done 
> > sequentially, on the assumption that it doesn't take much 
> time.  Once 
> > we all have dual core or larger processor chips, having most of the 
> > marking done here isn't going to be good either.
> 
> Well, marking from finalizable objects needs to be done 
> anyway, and this use case is more about having big buffers 
> referenced by finalizable objects than having many of them. Thanks.
> 
That's encouraging.  It sounds like we will not see cases in which
marking both takes a substantial amount of time and is mostly done
from finalizers?

Hans

