[Gc] leak with finalizers

Hans Boehm Hans.Boehm at hp.com
Wed Feb 23 20:55:38 PST 2005


Paolo -

I'm very surprised that this shows up regularly in .NET applications.
That suggests that a very large fraction of objects are either
finalizable, or reachable from finalizable objects.  I've never seen
that outside of .NET.  But I've heard similar things from others about
.NET applications.  I'd love to understand the discrepancy.

As far as the patch is concerned, does your patch also work for you
if you check for a geometric increase in the number of finalizable
objects, e.g. that it grew by at least 10%?  I would find that a little
easier to justify than the current patch.  An absolute number here
doesn't seem very meaningful, since everything else grows with heap
size.  In a huge heap, you'd be testing for a tiny growth.
(I'd also perform that test last, since GC_should_collect() should
normally be more likely to succeed.)  I'd be OK with taking such
a modified patch, since it seems like a clear improvement, even if
it's not ideal.
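
Roughly the shape I have in mind, as a sketch only: GC_should_collect()
is the real internal test, but the snapshot counter, the helper, and
the spot where they would hook into the collect-or-expand decision are
all invented here:

/* Try a collection instead of growing the heap when either the usual
 * test fires or the finalizable-object population grew geometrically
 * (by at least 10%) since the last collection. */

static word GC_fo_entries_at_last_gc;   /* snapshot from the previous GC */

static int GC_finalizable_count_grew(word fo_entries_now)
{
    if (GC_fo_entries_at_last_gc == 0) return fo_entries_now > 0;
    /* grew by >= 10%, i.e. 10 * new >= 11 * old, with no division */
    return fo_entries_now * 10 >= GC_fo_entries_at_last_gc * 11;
}

static int GC_prefer_collection(word fo_entries_now)
{
    /* Finalizer test last: GC_should_collect() succeeds more often,
     * so it usually short-circuits the || . */
    return GC_should_collect()
           || GC_finalizable_count_grew(fo_entries_now);
}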

A more robust fix might be to track the number of bytes marked from
finalizable objects, and basically count those as though that many bytes
were allocated.  If lots of bytes are marked this way, you are
either making progress in running finalizers, or (in other environments)
you have long finalizer chains and need frequent collections.

But this seems hard if you want to account for pointer-free objects
correctly.  I don't see how to do it without cloning GC_mark_from().
(That might actually be a reasonable thing to do, since on X86 machines
you may want multiple clones for AMD vs. Intel anyway, and once you
have the infrastructure for two clones ...)
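
To make the accounting concrete, here is a toy model rather than libgc
code: the object layout, mark bit, and fixed child array are invented,
and the real GC_mark_from() looks nothing like this, but the one extra
accounting line is the whole point:

#include <stddef.h>

struct obj {
    size_t size;              /* object size in bytes */
    int marked;               /* mark bit */
    int has_pointers;         /* pointer-free objects have no children */
    struct obj *children[4];
};

/* Clone of the marker that also accumulates the bytes it marks. */
static size_t mark_counting(struct obj *o)
{
    size_t bytes = 0;
    if (o == NULL || o->marked)
        return 0;
    o->marked = 1;
    bytes += o->size;         /* the accounting the normal marker lacks */
    if (o->has_pointers) {
        for (int i = 0; i < 4; i++)
            bytes += mark_counting(o->children[i]);
    }
    return bytes;
}

The total returned when this is run over the finalizable objects could
then be treated as if that many bytes had just been allocated, which is
exactly where counting pointer-free objects gets awkward in the real
marker.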

GC7.0 counts the number of set marks in each block, so you could avoid
the problem there, if you were willing to make an extra pass over all the
block headers.
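
That pass could look roughly like this; hb_n_marks, hb_sz, and
GC_apply_to_all_blocks() are my recollection of the GC7 internals, so
treat it as pseudocode against private, unstable interfaces:

#include "private/gc_priv.h"    /* internal header, not a stable API */

static word GC_marked_bytes;    /* accumulator for one pass */

static void GC_add_marked_bytes(struct hblk *h, word ignored)
{
    hdr *hhdr = HDR(h);
    /* hb_n_marks set mark bits, each covering an object of hb_sz bytes */
    GC_marked_bytes += (word)hhdr->hb_n_marks * (word)hhdr->hb_sz;
}

static word GC_count_marked_bytes(void)
{
    GC_marked_bytes = 0;
    GC_apply_to_all_blocks(GC_add_marked_bytes, 0);
    return GC_marked_bytes;
}

Calling this once before and once after the phase that marks from
finalizable objects would give the bytes to charge to finalization as
the difference of the two totals.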

There are some other issues with this scenario.  In particular, the marking
from finalizable objects is currently always done sequentially, on the
assumption that it doesn't take much time.  Once we all have dual-core
or larger processor chips, having most of the marking happen in this
sequential phase isn't going to be good either.

Hans
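
P.S.  To illustrate the workaround quoted below (finalize a smaller
object that is referenced by the main object, rather than the object
holding the large data), a minimal sketch with the C interface might
look like the following.  GC_MALLOC, GC_MALLOC_ATOMIC, and
GC_register_finalizer are the real interfaces; the structs and helper
names are invented for illustration:

#include <stddef.h>
#include <stdio.h>
#include <gc.h>

/* Small proxy holding only what cleanup needs; it must NOT point at
 * the large data, or finalization will keep that data alive. */
struct cleanup_proxy {
    int os_handle;
};

struct big_thing {
    struct cleanup_proxy *proxy;  /* big_thing -> proxy, never back */
    char *huge_buffer;            /* large, unreachable from proxy */
};

static void proxy_finalizer(void *obj, void *client_data)
{
    struct cleanup_proxy *p = obj;
    printf("closing handle %d\n", p->os_handle);
}

static struct big_thing *make_big_thing(int handle, size_t nbytes)
{
    struct big_thing *b = GC_MALLOC(sizeof *b);
    b->proxy = GC_MALLOC(sizeof *b->proxy);
    b->proxy->os_handle = handle;
    GC_register_finalizer(b->proxy, proxy_finalizer, NULL, NULL, NULL);
    b->huge_buffer = GC_MALLOC_ATOMIC(nbytes);  /* pointer-free */
    return b;
}

When a big_thing dies, huge_buffer can be reclaimed in the same
collection; only the tiny proxy survives until its finalizer runs.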

On Sun, 20 Feb 2005, Paolo Molaro wrote:

> On 12/24/04 Paolo Molaro wrote:
> > Yep, confirmed. Sadly the C# example has a slightly different pattern and
> > eventually it goes out of memory anyway. I'll try to reproduce with a C
> > program. Adding a GC_gcollect() at the right place in the C# code makes it
> > work fine, too, so it seems to be very sensitive to when a collection
> > happens...
>
> In the last few weeks I made changes to the mono runtime so that more
> objects are allocated as pointer-free. I also reduced the number
> of runtime data structures that use libgc for allocation, ensuring
> that only the minimum amount of memory is allocated with libgc.
> This fixed the issue of the C# app misbehaving in the test case
> when the max heap size was set.
>
> > > I wouldn't be surprised if other implementations had similar problems
> > > with such code.
> >
> > The C# equivalent has been reported to work fine with the MS CLR.
> >
> > > The workaround is to avoid finalization of objects that reference
> > > huge amounts of memory.  If possible, finalize a smaller object that
> > > is referenced by the main object instead.
> >
> > Unfortunately we can't ask our users to rewrite their code:-)
>
> We have been testing the attached patch for the last few days.
> It attempts to avoid increasing the heap size if a large number
> of finalizable objects were allocated recently and the last
> finalization runs freed some memory. This should ensure that
> forward progress is made (though I guess it may happen that
> a useless GC is done in some cases, before deciding it really
> needs to expand).
> The patch fixes the test case completely (no need to set the max
> heap size) and many other programs running in mono behave much
> better now: programs that used to increase the heap size until
> they slowed to a crawl now have steady memory usage.
> The 500 constant (number of objects with finalizers) is a guess
> that happens to balance memory usage and performance: I
> guess we could make it configurable, so that it can be decreased when
> the runtime knows that many finalizable objects are being created that
> point to potentially big arrays or objects. I haven't tried this kind
> of runtime tweaking.
> Comments welcome.
>
> lupus
>
> --
> -----------------------------------------------------------------
> lupus at debian.org                                     debian/rules
> lupus at ximian.com                             Monkeys do it better
>

