[Gc] Re: Understanding why GC'ing increases to the double in time?
hans.boehm at hp.com
Fri Mar 10 13:40:04 PST 2006
It may well be that your operation A allocates and drops a moderately
large strongly-connected data structure, and operation B stores some
large integers in a place in which they are conservatively scanned, and
appear to point to the first data structure. There are probably other
possible scenarios along these lines, and some of them probably involve
real pointers, instead of misidentified ones.
Thus I'm not sure whether or not this is easily fixable. Clearing some
of the pointers in the data structure that is dropped might severely
limit the damage, if you can figure out what it is and control the code.
If the offending code is in libgcj, it would be good to clear pointers
there, it that's easily possible. In any case, it would be good to
identify the data structure that's being retained.
> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com
> [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Martin
> Egholm Nielsen
> Sent: Friday, March 10, 2006 1:14 PM
> To: gc at napali.hpl.hp.com
> Subject: [Gc] Re: Understanding why GC'ing increases to the
> double in time?
> > 1) You seem to be running with a block size of 1K instead of the
> > normal 4K. That's probably good. But if that's a
> consequence of the
> > collector having been configured with SMALL_CONFIG, it
> might be worth revisiting.
> > (It looks like that's not the case, but ...)
> Actually it was configured with SMALL_CONFIG - but that was
> one of my attempts to try something :-) The situation is
> quite similar without the SMALL_CONFIG, though, and I'm back
> in non-small again...
> > 2) "long" log:
> > 1024056 (atomic) + 1430164 (composite) collectable bytes in use
> > "short" log:
> > 1007932 (atomic) + 806484 (composite) collectable bytes in use
> > There are about 640K more, or almost twice as many, composite
> > (pointer-containing) objects in the case of the longer
> pause times.
> > I'm surprised that this makes that much of a difference,
> but it may,
> > depending on hardware characteristics, and on the
> pointer-density of
> > those extra objects.
> Aha! That is actually a very good discovery! It is indeed
> true that it occurs every time this is increased from the
> 800k to the 1400k.
> > The "atomic" (pointerfree) objects are not scanned. And
> setting the
> > mark bits is likely to be relatively cheap, since they tend to be
> > fairly big. The collector does have to scan 3MB of roots, which I
> > would have expected to cost more. But most of those are
> not going to
> > be pointers, so each word just takes a load and a
> comparison or two.
> > And the loads read memory sequentially, which is often a
> much better
> > case, than tracing pointer-containing objects in the heap.
> But what is more interesting is the discoveries I made today.
> I didn't put in more statistic into the GC - I rather went
> into capturing when this composite-size increased that severely.
> I didn't quite find the time to finish the work today, but
> I'm now, more than ever, sure that there is a bug somewhere.
> However, I'm not sure whether it's in my application (don't
> think so, though :-) ), in the GC, or in libgcj! I rather
> suspect the last, and then again ;-)
> In short, today I found a two (high-level) "operations" I
> should perform in order to provoke the the extended
> composite-size. The funny thing is that the order of the
> operations _didn't_ matter - that is, in the "steady state"
> of my application, operation A alone didn't trigger it, and
> similar operation B alone didn't neither. However, invoking
> the other operation after the first one did make it increase
> - just like that!
> The reason for mentioning the "trigger-symmetry"
> (interchangebility) is that though operation A is quite
> extensive, operation B is rather simple, and it's quite
> interesting that this simple operation is able to trigger the
> severe increase in "composite-size".
> Moreover, I find it extremely peculiar that the increase is
> almost a doubling.
> What is left to investigate is whether the increase will ever
> ever happen if one of the operations are omitted! That is
> quite easy for me to try though...
> In more details, the simple operation (B) involves nothing
> more than doing some extremely simple XML-parsing followed by
> some Java(GCJ)-reflection to store the read data.
> So, I suspect the problem being caused by:
> 1) the XML-parsing routine (however, done many times before entering
> 2) use of the Reflection-API (and whatever that involves of
> 3) the GC being confused :-)
> > If you really want to track this down further, you might be able to
> > add more detailed statistics to GC_print_block_descr (called by
> > GC_dump()) to get a better handle on what the extra objects
> look like.
> > It wasn't obvious to me from the heap dumps you included.
> Scanning the GC_print_block_descr method didn't quite made me
> come up with ideas getting any futher statistics. So if you
> think this is relevant, can you please give me some hints?
> Thanks for helping - we're (:-)) closing in...
> Martin Egholm
> Gc mailing list
> Gc at linux.hpl.hp.com
More information about the Gc