[Gc] Re: Understanding why GC'ing increases to the double in time?
Martin Egholm Nielsen
martin at egholm-nielsen.dk
Sun Mar 12 12:06:10 PST 2006
> It may well be that your operation A allocates and drops a moderately
> large strongly-connected data structure, and operation B stores some
> large integers in a place in which they are conservatively scanned, and
> appear to point to the first data structure. There are probably other
> possible scenarios along these lines, and some of them probably involve
> real pointers, instead of misidentified ones.
I'm not sure I fully understand this. Should I read "appear" as "this is
an unfortunate feature of GC"?
Moreover, say the above is indeed the situation, does that explain why
it also happens when interchanged - that is, operation B before operation A?
> Thus I'm not sure whether or not this is easily fixable. Clearing some
> of the pointers in the data structure that is dropped might severely
> limit the damage, if you can figure out what it is and control the code.
Most likely possible - if I'm somehow able to identify what those are.
> If the offending code is in libgcj, it would be good to clear pointers
> there, it that's easily possible. In any case, it would be good to
> identify the data structure that's being retained.
Yep... Guidelines to identify those are most gracefully accepted?!
But yes, I'm quite sure it's in gcj - at least as it appears today
(since Jessie [HTTPS layer] is now [or soon] an integrated part of
Classpath/GCJ). And operation B is mostly a part of libgcj, as well...
>>From: gc-bounces at napali.hpl.hp.com
>>[mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Martin
>>Sent: Friday, March 10, 2006 1:14 PM
>>To: gc at napali.hpl.hp.com
>>Subject: [Gc] Re: Understanding why GC'ing increases to the
>>double in time?
>>>1) You seem to be running with a block size of 1K instead of the
>>>normal 4K. That's probably good. But if that's a
>>consequence of the
>>>collector having been configured with SMALL_CONFIG, it
>>might be worth revisiting.
>>>(It looks like that's not the case, but ...)
>>Actually it was configured with SMALL_CONFIG - but that was
>>one of my attempts to try something :-) The situation is
>>quite similar without the SMALL_CONFIG, though, and I'm back
>>in non-small again...
>>>2) "long" log:
>>>1024056 (atomic) + 1430164 (composite) collectable bytes in use
>>>1007932 (atomic) + 806484 (composite) collectable bytes in use
>>>There are about 640K more, or almost twice as many, composite
>>>(pointer-containing) objects in the case of the longer
>>>I'm surprised that this makes that much of a difference,
>>but it may,
>>>depending on hardware characteristics, and on the
>>>those extra objects.
>>Aha! That is actually a very good discovery! It is indeed
>>true that it occurs every time this is increased from the
>>800k to the 1400k.
>>>The "atomic" (pointerfree) objects are not scanned. And
>>>mark bits is likely to be relatively cheap, since they tend to be
>>>fairly big. The collector does have to scan 3MB of roots, which I
>>>would have expected to cost more. But most of those are
>>not going to
>>>be pointers, so each word just takes a load and a
>>comparison or two.
>>>And the loads read memory sequentially, which is often a
>>>case, than tracing pointer-containing objects in the heap.
>>But what is more interesting is the discoveries I made today.
>>I didn't put in more statistic into the GC - I rather went
>>into capturing when this composite-size increased that severely.
>>I didn't quite find the time to finish the work today, but
>>I'm now, more than ever, sure that there is a bug somewhere.
>>However, I'm not sure whether it's in my application (don't
>>think so, though :-) ), in the GC, or in libgcj! I rather
>>suspect the last, and then again ;-)
>>In short, today I found a two (high-level) "operations" I
>>should perform in order to provoke the the extended
>>composite-size. The funny thing is that the order of the
>>operations _didn't_ matter - that is, in the "steady state"
>>of my application, operation A alone didn't trigger it, and
>>similar operation B alone didn't neither. However, invoking
>>the other operation after the first one did make it increase
>>- just like that!
>>The reason for mentioning the "trigger-symmetry"
>>(interchangebility) is that though operation A is quite
>>extensive, operation B is rather simple, and it's quite
>>interesting that this simple operation is able to trigger the
>>severe increase in "composite-size".
>>Moreover, I find it extremely peculiar that the increase is
>>almost a doubling.
>>What is left to investigate is whether the increase will ever
>>ever happen if one of the operations are omitted! That is
>>quite easy for me to try though...
>>In more details, the simple operation (B) involves nothing
>>more than doing some extremely simple XML-parsing followed by
>>some Java(GCJ)-reflection to store the read data.
>>So, I suspect the problem being caused by:
>>1) the XML-parsing routine (however, done many times before entering
>>2) use of the Reflection-API (and whatever that involves of
>>3) the GC being confused :-)
>>>If you really want to track this down further, you might be able to
>>>add more detailed statistics to GC_print_block_descr (called by
>>>GC_dump()) to get a better handle on what the extra objects
>>>It wasn't obvious to me from the heap dumps you included.
>>Scanning the GC_print_block_descr method didn't quite made me
>>come up with ideas getting any futher statistics. So if you
>>think this is relevant, can you please give me some hints?
>>Thanks for helping - we're (:-)) closing in...
>> Martin Egholm
>>Gc mailing list
>>Gc at linux.hpl.hp.com
More information about the Gc