[Gc] moving from 6.8->7.0 getting smashed objects

Hans Boehm Hans.Boehm at hp.com
Sat Aug 4 15:37:41 PDT 2007


Any progress on this?  Has anyone else seen similar issues?

I checked some fairly significant patches into the CVS tree yesterday,
including:

A bug fix for printing smashed objects.  I think this was actually a minor 
bug that affected only the quality of the output.

GC_malloc and GC_malloc_atomic should now force initialization, even
with thread local allocation.  Thus the default should now behave
similarly to 6.8.

Some improvements for REDIRECT_MALLOC.  It's now easier to build
a Linux shared library that redefines malloc and can be LD_PRELOADed
for use with previously linked applications.  This is still imperfect.
Things like Mozilla and abiword don't work across all distributions.
(In some cases, I suspect this requires some fairly ugly hacks to change,
since they may squirrel away pointers in strange places.)

Hans

On Mon, 23 Jul 2007, jim marshall wrote:

>
>
> Boehm, Hans wrote:
>> This is the only thread running at this point?  Is there another thread
>> that could be running in the interim and corrupting the heap?
>> 
> At the point this happens we only have one thread executing.
>> This is built with DBG_HDRS_ALL?
>> 
> I'll have to check, I suspect not though
>> It looks to me like the GC_check_heap_block message should be
>> impossible.  It should print the start of the user-visible object. I
>> don't think an address ending in 000 is likely to qualify.  (That might
>> also explain why you never see it being returned.)  Looking at the code,
>> I also no longer believe this is being printed correctly.  I suspect
>> that this is not the main problem, but I will see if I can generate a
>> test case to reproduce this, and investigate.  Hopefully a patch is
>> forthcoming ...
>> 
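The GDB session quoted further down shows the raw GC_malloc result (0x809df70) and the pointer handed back to the caller (0x809df80) differing by 0x10. The small C sketch below does that arithmetic for a reported address, so that both readings of it can be tried in a breakpoint condition. The helper names are hypothetical, and the fixed offset is an assumption taken from that one session, not a documented constant of the collector.

```c
#include <stdint.h>
#include <stdio.h>

/* Offset seen in the quoted GDB session: mem (0x809df80) - result (0x809df70).
   Assumption: every debug allocation in that build uses the same offset. */
#define OBSERVED_DEBUG_OFFSET ((uintptr_t)0x10)

/* Hypothetical helpers: convert between the raw base and the user pointer. */
uintptr_t user_from_base(uintptr_t base) { return base + OBSERVED_DEBUG_OFFSET; }
uintptr_t base_from_user(uintptr_t user) { return user - OBSERVED_DEBUG_OFFSET; }

/* Print both readings of a reported address, for use in breakpoint conditions. */
void print_candidates(uintptr_t reported)
{
    printf("if 0x%lx is a raw base, the caller saw 0x%lx\n",
           (unsigned long)reported, (unsigned long)user_from_base(reported));
    printf("if 0x%lx is a user pointer, the raw base was 0x%lx\n",
           (unsigned long)reported, (unsigned long)base_from_user(reported));
}
```

Calling print_candidates(0x80d1000) lists the two addresses worth watching for, depending on whether the reported value is a raw base or a user-visible pointer.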
>> I believe you are reading way too much into the precise point at which
>> the message appears.  The heap is scanned for overwrite errors during
>> each GC.  It is inconvenient to print anything there, so that's
>> postponed until the next GC_print_all_errors call, which happens during
>> some, but not all, allocations.
>> 
> OK - I presumed it checked during each allocation.
>> You probably want to set a breakpoint in GC_check_heap_proc(), and look
>> at GC_n_smashed afterwards.  The GC_smashed[] array should contain
>> pointers to the locations that the GC thought were clobbered.  I suspect
>> it's safe to invoke GC_check_heap_proc from a debugger to narrow down
>> the point at which things go south.
>> 
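The split between scanning and reporting described above can be illustrated with a toy model in plain C. This is only a sketch of the general guard-word technique, not the collector's actual code: the toy_* names, the guard value, and the fixed-size tables are all invented for illustration. toy_check_heap stands in for the scan that records clobbered locations without printing, and toy_print_all_errors stands in for the later reporting call.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_GUARD       ((uintptr_t)0xdeadbeef)  /* hypothetical guard pattern */
#define TOY_MAX_OBJS    16
#define TOY_MAX_SMASHED 8

/* Side table of live objects: raw base address and user-requested size. */
static unsigned char *toy_base[TOY_MAX_OBJS];
static size_t toy_size[TOY_MAX_OBJS];
static size_t toy_n_objs;

/* Locations found clobbered by the last scan. */
static void *toy_smashed[TOY_MAX_SMASHED];
static size_t toy_n_smashed;

/* Layout: [guard word][n user bytes][guard word]; returns the user pointer. */
void *toy_debug_malloc(size_t n)
{
    const uintptr_t g = TOY_GUARD;
    unsigned char *base;

    if (toy_n_objs == TOY_MAX_OBJS) return NULL;
    base = malloc(2 * sizeof g + n);
    if (base == NULL) return NULL;
    memcpy(base, &g, sizeof g);
    memcpy(base + sizeof g + n, &g, sizeof g);
    toy_base[toy_n_objs] = base;
    toy_size[toy_n_objs] = n;
    toy_n_objs++;
    return base + sizeof g;
}

static void toy_record(void *where)
{
    if (toy_n_smashed < TOY_MAX_SMASHED) toy_smashed[toy_n_smashed++] = where;
}

/* Scan every object and record damaged guards. Deliberately prints nothing. */
void toy_check_heap(void)
{
    size_t i;

    toy_n_smashed = 0;
    for (i = 0; i < toy_n_objs; i++) {
        uintptr_t head, tail;
        unsigned char *tail_addr = toy_base[i] + sizeof head + toy_size[i];

        memcpy(&head, toy_base[i], sizeof head);
        memcpy(&tail, tail_addr, sizeof tail);
        if (head != TOY_GUARD) toy_record(toy_base[i]);
        if (tail != TOY_GUARD) toy_record(tail_addr);
    }
}

/* Report what the last scan found, then clear the list. Returns the count. */
size_t toy_print_all_errors(void)
{
    size_t i, n = toy_n_smashed;

    for (i = 0; i < n; i++)
        printf("toy: smashed guard at %p\n", toy_smashed[i]);
    toy_n_smashed = 0;
    return n;
}

/* Overrun one object by a single byte, then scan and report separately. */
size_t toy_demo(void)
{
    char *a = toy_debug_malloc(8);
    char *b = toy_debug_malloc(8);

    if (a == NULL || b == NULL) return 0;
    memset(a, 'x', 9);              /* one byte too many: hits a's tail guard */
    memset(b, 'y', 8);              /* stays in bounds */
    toy_check_heap();               /* damage is recorded here, silently */
    return toy_print_all_errors();  /* ...and only reported here */
}
```

Calling toy_demo() from a main function prints one line for the overrun object. Because the scan and the report are separate steps, the moment a message appears says little about when the overwrite actually happened.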
> I will take a look at this.
>
> Thanks
>> Hans
>>
>> 
>>> -----Original Message-----
>>> From: jim marshall [mailto:jim.marshall at wbemsolutions.com]
>>> Sent: Thursday, July 19, 2007 10:15 PM
>>> To: Boehm, Hans
>>> Cc: gc at napali.hpl.hp.com
>>> Subject: Re: [Gc] moving from 6.8->7.0 getting smashed objects
>>> 
>>> Boehm, Hans wrote:
>>> 
>>>> I think the algorithm for detecting smashed objects has not changed.
>>>> Lots of other things, including object placement, no doubt have.  It's
>>>> quite conceivable that either:
>>>> 
>>>> 1) Other changes cause the overwrite to be noticed in 7.0 but not 6.8
>>>> 
>>>> 2) There is another bug in 7.0
>>>> 
>>>> I would hope that it wouldn't take that long to debug this from the
>>>> smashed object messages?  It should be fairly easy to determine where
>>>> objects in the vicinity were allocated.  If the problem is repeatable
>>>> enough, a watchpoint on the overwritten location might even work.
>>>> 
>>> I've not made much headway with this, perhaps someone could point me in 
>>> the direction to go.
>>> 
>>> Basically what I have found is that at some point our program calls 
>>> GC_MALLOC (gc_debug_malloc - as we are using a debug build). This call 
>>> returns successfully (no smashed objects). However the very next 
>>> allocation causes the GC to spit out the smashed object warning. Here is a 
>>> GDB session to give an example:
>>> 
>>> wsi_malloc (pSize=78) at src/wsimemory.c:47
>>> 47          void *mem = GC_MALLOC(pSize);
>>> (gdb) s
>>> GC_debug_malloc (lb=78, s=0x400535c9 "src/wsimemory.c", i=47) at dbg_mlc.c:457
>>> 457     {
>>> (gdb) n
>>> 458         void * result = GC_malloc(lb + DEBUG_BYTES);
>>> (gdb) 460         if (result == 0) {
>>> (gdb) 458         void * result = GC_malloc(lb + DEBUG_BYTES);
>>> (gdb) 460         if (result == 0) {
>>> (gdb) 467         if (!GC_debugging_started) {
>>> (gdb) 471         return (GC_store_debug_info(result, (word)lb, s, (word)i));
>>> (gdb) print result
>>> $1 = (void *) 0x809df70
>>> (gdb) n
>>> 472     }
>>> (gdb)
>>> wsi_malloc (pSize=78) at src/wsimemory.c:50
>>> 50          memset(mem, 0, pSize);
>>> (gdb) print mem
>>> $2 = (void *) 0x809df80
>>> (gdb) call GC_debug_malloc(64, "myfile.c", 1)
>>> GC_check_heap_block: found smashed heap objects:
>>> 0x80d1008 in object at 0x80d1000(<smashed>, appr. sz = 197)
>>> $3 = (void *) 0x80a9f88
>>> (gdb) 
>>> 
>>> You can see above that we call GC_debug_malloc, this returns. Now before I 
>>> executed line 50 (the memset call) I had GDB 'call GC_debug_malloc' 
>>> directly and it detects the smashed object.
>>> 
>>> You can see the smashed object is at 0x80d1000 which is consistent, but I 
>>> can not find a place where this address is returned to my application (I 
>>> set breaks in all the GC allocate functions I could and nothing returned 
>>> seemed to be in that range). As an aside to the previous sentence, our 
>>> application makes a lot of allocation calls, so while I set the break 
>>> points, I didn't enable them for a while. I tried setting conditional 
>>> breaks in GC_debug_malloc (and others) for when result==0x80d1000, but 
>>> they never got hit.
>>> 
>>> Setting a watchpoint on those memory locations causes the program to crawl
>>> (or hang - I couldn't determine which). My test machine is a Celeron and GDB
>>> does not use a hardware watchpoint when I use the watch command on the memory
>>> address.
>>> 
>>> Any ideas on what else I can do to help determine if this is in our app 
>>> (which is my suspicion) or an issue with GC 7.0?
>>> 
>>> Thanks!
>>> -Jim
>>>
>>> 
>> 
>> _______________________________________________
>> Gc mailing list
>> Gc at linux.hpl.hp.com
>> http://www.hpl.hp.com/hosted/linux/mail-archives/gc/
>> 
>> 
>>
>> 
>
> -- 
> Jim Marshall
> Sr. Staff Engineer
> WBEM Solutions, Inc.
> 978-947-3607
>

