[Gc] Segfault in GC_mark_from in libgc 7.1 (released tarball)

Boehm, Hans hans.boehm at hp.com
Wed Aug 20 13:27:04 PDT 2008


Thank you!

That indeed does seem to be a long-standing bug.  There are paths into GC_push_marked in which h is an unallocated block.  Unfortunately, that doesn't seem to happen much with the standard tests on my machine.  It probably requires some amount of uncollectable allocation, incremental collection, and/or mark stack overflows to trigger the bug.

I'll check in the assertion, since that doesn't cost anything with default builds, and might catch such things earlier in the future.
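
As a rough illustration of why such a check is free in default builds: an assertion macro of this kind usually expands to nothing unless assertions are switched on at build time.  The sketch below is not the collector's actual GC_ASSERT definition; SKETCH_ASSERTIONS, SKETCH_ASSERT, and sketch_push are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for an assertion macro that can be compiled out.
 * SKETCH_ASSERTIONS is an assumed build-time switch, not a real libgc flag. */
#ifdef SKETCH_ASSERTIONS
# define SKETCH_ASSERT(expr) assert(expr)  /* assertion build: check and abort */
#else
# define SKETCH_ASSERT(expr) ((void)0)     /* default build: no check at all */
#endif

/* Toy "push": returns 1 when it would push a mark stack entry for the block. */
static int sketch_push(int block_is_free)
{
    SKETCH_ASSERT(!block_is_free);  /* fires only in assertion builds */
    return 1;
}
```

Compiling with -DSKETCH_ASSERTIONS turns the check on.  Without it, sketch_push(1) proceeds silently, which is roughly how a free block could get pushed unnoticed.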

Can you try the attached patch to mark.c?  (This is a patch against a slight variant of the CVS trunk.  It may need minor tweaking for 7.1.)
I've also pasted it here, in case the attachment doesn't make it:

Index: mark.c
===================================================================
RCS file: /cvsroot/bdwgc/bdwgc/mark.c,v
retrieving revision 1.7
diff -u -r1.7 mark.c
--- mark.c      26 Jul 2008 00:51:33 -0000      1.7
+++ mark.c      20 Aug 2008 20:16:17 -0000
@@ -1802,7 +1802,7 @@
 {
     hdr * hhdr = HDR(h);

-    if (EXPECT(IS_FORWARDING_ADDR_OR_NIL(hhdr), FALSE)) {
+    if (EXPECT(IS_FORWARDING_ADDR_OR_NIL(hhdr) || HBLK_IS_FREE(hhdr), FALSE)) {
       h = GC_next_used_block(h);
       if (h == 0) return(0);
       hhdr = GC_find_header((ptr_t)h);
@@ -1819,7 +1819,8 @@

     if (!GC_dirty_maintained) { ABORT("dirty bits not set up"); }
     for (;;) {
-       if (EXPECT(IS_FORWARDING_ADDR_OR_NIL(hhdr), FALSE)) {
+       if (EXPECT(IS_FORWARDING_ADDR_OR_NIL(hhdr)
+                  || HBLK_IS_FREE(hhdr), FALSE)) {
           h = GC_next_used_block(h);
           if (h == 0) return(0);
           hhdr = GC_find_header((ptr_t)h);
@@ -1850,7 +1851,8 @@
     hdr * hhdr = HDR(h);

     for (;;) {
-       if (EXPECT(IS_FORWARDING_ADDR_OR_NIL(hhdr), FALSE)) {
+       if (EXPECT(IS_FORWARDING_ADDR_OR_NIL(hhdr)
+                  || HBLK_IS_FREE(hhdr), FALSE)) {
           h = GC_next_used_block(h);
           if (h == 0) return(0);
           hhdr = GC_find_header((ptr_t)h);
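
For anyone who wants the idea without the surrounding collector code, here is a toy model of what the extra HBLK_IS_FREE test does: when looking for the next block to push from, a free block is now skipped just like a missing header.  The struct, the flag value (picked to match the hb_flags = 4 in the gdb output quoted below, which is only an assumption about what that value means), and sketch_next_used are all hypothetical and are not the libgc data structures.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_FREE 0x4u  /* hypothetical "block is free" flag bit */

struct sketch_hdr {
    unsigned flags;       /* SKETCH_FREE set means the block is on a free list */
};

/* Return the index of the first allocated block at or after `start`,
 * or -1 if there is none.  A NULL header models "no header here";
 * the SKETCH_FREE test models the check the patch adds. */
static int sketch_next_used(const struct sketch_hdr *const hdrs[], int n,
                            int start)
{
    assert(start >= 0);
    for (int i = start; i < n; i++) {
        const struct sketch_hdr *h = hdrs[i];
        if (h == NULL || (h->flags & SKETCH_FREE) != 0) {
            continue;     /* skip missing headers and free blocks alike */
        }
        return i;
    }
    return -1;
}
```

Without the flag test in the loop, the first free block would be returned as if it were in use, which is the situation the assertion caught.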

Hans

> -----Original Message-----
> From: ktreichel at web.de [mailto:ktreichel at web.de]
> Sent: Wednesday, August 20, 2008 12:10 AM
> To: Boehm, Hans
> Cc: Bruce Hoult; gc at napali.hpl.hp.com
> Subject: RE: [Gc] Segfault in GC_mark_from in libgc 7.1 (released tarball)
>
> On Wednesday, 20.08.2008, at 00:46 +0000, Boehm, Hans wrote:
> > Unfortunately:
> >
> > a) This is all consistent with the hypothesis that somehow a pointer
> > to 0xb50000 with a size of 131088 was pushed on the mark stack, i.e.
> > the collector somehow failed to notice that this is a free heap
> > block.  In fact the end of the block on the mark stack, minus the
> > 131088 hb_descr (object size) is exactly 0xb50000, which I have a
> > hard time dismissing as a coincidence.
> >
> > b) After staring at the code for a while, I'm increasingly convinced
> > that this should be impossible.
> >
> > Could you try putting an assertion into PUSH_OBJ like so
> >
> > Index: gc_pmark.h
> > ===================================================================
> > RCS file: /cvsroot/bdwgc/bdwgc/include/private/gc_pmark.h,v
> > retrieving revision 1.5
> > diff -u -r1.5 gc_pmark.h
> > --- gc_pmark.h  26 Jul 2008 00:51:35 -0000      1.5
> > +++ gc_pmark.h  20 Aug 2008 00:25:14 -0000
> > @@ -138,6 +138,7 @@
> >  { \
> >      register word _descr = (hhdr) -> hb_descr; \
> >          \
> > +    GC_ASSERT(!HBLK_IS_FREE(hhdr)); \
> >      if (_descr != 0) { \
> >          mark_stack_top++; \
> >          if (mark_stack_top >= mark_stack_limit) { \
> >
> > and rebuilding with assertions enabled?
> >
> > I'd really like to know how this entry gets pushed on the mark stack.
>
> This is where the assertion aborts the program the first time.
>
> #2  0x00000000004c79d9 in GC_push_marked (h=<value optimized out>,
>     hhdr=0x79b270) at mark.c:1775
> 1775                 PUSH_OBJ(p, hhdr, GC_mark_stack_top_reg, mark_stack_limit);
>
> (gdb) print hhdr
> $9 = (hdr *) 0x79b270
>
> (gdb) print *hhdr
> $2 = {hb_next = 0xb32000, hb_prev = 0x0, hb_block = 0x7b5000,
>   hb_obj_kind = 2 '\002', hb_flags = 4 '\004', hb_last_reclaimed = 1,
>   hb_sz = 159744, hb_descr = 131088, hb_large_block = 1 '\001',
>   hb_map = 0x79b550, hb_n_marks = 1, hb_marks = {1, 0, 0, 0, 1}}
>
> (gdb) print *GC_mark_stack_top_reg
> $6 = {mse_start = 0x7b4f00 "", mse_descr = 240}
>
>
> ***Heap sections:
> Total heap size: 958464
> Section 0 from 0x7b0000 to 0x7c0000 0/16 blacklisted
> Section 1 from 0x7c0000 to 0x7e6000 0/38 blacklisted
> Section 2 from 0x947000 to 0x95a000 0/19 blacklisted
> Section 3 from 0x95a000 to 0x973000 0/25 blacklisted
> Section 4 from 0x983000 to 0x9a4000 0/33 blacklisted
> Section 5 from 0xaf1000 to 0xb1d000 0/44 blacklisted
> Section 6 from 0xb1d000 to 0xb58000 0/59 blacklisted
>
> ***Free blocks:
> Free list 1 (Total size 36864):
>         0x947000 size 4096 not black listed
>         0x961000 size 4096 not black listed
>         0x964000 size 4096 not black listed
>         0x967000 size 4096 not black listed
>         0x96b000 size 4096 not black listed
>         0x96f000 size 4096 not black listed
>         0x983000 size 4096 not black listed
>         0xaf4000 size 4096 not black listed
>         0xaf6000 size 4096 not black listed
> Free list 2 (Total size 8192):
>         0xaf1000 size 8192 not black listed
> Free list 4 (Total size 16384):
>         0x9a0000 size 16384 not black listed
> Free list 5 (Total size 20480):
>         0xb1b000 size 20480 not black listed
> Free list 7 (Total size 28672):
>         0xb13000 size 28672 not black listed
> Free list 32 (Total size 315392):
>         0x7b5000 size 159744 not black listed
>         0xb32000 size 155648 not black listed
> Total of 425984 bytes on free list
>
>
> > Is your application using some of the more unusual interfaces like
> > "typed" allocation?
> >
>
> Yes, we are using typed allocation.
>
> Klaus
>
> > Thanks.
> >
> > Hans
> >
> > > -----Original Message-----
> > > From: ktreichel at web.de [mailto:ktreichel at web.de]
> > > > > ...
> > > > > ***Heap sections:
> > > > > Total heap size: 958464
> > > > > Section 0 from 0x7ad000 to 0x7bd000 0/16 blacklisted
> > > > > Section 1 from 0x7bd000 to 0x7e3000 0/38 blacklisted
> > > > > Section 2 from 0x944000 to 0x957000 0/19 blacklisted
> > > > > Section 3 from 0x957000 to 0x970000 0/25 blacklisted
> > > > > Section 4 from 0x980000 to 0x9a1000 0/33 blacklisted
> > > > > Section 5 from 0xaee000 to 0xb1a000 0/44 blacklisted
> > > > > Section 6 from 0xb1a000 to 0xb55000 0/59 blacklisted
> > > > >
> > > > > (gdb) print *GC_find_header(0xb50000)
> > > > > $8 = {hb_next = 0xb18000, hb_prev = 0x0, hb_block = 0x7b2000,
> > > > >   hb_obj_kind = 2 '\002', hb_flags = 4 '\004', hb_last_reclaimed = 1,
> > > > >   hb_sz = 20480, hb_descr = 131088, hb_large_block = 1 '\001',
> > > > >   hb_map = 0x798550, hb_n_marks = 1, hb_marks = {1, 0, 0, 0, 1}}
> > > > >
> > > > > For locations >= 0xb51000 GC_find_header returns 0.
> > > > >
> > > > > Setting GC_no_dls to 1 before initializing the GC causes the
> > > > > segfault to disappear.
> > > > I'm concerned that this perturbs things just enough for the
> > > > problem to disappear, without actually removing the cause.
> > > > Can you call GC_dump() or GC_print_static_roots() at this point,
> > > > and see whether a block around that region is registered as part
> > > > of the root set?  It might also be worth looking at
> > > > /proc/<pid>/maps to see if there are any other mappings close to
> > > > there, though it looks like there aren't.
> > >
> > > Program received signal SIGSEGV, Segmentation fault.
> > > [Switching to Thread 0x2b1fb0d365f0 (LWP 13558)]
> > > GC_mark_from (mark_stack_top=0x79d180, mark_stack=0x79d000,
> > >     mark_stack_limit=0x7ad000) at mark.c:796
> > > 796               deferred = *(word *)limit;
> > >
> > > (gdb) print limit
> > > $1 = 0xb55358 <Address 0xb55358 out of bounds>
> > >
> > > This access causes the segfault
> > >
> > > (gdb) call GC_dump()
> > > ***Static roots:
> > > From 0x727d90 to 0x75ba40  (temporary)
> > > From 0x2b1fafb0cc00 to 0x2b1fafb15990  (temporary)
> > > From 0x2b1fafd2acd8 to 0x2b1fafd2b230  (temporary)
> > > From 0x2b1faff33d78 to 0x2b1faff34bf0  (temporary)
> > > From 0x2b1fb014ad80 to 0x2b1fb014dab0  (temporary)
> > > From 0x2b1fb0350d68 to 0x2b1fb0351100  (temporary)
> > > From 0x2b1fb05a6dd0 to 0x2b1fb05a70b8  (temporary)
> > > From 0x2b1fb07bfbc8 to 0x2b1fb07c4370  (temporary)
> > > From 0x2b1fb09dadd8 to 0x2b1fb09db3f8  (temporary)
> > > From 0x2b1fb0d2b768 to 0x2b1fb0d342b8  (temporary)
> > > From 0x2b1faf8cdbc0 to 0x2b1faf8ceca8  (temporary)
> > > From 0x2b1fb1338e18 to 0x2b1fb1339030  (temporary)
> > > Total size: 327128
> > >
> > > ***Heap sections:
> > > Total heap size: 958464
> > > Section 0 from 0x7ad000 to 0x7bd000 0/16 blacklisted
> > > Section 1 from 0x7bd000 to 0x7e3000 0/38 blacklisted
> > > Section 2 from 0x944000 to 0x957000 0/19 blacklisted
> > > Section 3 from 0x957000 to 0x970000 0/25 blacklisted
> > > Section 4 from 0x980000 to 0x9a1000 0/33 blacklisted
> > > Section 5 from 0xaee000 to 0xb1a000 0/44 blacklisted
> > > Section 6 from 0xb1a000 to 0xb55000 0/59 blacklisted
> > >
> > > ***Free blocks:
> > > Free list 1 (Total size 45056):
> > >         0xb16000 size 4096 not black listed
> > >         0xb1e000 size 4096 not black listed
> > >         0x944000 size 4096 not black listed
> > >         0x95e000 size 4096 not black listed
> > >         0x961000 size 4096 not black listed
> > >         0x964000 size 4096 not black listed
> > >         0x968000 size 4096 not black listed
> > >         0x96c000 size 4096 not black listed
> > >         0x980000 size 4096 not black listed
> > >         0xaf1000 size 4096 not black listed
> > >         0xaf3000 size 4096 not black listed
> > > Free list 2 (Total size 8192):
> > >         0xaee000 size 8192 not black listed
> > > Free list 4 (Total size 16384):
> > >         0x99d000 size 16384 not black listed
> > > Free list 5 (Total size 40960):
> > >         0xb50000 size 20480 not black listed
> > >         0xb18000 size 20480 not black listed
> > > Free list 33 (Total size 163840):
> > >         0x7b1000 size 163840 not black listed
> > > Total of 274432 bytes on free list
> > >
> > > ***Blocks in use:
> > > (kind(0=ptrfree,1=normal,2=unc.):size_in_bytes, #_marks_set)
> > > (4:64,26)(4:48,33)(2:8160,1)(4:80,9)(4:160,1)(1:48,44)(4:112,3
> > > 1)(0:1344,2)(1:96,42)(1:64,3)(1:16,2)(1:368,1)(2:48,2!=172)(2:
> > > 131088,1)(1:96,42)(4:2096,1)(4:48,2)(1:64,20)(4:64,1)(4:96,0)(1:
> > > 32,0)(1:48,0)(4:32,14)(4:64,0)(4:48,1)(4:368,0)(4:80,0)(4:112,
> > > 1)(4:208,0)(1:96,42)(4:48,0)(4:2176,1)(2:20496,1)(4:2176,1)(4:
> > > 64,0)(1:32,12)(4:144,0)(4:80,6)(4:64,2)(1:1344,1)(4:112,36)(4:
> > > 112,36)(4:112,36)(4:48,7)(4:48,85)(1:32,2)(4:48,85)(4:48,85)(4
> > > :48,85)(1:32,0)(1:576,0)(4:224,2)(1:96,42)(4:192,0)(1:96,42)(4
> > > :80,0)(1:96,42)(4:48,55)(4:64,1)(1:32,0)(4:48,0)(4:304,0)(4:25
> > > 6,0)(4:48,38)(4:80,0)(4:1344,2)(1:96,41)(4:80,0)(1:32,9)(1:134
> > > 4,1)(4:48,31)(4:128,0)(4:64,0)(4:2048,1)(4:64,30)(4:48,28)(4:9
> > > 6,3)(4:64,33)(4:48,40)(4:64,34)(4:2096,1)(4:64,34)(4:48,40)(4:
> > > 96,1)(4:64,33)(4:48,41)(4:64,34)(4:80,3)(4:64,37)(4:48,34)(4:6
> > > 4,38)(4:64,42)(4:48,29)(4:64,43)(4:64,35)(4:48,35)(4:64,39)(4:
> > > 64,33)(4:48,36)(4:64,33)(4:48,41)(4:64,32)(4:48,40)(4:64,35)(4
> > > :64,33)(4:48,48)(4:48,51)(4:64,20)(1:2048,1)(2:240,17!=255)(4:
> > > 176,0)(1:176,1)(1:304,1)(1:4848,1)(1:4848,1)(1:96,42)(0:16,182)(
> > > 2:8208,1)(1:112,1)(1:32,63)(1:80,4)(1:96,42)(4:16,12)(4:32,68)(4:96,1)
> > > blocks = 125, bytes = 684032
> > >
> > > ***Finalization statistics:
> > > 3 finalization table entries; 0 disappearing links 0 objects are
> > > eligible for immediate finalization
> > >
> > > The start of /proc/<pid>/maps
> > >
> > > 00400000-00528000 r-xp 00000000 08:07 3657391 /home/kt/Downloads/DotNet/CVS/pnet-cvs/pnet/engine/ilrun
> > > 00727000-00728000 r-xp 00127000 08:07 3657391 /home/kt/Downloads/DotNet/CVS/pnet-cvs/pnet/engine/ilrun
> > > 00728000-0072a000 rwxp 00128000 08:07 3657391 /home/kt/Downloads/DotNet/CVS/pnet-cvs/pnet/engine/ilrun
> > > 0072a000-00b55000 rwxp 0072a000 00:00 0 [heap]
> > >
> > > All other addresses are >= 0x2b1faf6b0000
> > >
> > > > >
> > > > > So it looks like the false object is pushed on the mark stack
> > > > > during marking the static roots.
> > > > Maybe.  Another possibility is a generic bug that sometimes causes
> > > > the collector to fail to notice that the heap block is free, if it
> > > > happens to find a false reference to it.  I'll stare at the code a
> > > > bit to explore that possibility.  The hb_descr value here is
> > > > suggestive, and has me worried.
> > > >
> > > > Hans
> > > >
> > > > >
> > > > > (gdb) print mark_stack_top[0]
> > > > > $11 = {mse_start = 0xb55358 <Address 0xb55358 out of bounds>,
> > > > >   mse_descr = 109752}
> > > > >
> > > > > (gdb) print mark_stack_top[-1]
> > > > > $12 = {mse_start = 0xb2f000 "P\205\230", mse_descr = 131088}
> > > > >
> > > > > Klaus
> > > > >
> > >
> > > Klaus
> > >
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mark.c.diff
Type: application/octet-stream
Size: 1255 bytes
Desc: mark.c.diff
Url : http://napali.hpl.hp.com/pipermail/gc/attachments/20080820/e91ee2b8/mark.c.obj