[Gc] Segfault in GC_mark_from in libgc 7.1 (released tarball)

Klaus Treichel ktreichel at web.de
Tue Aug 19 14:11:25 PDT 2008


On Tuesday, 19.08.2008, at 20:00 +0000, Boehm, Hans wrote:
> 
> > -----Original Message-----
> > From: ktreichel at web.de [mailto:ktreichel at web.de]
> > Sent: Tuesday, August 19, 2008 12:35 AM
> > To: Boehm, Hans
> > Cc: Bruce Hoult; gc at napali.hpl.hp.com
> > Subject: RE: [Gc] Segfault in GC_mark_from in libgc 7.1
> > (released tarball)
> >
> > On Monday, 18.08.2008, at 20:18 +0000, Boehm, Hans wrote:
> > >
> > > > -----Original Message-----
> > > > From: gc-bounces at napali.hpl.hp.com
> > > > [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Klaus Treichel
> > > > Sent: Sunday, August 17, 2008 4:23 AM
> > > > To: Bruce Hoult
> > > > Cc: gc at napali.hpl.hp.com
> > > > Subject: Re: [Gc] Segfault in GC_mark_from in libgc 7.1 (released
> > > > tarball)
> > > >
> > > > On Wednesday, 13.08.2008, at 09:39 +0200, Klaus Treichel wrote:
> > > > > On Wednesday, 13.08.2008, at 10:17 +1200, Bruce Hoult wrote:
> > > > > > 2008/8/13 Klaus Treichel <ktreichel at web.de>:
> > > > > > > Hi,
> > > > > > >
> > > > > > > > what I have found out so far is:
> > > > > > >
> > > > > > > 1. limit is an inaccessible address
> > > > > > > (gdb) print limit
> > > > > > > $26 = 0xb55010 <Address 0xb55010 out of bounds>
> > > > > > >
> > > > > > > where 0xb54fff is accessible.
> > > > > > >
> > > > > > > > 2. limit is in the range between least_ha and greatest_ha so the
> > > > > > > > check doesn't prevent the segfault.
> > > That check should never be needed to prevent a segfault.
> > > GC_least_plausible_heap_addr and GC_greatest_plausible_heap_addr are used to:
> > >
> > What are the lines like
> >       if ((ptr_t)current >= least_ha && (ptr_t)current < greatest_ha) {
> > for?
> It's a quick plausibility check that statistically eliminates a large fraction of candidate pointers without going through the more expensive real pointer validation process.  In particular, small integers should always fail this check.  On 64-bit machines, almost all non-pointers should fail.
> 
> If a candidate pointer passes this test, and fails a later more precise one, it is tracked as a "near miss", and we avoid allocating memory that would later appear to be referenced by it.
> 
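The filter described above can be sketched roughly as follows. This is a minimal illustration, not the collector's code: `plausible_range` and `is_plausible` are hypothetical names, and the bounds passed in stand in for GC_least_plausible_heap_addr and GC_greatest_plausible_heap_addr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the collector's two plausibility bounds. */
typedef struct {
    uintptr_t least;     /* lowest address that might be in the heap */
    uintptr_t greatest;  /* one past the highest such address */
} plausible_range;

/* Cheap filter: true only if the candidate lies in [least, greatest).
   Passing it does NOT prove the address is a valid heap pointer, or
   even that it is mapped; it only rules out obvious non-pointers. */
static bool is_plausible(const plausible_range *r, uintptr_t candidate)
{
    assert(r->least <= r->greatest);
    return candidate >= r->least && candidate < r->greatest;
}
```

A small integer such as 42 fails the filter at once, while an address just past the last heap section can still pass it if the upper bound is slightly larger than the heap end. That is why the check cannot be expected to prevent a segfault on its own.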
> >
> > > a) Eliminate obviously implausible "candidate pointers" and hence speed up pointer validation.
> > > b) Detect "near misses" in support of the blacklisting code.
> > >
> > > It would be interesting to know what GC_find_header(0xb54fff) is.  Does it think this block is in the heap?  Or is this ostensibly part of the root set?  If it's non-null, i.e. if the block is in the heap, what's *GC_find_header(0xb54fff)?
> >
> > The block is in the GC heap.
> Sort of. hb_flags = 4 means it's a free block.
> 
> >
> > ***Heap sections:
> > Total heap size: 958464
> > Section 0 from 0x7ad000 to 0x7bd000 0/16 blacklisted
> > Section 1 from 0x7bd000 to 0x7e3000 0/38 blacklisted
> > Section 2 from 0x944000 to 0x957000 0/19 blacklisted
> > Section 3 from 0x957000 to 0x970000 0/25 blacklisted
> > Section 4 from 0x980000 to 0x9a1000 0/33 blacklisted
> > Section 5 from 0xaee000 to 0xb1a000 0/44 blacklisted
> > Section 6 from 0xb1a000 to 0xb55000 0/59 blacklisted
> >
> > (gdb) print *GC_find_header(0xb50000)
> > $8 = {hb_next = 0xb18000, hb_prev = 0x0, hb_block = 0x7b2000,
> >   hb_obj_kind = 2 '\002', hb_flags = 4 '\004', hb_last_reclaimed = 1,
> >   hb_sz = 20480, hb_descr = 131088, hb_large_block = 1 '\001',
> >   hb_map = 0x798550, hb_n_marks = 1, hb_marks = {1, 0, 0, 0, 1}}
> >
> > For locations >= 0xb51000 GC_find_header returns 0.
> >
> > Setting GC_no_dls to 1 before initializing the GC causes the
> > segfault to disappear.
> I'm concerned that this perturbs things just enough for the problem to disappear, without actually removing the cause.  Can you call GC_dump() or GC_print_static_roots() at this point, and see whether a block around that region is registered as part of the root set?  It might also be worth looking at /proc/<pid>/maps to see if there are any other mappings close to there, though it looks like there aren't.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2b1fb0d365f0 (LWP 13558)]
GC_mark_from (mark_stack_top=0x79d180, mark_stack=0x79d000,
    mark_stack_limit=0x7ad000) at mark.c:796
796               deferred = *(word *)limit;

(gdb) print limit
$1 = 0xb55358 <Address 0xb55358 out of bounds>

This access causes the segfault

(gdb) call GC_dump()
***Static roots:
From 0x727d90 to 0x75ba40  (temporary)
From 0x2b1fafb0cc00 to 0x2b1fafb15990  (temporary)
From 0x2b1fafd2acd8 to 0x2b1fafd2b230  (temporary)
From 0x2b1faff33d78 to 0x2b1faff34bf0  (temporary)
From 0x2b1fb014ad80 to 0x2b1fb014dab0  (temporary)
From 0x2b1fb0350d68 to 0x2b1fb0351100  (temporary)
From 0x2b1fb05a6dd0 to 0x2b1fb05a70b8  (temporary)
From 0x2b1fb07bfbc8 to 0x2b1fb07c4370  (temporary)
From 0x2b1fb09dadd8 to 0x2b1fb09db3f8  (temporary)
From 0x2b1fb0d2b768 to 0x2b1fb0d342b8  (temporary)
From 0x2b1faf8cdbc0 to 0x2b1faf8ceca8  (temporary)
From 0x2b1fb1338e18 to 0x2b1fb1339030  (temporary)
Total size: 327128

***Heap sections:
Total heap size: 958464
Section 0 from 0x7ad000 to 0x7bd000 0/16 blacklisted
Section 1 from 0x7bd000 to 0x7e3000 0/38 blacklisted
Section 2 from 0x944000 to 0x957000 0/19 blacklisted
Section 3 from 0x957000 to 0x970000 0/25 blacklisted
Section 4 from 0x980000 to 0x9a1000 0/33 blacklisted
Section 5 from 0xaee000 to 0xb1a000 0/44 blacklisted
Section 6 from 0xb1a000 to 0xb55000 0/59 blacklisted
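
As a cross-check of the section list above, here is a small sketch that looks an address up in the dumped sections and sums their sizes. The table is copied from the dump; `heap_section`, `find_section` and `total_heap_bytes` are hypothetical helper names, not libgc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uintptr_t start, end; } heap_section;  /* [start, end) */

/* Heap sections exactly as printed by GC_dump() in this report. */
static const heap_section sections[] = {
    {0x7ad000, 0x7bd000}, {0x7bd000, 0x7e3000}, {0x944000, 0x957000},
    {0x957000, 0x970000}, {0x980000, 0x9a1000}, {0xaee000, 0xb1a000},
    {0xb1a000, 0xb55000},
};
#define N_SECTIONS (sizeof sections / sizeof sections[0])

/* Return the index of the section containing addr, or -1 if none does. */
static int find_section(uintptr_t addr)
{
    for (size_t i = 0; i < N_SECTIONS; i++) {
        assert(sections[i].start < sections[i].end);
        if (addr >= sections[i].start && addr < sections[i].end)
            return (int)i;
    }
    return -1;
}

/* Sum of all section sizes in bytes. */
static size_t total_heap_bytes(void)
{
    size_t total = 0;
    for (size_t i = 0; i < N_SECTIONS; i++)
        total += (size_t)(sections[i].end - sections[i].start);
    return total;
}
```

With this table, 0xb54fff falls in section 6, the faulting address 0xb55358 falls in no section, and the sizes add up to the reported total of 958464 bytes.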

***Free blocks:
Free list 1 (Total size 45056):
        0xb16000 size 4096 not black listed
        0xb1e000 size 4096 not black listed
        0x944000 size 4096 not black listed
        0x95e000 size 4096 not black listed
        0x961000 size 4096 not black listed
        0x964000 size 4096 not black listed
        0x968000 size 4096 not black listed
        0x96c000 size 4096 not black listed
        0x980000 size 4096 not black listed
        0xaf1000 size 4096 not black listed
        0xaf3000 size 4096 not black listed
Free list 2 (Total size 8192):
        0xaee000 size 8192 not black listed
Free list 4 (Total size 16384):
        0x99d000 size 16384 not black listed
Free list 5 (Total size 40960):
        0xb50000 size 20480 not black listed
        0xb18000 size 20480 not black listed
Free list 33 (Total size 163840):
        0x7b1000 size 163840 not black listed
Total of 274432 bytes on free list

***Blocks in use:
(kind(0=ptrfree,1=normal,2=unc.):size_in_bytes, #_marks_set)
(4:64,26)(4:48,33)(2:8160,1)(4:80,9)(4:160,1)(1:48,44)(4:112,31)(0:1344,2)(1:96,42)(1:64,3)(1:16,2)(1:368,1)(2:48,2!=172)(2:131088,1)(1:96,42)(4:2096,1)(4:48,2)(1:64,20)(4:64,1)(4:96,0)(1:32,0)(1:48,0)(4:32,14)(4:64,0)(4:48,1)(4:368,0)(4:80,0)(4:112,1)(4:208,0)(1:96,42)(4:48,0)(4:2176,1)(2:20496,1)(4:2176,1)(4:64,0)(1:32,12)(4:144,0)(4:80,6)(4:64,2)(1:1344,1)(4:112,36)(4:112,36)(4:112,36)(4:48,7)(4:48,85)(1:32,2)(4:48,85)(4:48,85)(4:48,85)(1:32,0)(1:576,0)(4:224,2)(1:96,42)(4:192,0)(1:96,42)(4:80,0)(1:96,42)(4:48,55)(4:64,1)(1:32,0)(4:48,0)(4:304,0)(4:256,0)(4:48,38)(4:80,0)(4:1344,2)(1:96,41)(4:80,0)(1:32,9)(1:1344,1)(4:48,31)(4:128,0)(4:64,0)(4:2048,1)(4:64,30)(4:48,28)(4:96,3)(4:64,33)(4:48,40)(4:64,34)(4:2096,1)(4:64,34)(4:48,40)(4:96,1)(4:64,33)(4:48,41)(4:64,34)(4:80,3)(4:64,37)(4:48,34)(4:64,38)(4:64,42)(4:48,29)(4:64,43)(4:64,35)(4:48,35)(4:64,39)(4:64,33)(4:48,36)(4:64,33)(4:48,41)(4:64,32)(4:48,40)(4:64,35)(4:64,33)(4:48,48)(4:48,51)(4:64,20)(1:2048,1)(2:240,17!=255)(4:176,0)(1:176,1)(1:304,1)(1:4848,1)(1:4848,1)(1:96,42)(0:16,182)(2:8208,1)(1:112,1)(1:32,63)(1:80,4)(1:96,42)(4:16,12)(4:32,68)(4:96,1)
blocks = 125, bytes = 684032

***Finalization statistics:
3 finalization table entries; 0 disappearing links
0 objects are eligible for immediate finalization

The start of /proc/<pid>/maps

00400000-00528000 r-xp 00000000 08:07 3657391                            /home/kt/Downloads/DotNet/CVS/pnet-cvs/pnet/engine/ilrun
00727000-00728000 r-xp 00127000 08:07 3657391                            /home/kt/Downloads/DotNet/CVS/pnet-cvs/pnet/engine/ilrun
00728000-0072a000 rwxp 00128000 08:07 3657391                            /home/kt/Downloads/DotNet/CVS/pnet-cvs/pnet/engine/ilrun
0072a000-00b55000 rwxp 0072a000 00:00 0                                  [heap]

All other addresses are >= 0x2b1faf6b0000
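
To make the comparison with the maps output mechanical, the following sketch parses the "start-end" prefix of one maps line and tests an address against it. `maps_line_contains` is a hypothetical helper written for this note; it relies only on the standard sscanf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Parse the "start-end" prefix of a /proc/<pid>/maps line and report
   whether addr lies in [start, end). Returns false if the line does
   not begin with two hexadecimal numbers separated by '-'. */
static bool maps_line_contains(const char *line, unsigned long addr)
{
    unsigned long start, end;

    assert(line != NULL);
    if (sscanf(line, "%lx-%lx", &start, &end) != 2)
        return false;
    return addr >= start && addr < end;
}
```

Applied to the [heap] line above, 0xb54fff is inside the mapping and 0xb55358 is not, which matches the observed fault.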

> >
> > So it looks like the false object is pushed on the mark stack
> > during marking the static roots.
> Maybe.  Another possibility is a generic bug that sometimes causes the collector to fail to notice that the heap block is free, if it happens to find a false reference to it.  I'll stare at the code a bit to explore that possibility.  The hb_descr value here is suggestive, and has me worried.
> 
> Hans
> 
> >
> > (gdb) print mark_stack_top[0]
> > $11 = {mse_start = 0xb55358 <Address 0xb55358 out of bounds>,
> >   mse_descr = 109752}
> >
> > (gdb) print mark_stack_top[-1]
> > $12 = {mse_start = 0xb2f000 "P\205\230", mse_descr = 131088}
> >
> > Klaus
> >
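
One piece of arithmetic on the values printed above may be worth recording. The sketch assumes that both descriptors are simple length descriptors (low two tag bits zero, as with GC_DS_LENGTH in gc_mark.h) and that the top entry is the unscanned remainder of a larger range; neither assumption is confirmed in this thread. `mark_entry`, `descr_tag` and `range_end` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define DS_TAG_BITS 2                        /* assumed to mirror GC_DS_TAG_BITS */
#define DS_TAGS ((1u << DS_TAG_BITS) - 1)
#define DS_LENGTH 0                          /* assumed to mirror GC_DS_LENGTH */

/* Simplified copy of a mark stack entry (mse_start, mse_descr). */
typedef struct { uintptr_t start; uintptr_t descr; } mark_entry;

/* Low bits of a descriptor select its kind. */
static unsigned descr_tag(uintptr_t descr)
{
    return (unsigned)(descr & DS_TAGS);
}

/* For a length descriptor, the range to scan is [start, start + descr). */
static uintptr_t range_end(mark_entry e)
{
    assert(descr_tag(e.descr) == DS_LENGTH);
    return e.start + e.descr;
}
```

Under those assumptions the top entry (0xb55358, 109752) ends at 0xb70010, far beyond the heap end at 0xb55000. Subtracting 131088, the hb_descr value shown for the free block, from that end gives exactly 0xb50000, the address of that free block. This is consistent with the free block being scanned as if it were a 131088-byte object, but it is only an observation, not a diagnosis.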

Klaus