[Gc] SIGSEGVs avoided by calling GC_expand_hp

Jan Stępień jan at stepien.cc
Sun Jan 9 11:36:30 PST 2011


On Sun, 09 Jan 2011 14:04:47 +0300 Ivan Maidanski <ivmai at mail.ru> wrote:
> Sun, 9 Jan 2011 11:53:37 +0100 Jan Stępień <jan at stepien.cc>:
> > On Sun, 9 Jan 2011 00:25:07 +0100 Jan Stępień <jan at stepien.cc> wrote:
> > > On Sun, 09 Jan 2011 00:41:19 +0300
> > > Ivan Maidanski <ivmai at mail.ru> wrote:
> > > > > During work on my thesis I've been developing a C application
> > > > > which uses both the GC and GLib. I've encountered a problem which seems
> > > > > to occur when a lot of small allocations are executed.
> > > > > 
> > > > > I'm on 32 bit GNU/Linux 2.6.35. I've configured gc-7.1 with
> > > > > --disable-threads, --disable-cplusplus and --disable-shared and built a
> > > > > static library. I've instructed GLib to use GC's allocation functions
> > > > > instead of the ones from glibc and called GC_INIT before allocating
> > > > > anything.
> > > > > 
> > > > > The SIGSEGV in GC_malloc_atomic is received at line malloc.c:225.
> > > > > 
> > > > > *opp = obj_link(op);
> > > > > 
> > > > > > After checking in gdb it appears that the op variable tends to have
> > > > > > small integer values below 0xf00, which clearly aren't pointers.
> > > > 
> > > > In other words, GC_aobjfreelist contains a corrupted link pointer.
> > > > Could you debug and find out what places that value in
> > > > GC_aobjfreelist?
> > > > My guess is: the GC had collected an object (which is now in
> > > > GC_aobjfreelist) that is reachable from some untraceable memory, and
> > > > the application modified that object (its first word) some time
> > > > later, thus producing the corrupted link pointer.
> > > > 
> > > > Also, it would be good if you fetched the recent BDWGC snapshot from
> > > > CVS and tested whether the problem still exists. (If you don't want
> > > > to fetch from CVS, here's the snapshot as a tarball:
> > > > http://www.ivmaisoft.com/_mirror/hpl/bdwgc-7_2alpha5-20110107.tar.bz2).
> > > 
> > > Thanks for the reply. Unfortunately, updating doesn't help. Currently
> > > I'm tinkering with GDB, trying to figure out how free lists work
> > > and to check where and how they are altered. I'll post a follow-up
> > > when I get some results.
> > > 
> > 
> > I did some debugging and from what I've learned it seems that
> > something -- presumably GLib -- writes to memory doubly dereferencing
> > pointers returned from GC_malloc_atomic thus corrupting free lists.
> > 
> > Assuming that's the cause of my problems, I began filling all
> > memory returned by BDWGC with zeros in order to prevent access to
> > memory that stores free lists. This has partially helped: the SIGSEGV
> > is no longer received at malloc.c:225 but at mallocx.c:80, as I wrote
> > in my original post:
> > 
> > On Sat, 8 Jan 2011 19:49:11 +0100 Jan Stępień <jan at stepien.cc> wrote:
> > > A workaround I've found is to call GC_expand_hp right after calling
> > > GC_INIT. I have to pass a really big value to solve the problem. For
> > > instance after passing 1024L * 1024L the SIGSEGV is still received but
> > > at mallocx.c:80 because of dereferencing a null pointer:
> > > 
> > >   sz = hhdr -> hb_sz;
> > > 
> > > Passing 1024L * 1024L * 1024L to GC_expand_hp solves the problem but
> > > causes the program to use huge amounts of memory.
> 
> This looks like an alternative to setting GC_DONT_GC=1 ;)
> 
> > 
> > hhdr in GC_realloc is a null pointer. What might be the reason for
> > it? Does it mean that I'm trying to realloc a pointer which points to
> > memory not allocated by the GC?
> 
> More precisely, most probably you're passing a pointer to GC_realloc which hasn't been returned by any GC_malloc function.
> 

I've finally managed to solve it. The problem was basically caused by
my ignorance of GLib's memory management routines. It turns out
that its memory slices do not work well with BDWGC and in fact
duplicate some of the GC's mechanisms. Fortunately, though, they can be
turned off and replaced with simple calls to a given malloc/free
implementation. To do so, you simply have to set the
environment variable G_SLICE to always-malloc [1].

What I eventually did was call
setenv("G_SLICE", "always-malloc", 0) once, before any GLib call. That
solved the problem completely.
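
For anyone hitting the same thing, here is a minimal sketch of that
setenv call, using only the C library so it builds without GLib or the
GC. The two helper names are made up for illustration and are not part
of GLib or BDWGC. The check is deliberately simplistic: it only
recognises the exact value "always-malloc". Note that the third argument
0 means an already exported G_SLICE value is left untouched.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() in strict C modes */
#endif

#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: ask GLib to serve slice requests through plain
 * malloc/free.  overwrite == 0 keeps any G_SLICE value that is already
 * present in the environment.  Returns 0 on success, -1 on error. */
static int request_plain_malloc_slices(void)
{
    return setenv("G_SLICE", "always-malloc", 0);
}

/* Hypothetical helper: report whether G_SLICE is exactly
 * "always-malloc".  Other values GLib may accept are not handled. */
static int slices_use_malloc(void)
{
    const char *value = getenv("G_SLICE");
    return value != NULL && strcmp(value, "always-malloc") == 0;
}
```

In a real program I would call request_plain_malloc_slices() at the very
top of main(), before the first GLib call, and check slices_use_malloc()
afterwards to catch the case where a different G_SLICE value was already
exported and therefore not replaced.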

Thanks a lot for your help, Ivan.

[1] http://library.gnome.org/devel/glib/stable/glib-running.html#G_SLICE

-- 
Jan Stępień <jan at stepien.cc>

