[Gc] crash in gc_mark_from with dlclose..

Boehm, Hans hans.boehm at hp.com
Thu Jan 25 14:31:22 PST 2007


[Catching up on email after a trip:]

The collector includes a dlopen wrapper (GC_dlopen()).  This does
disable GC during the call, with some other work to make sure that any
in-progress GC finishes first.  That is slightly risky in that it might
be that a collection should have happened at that point, and the heap
will now be grown instead.  Calling GC_gcollect() before GC_dlopen() is
probably safer.  (Doing it conditionally based on the amount of prior
allocation is probably even better.)
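
A minimal sketch of the "collect conditionally, then GC_dlopen()" idea.
The collector calls are passed in as function pointers so that the policy
can be read and compiled on its own. The hook table, the function names,
and the threshold are all hypothetical; in a real program the hooks would
presumably point at GC_get_bytes_since_gc(), GC_gcollect() and GC_dlopen().

```c
#include <stddef.h>

/* Hypothetical hook table, not part of the collector. In a real program the
   three slots would presumably point at GC_get_bytes_since_gc(),
   GC_gcollect() and GC_dlopen(). */
struct dlopen_gc_hooks {
    size_t (*bytes_since_gc)(void);
    void   (*collect)(void);
    void  *(*open_lib)(const char *path, int mode);
};

/* Collect first only if at least `threshold` bytes were allocated since the
   last collection. The threshold is an application-chosen assumption. */
int should_collect_first(size_t bytes_since_gc, size_t threshold)
{
    return bytes_since_gc >= threshold;
}

/* Run a collection if one looks due, then open the library through the
   GC-aware wrapper. Returns whatever open_lib returned. */
void *collect_then_dlopen(const struct dlopen_gc_hooks *h,
                          const char *path, int mode, size_t threshold)
{
    if (should_collect_first(h->bytes_since_gc(), threshold))
        h->collect();
    return h->open_lib(path, mode);
}
```

With a suitably large threshold, a program that loads and unloads a
library several times in a few seconds would not pay for a full
collection on every load, while a load that follows a lot of allocation
would still collect rather than grow the heap.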

Dlclose is a bigger can of worms, as I'm sure you realize.  You need to
make sure that data segments corresponding to the dynamic library are
unregistered before the GC runs again.  A GC_dlopen-like wrapper may
suffice for that, since the collector normally rebuilds the list of dl
data segments for each GC.  But there are many other opportunities to
register pointers to data and functions in the closed dynamic library
with the GC.  If any of those persist past the dlclose, the collector
will crash.  Some that come to mind are:

1) Finalizers.  If you register a finalizer that resides in the dynamic
library, you will never be able to safely dlclose it.

2) The debug interface stores string pointers (normally file names) in
objects.  If you dlclose the library, those string pointers become
dangling pointers.  Any use of debug allocation in a dynamic library
will prevent ever dlclosing it, I believe.

3) Anything with custom mark procedures and object kinds, if the
procedures or descriptors are in the library.

There are probably others, plus the usual non-GC issues of making sure
that all code residing in the library has really terminated, etc.
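
A minimal sketch of a GC_dlopen-like wrapper for dlclose, under the
assumption that holding off collections for the duration of the call is
enough for the data-segment problem. It does nothing about finalizers,
debug strings, or custom mark procedures that still point into the
library. The hook table and all names are hypothetical; in a real program
the hooks would presumably be GC_disable(), GC_enable() and dlclose(). The
counting stand-ins exist only so that the sketch can be exercised without
the collector.

```c
#include <stddef.h>

/* Hypothetical hooks; in a real program these would presumably be
   GC_disable(), GC_enable() and dlclose(). */
struct unload_hooks {
    void (*gc_disable)(void);
    void (*gc_enable)(void);
    int  (*close_lib)(void *handle);
};

/* Close the library with collections held off, and re-enable them
   afterwards even if the close fails. Returns what close_lib returned. */
int guarded_dlclose(const struct unload_hooks *h, void *handle)
{
    int rc;
    h->gc_disable();
    rc = h->close_lib(handle);
    h->gc_enable();
    return rc;
}

/* Counting stand-ins used only to exercise the wrapper without libgc. */
static int disable_depth;
static int closes_seen;
static void stub_disable(void) { ++disable_depth; }
static void stub_enable(void)  { --disable_depth; }
static int  stub_close(void *handle)
{
    (void)handle;
    /* Count the close only if it happened while the GC was held off. */
    if (disable_depth == 1)
        ++closes_seen;
    return 0;
}

/* Returns 1 if the close ran with the GC disabled and the
   disable/enable calls ended up balanced, 0 otherwise. */
int guarded_dlclose_selfcheck(void)
{
    struct unload_hooks h = { stub_disable, stub_enable, stub_close };
    int rc;
    disable_depth = 0;
    closes_seen = 0;
    rc = guarded_dlclose(&h, NULL);
    return rc == 0 && closes_seen == 1 && disable_depth == 0;
}
```

As with the dlopen wrapper, disabling the collector means a collection
that was due may be replaced by heap growth, so frequent unloads could
show the same heap-size jumps.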

It may be possible to make dlclose work under very restrictive
conditions (which seems to be the best you can do even without a GC),
but I haven't explored it.  (My attitude used to be that dlclose is so
brittle anyway that it wasn't really worth dealing with, but people do
seem to occasionally make good use of it.)


Hans

> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com 
> [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of jim marshall
> Sent: Wednesday, January 17, 2007 12:40 PM
> To: gc at napali.hpl.hp.com
> Subject: [Gc] crash in gc_mark_from with dlclose..
> 
> Hans et al,
>  I believe we have discussed this in the past. My apologies 
> for that but I think we have a better idea of what is 
> happening now.  I just want to see if you agree that it could 
> cause an issue and if so what we can do to work around it.
> 
> We are running on Linux (specifically Red Hat Enterprise 
> Server 3 & 4). 
> Our program is multi-threaded, GC is built for that (see 
> below). In essence we have some secondary threads that will 
> periodically perform a dlopen/dlclose on a shared object. 
> This may occur very often (e.g. 4 load/unloads in a few 
> seconds). We have consistently been crashing in gc_mark_from, 
> it is also consistent that one of these secondary threads is 
> in the process of doing a dlclose or dlopen. I've included 
> snippets of the stack traces below.
> 
> So the question is, does it seem reasonable that the GC might 
> crash while a secondary thread is in the process of a 
> dlopen/dlclose? If so the first question is, how can we tell 
> the GC to not do any collections while we are doing the 
> dlopen/dlclose?  Previously we had implemented an intercept 
> function for dlopen, and in there we called 
> 'GC_disable'/'GC_enable', but we noticed that at some point 
> the heap usage grew very alarmingly (e.g. our process would 
> jump from 8 to 30, to 50 up to 120 megs - not the real 
> numbers just using this to illustrate). 
> If this is not expected we can try and create a small repro 
> case and send it along.
> 
> Any thoughts or comments welcome, as we need to resolve this 
> before we can move forward with our product.
> 
> Thanks!
> 
> CONFIGURE INFO FOR GC BUILD
> ======================
> --prefix=/home/someuser/gc67rel
> --exec-prefix=/home/someuser/gc67rel
> --enable-gc-assertions
> --enable-static=no
> --enable-threads=posix
> 
> STACK TRACE
> =========
> Thread 1  -  You'll notice I left most of our code out of the 
> stack trace. This is because where this thread is coming from 
> in our code is random, based on when the GC occurs.
> 
> #0  0xb75c68f4 in GC_mark_from ()
>    from /usr/ws/server/cserver/../../lib/linux/libgc.so.1
> #1  0xb75c7380 in GC_mark_some ()
>    from /usr/ws/server/cserver/../../lib/linux/libgc.so.1
> #2  0xb75bd0c0 in GC_stopped_mark ()
>    from /usr/ws/server/cserver/../../lib/linux/libgc.so.1
> #3  0xb75bd41a in GC_try_to_collect_inner ()
>    from /usr/ws/server/cserver/../../lib/linux/libgc.so.1
> #4  0xb75bd726 in GC_collect_or_expand ()
>    from /usr/ws/server/cserver/../../lib/linux/libgc.so.1
> #5  0xb75c3900 in GC_alloc_large ()
>    from /usr/ws/server/cserver/../../lib/linux/libgc.so.1
> #6  0xb75c3c8a in GC_generic_malloc ()
>    from /usr/ws/server/cserver/../../lib/linux/libgc.so.1
> #7  0xb75c3f74 in GC_malloc ()
>    from /usr/ws/server/cserver/../../lib/linux/libgc.so.1
> #8  0xb75bfbd6 in GC_debug_malloc ()
>    from /usr/ws/server/cserver/../../lib/linux/libgc.so.1
> #9  0xb7539a2b in int_allocMemory ()
>    from /usr/ws/server/cserver/lib/libcwbemapi.so
> #10 0xb72887e7 in cim_simplersp_toxml ()
> 
> 
> SECONDARY THREAD
> ==============
> 
> #0  0xb73f90de in sigsuspend () from /lib/tls/libc.so.6
> #1  0xb75d11cb in GC_suspend_handler_inner ()
>    from /usr/ws/server/cserver/../../lib/linux/libgc.so.1
> #2  0xb75d126e in GC_suspend_handler ()
>    from /usr/ws/server/cserver/../../lib/linux/libgc.so.1
> #3  <signal handler called>
> #4  0xb74aa4a1 in munmap () from /lib/tls/libc.so.6
> #5  0xb759a06c in dlclose_doit () from /lib/libdl.so.2
> #6  0xb759a06c in dlclose_doit () from /lib/libdl.so.2
> #7  0xb75f5896 in _dl_catch_error_internal () from /lib/ld-linux.so.2
> #8  0xb759a4b6 in _dlerror_run () from /lib/libdl.so.2
> #9  0xb759a032 in dlclose () from /lib/libdl.so.2
> #10 0xb64ec684 in WSIUnloadLib ()
>    from /usr/ws/server/cserver/lib/libcmpiutils.so
> #11 0xb64ec8b0 in WSIUnloadInstanceImpl ()
> 
> 
> 
> Thanks
> 
> --
> Jim Marshall
> Sr. Staff Engineer
> WBEM Solutions, Inc.
> 978-947-3607
> 
> _______________________________________________
> Gc mailing list
> Gc at linux.hpl.hp.com
> http://www.hpl.hp.com/hosted/linux/mail-archives/gc/
> 


