[Gc] SIGSEGV in mark.c:759 (GC6.8/linux)

Boehm, Hans hans.boehm at hp.com
Tue Sep 19 16:17:55 PDT 2006


Somehow you ended up with an object on the mark stack that is not addressable, or at least not completely addressable.  It would be interesting to know:

- What address is being dereferenced.  If it fails at this particular point, it will be the value of the variable "limit".

- How that object got pushed onto the mark stack.  If you don't have custom mark procedures or the like, I would guess that it is being pushed as part of the root set.  Since it only happens with threads, I would guess there may be an issue with correctly identifying the thread stacks.  Are other threads being created in this process?  Or do you just have two long-lived threads?  I would look at whether the address is in the general vicinity of other thread stacks, using /proc/<pid>/maps.  You can also determine fairly easily what the size of the offending object is, which might tell you something.
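
For the /proc/<pid>/maps check, a minimal C sketch along the following lines may help. It is only an illustration: the names map_range, parse_maps_line and find_mapping are made up for this example and are not part of the collector, and it assumes the usual Linux maps line layout of "start-end perms ..." with hexadecimal addresses.

```c
#include <stdio.h>

/* One parsed line of /proc/<pid>/maps: the range [start, end) and its
   permission string (e.g. "rw-p").  Hypothetical helper type. */
struct map_range {
    unsigned long start;
    unsigned long end;
    char perms[5];
};

/* Parse the leading "start-end perms" fields of a single maps line.
   Returns 1 on success, 0 if the line does not have that shape. */
static int parse_maps_line(const char *line, struct map_range *out)
{
    return sscanf(line, "%lx-%lx %4s",
                  &out->start, &out->end, out->perms) == 3;
}

/* Scan a maps file for the mapping that contains addr.
   Returns 1 and fills *hit if addr is mapped, 0 if no mapping
   contains it, and -1 if the file could not be opened. */
static int find_mapping(const char *maps_path, unsigned long addr,
                        struct map_range *hit)
{
    char line[512];
    struct map_range r;
    int found = 0;
    FILE *f = fopen(maps_path, "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_maps_line(line, &r) && addr >= r.start && addr < r.end) {
            *hit = r;
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Calling find_mapping("/proc/self/maps", addr, &r) from inside the process, or pointing it at a saved copy of the maps file, with addr set to the faulting address would show whether that address lies in any mapped region at all; r.perms then tells you whether the region is readable.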

This is otherwise a default configuration?  USE_PROC_FOR_LIBRARIES is not defined?  Presumably neither is malloc interception, since otherwise you'd be unlikely to get this far with threads.  USE_PROC_FOR_LIBRARIES is probably not 100% reliable with threads in 6.8 and really shouldn't be used with threads.  The symptoms would probably be similar ...

Hans

> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com 
> [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Alec Orr
> Sent: Tuesday, September 19, 2006 6:52 AM
> To: gc at napali.hpl.hp.com
> Subject: [Gc] SIGSEGV in mark.c:759 (GC6.8/linux)
> 
> Good morning:
> 
> I have been seeing a SIGSEGV in GC_mark_from() when running 
> two threads at once (each works individually).  I am looking  
> for ways to troubleshoot this (we're pretty sure we're doing 
> something wrong).  We have tried to track down the cause 
> without much luck (Valgrind, Purify, dmalloc, etc).
> 
> We are aware that the GC will try to read past the stack, and 
> the SIGSEGV should be caught.  Could this SIGSEGV be caused 
> by stack corruption or a timing problem?
> 
> I am using:
> - Linux pcname 2.4.20-8 #1 Thu Mar 13 17:54:28 EST 2003 i686 
> i686 i386 GNU/Linux
> - GC 6.8 (posix threads, full-debug-enabled, gc-assertions-enabled).
> 
> Below are 5 thread stack traces; only #3 (the first one listed) 
> is of interest, since the others appear to be sleeping or waiting 
> on a condition.
> 
> Thank you in advance for any advice,
> Alec
> 
> 
> Program received signal SIGSEGV, Segmentation fault.
> 
> Thread 3
> #0  0x4002c91d in GC_mark_from (mark_stack_top=0x8052260,
>      mark_stack=0x80520a8, mark_stack_limit=0x805a0a8) at mark.c:759
> #1  0x4002d450 in GC_mark_some (
>      cold_gc_frame=0x405c9b40 "i[\002@;&\002@\234�\003@") 
> at mark.c:361
> #2  0x400226d0 in GC_stopped_mark (stop_func=0x40021b00 
> <GC_never_stop_func>)
>      at alloc.c:531
> #3  0x40022a3d in GC_try_to_collect_inner (
>      stop_func=0x40021b00 <GC_never_stop_func>) at alloc.c:378
> #4  0x40022d46 in GC_collect_or_expand (needed_blocks=1, 
> ignore_off_page=0)
>      at alloc.c:1036
> #5  0x400298d0 in GC_alloc_large (lw=587, k=1, flags=0) at malloc.c:62
> #6  0x40029c5a in GC_generic_malloc (lb=2347, k=1) at malloc.c:206
> #7  0x40029f44 in GC_malloc (lb=2347) at malloc.c:333
> #8  0x40025526 in GC_debug_malloc (lb=2288, s=0x400d6fc8 
> "src/apiext.c", i=132)
>      at dbg_mlc.c:492
> #9  0x40025b69 in GC_debug_realloc (p=0x8176038, lb=2288,
>      s=0x400d6fc8 "src/apiext.c", i=132) at dbg_mlc.c:903
> #10 0x400bde4f in int_reallocMemory (pFunc=0x4030230d "src/cimxml_schema.c",
>      pFile=0x4030230d "src/cimxml_schema.c", pLine=6396, vptr=0x8176038,
>      size=2288, heap=0) at src/apiext.c:132
> #11 0x402e6473 in doMarkup (
>      newstring=0x8176038 "Indicates ...") at src/cimxml_schema.c:6395
> ...
> #26 0x403d6c7a in websUrlHandlerRequest (wp=0x8102fa8) at 
> ../handler.c:288
> #27 0x403e33d5 in websReadEvent (wp=0x8102fa8) at ../webs.c:511
> #28 0x403e309d in websSocketEvent (sid=1, mask=2, iwp=0x8102fa8)
>      at ../webs.c:343
> #29 0x403da4e3 in socketDoEvent (sp=0x8103a60) at ../sockGen.c:939
> #30 0x403da37d in socketProcess (sid=1) at ../sockGen.c:881
> #31 0x403e6ae0 in startListening () at main.c:429
> #32 0x403e6535 in startHTTP (pReserved=0xbfffd910, pMutex=0x403c8640,
>      pCond=0x403c8660, pWorked=0x403c8690 "") at main.c:179
> #33 0x403c597e in newThread (pHandle=0x8088c28) at 
> src/cimxml_httpCPA.c:383
> #34 0x400362fb in GC_start_routine (arg=0x805fed8) at 
> pthread_support.c:1212
> #35 0x4010f2b6 in start_thread () from /lib/tls/libpthread.so.0
> #36 0x420de407 in clone () from /lib/tls/libc.so.6
> 
> Thread 1
> #0  0xffffe002 in ?? ()
> #1  0x4202776a in sigsuspend () from /lib/tls/libc.so.6
> #2  0x4003798b in GC_suspend_handler_inner (
>      sig_arg=0x1e <Address 0x1e out of bounds>) at 
> pthread_stop_world.c:212
> #3  0x40037a2e in GC_suspend_handler (sig=30) at 
> pthread_stop_world.c:148
> #4  <signal handler called>
> #5  0xffffe002 in ?? ()
> #6  0x40111379 in pthread_cond_wait@@GLIBC_2.3.2 ()
>     from /lib/tls/libpthread.so.0
> #7  0x00000000 in ?? ()
> 
> Thread 2
> #0  0xffffe002 in ?? ()
> #1  0x420277b1 in sigsuspend () from /lib/tls/libc.so.6
> #2  0x4003798b in GC_suspend_handler_inner (
>      sig_arg=0x1e <Address 0x1e out of bounds>) at 
> pthread_stop_world.c:212
> #3  0x40037a2e in GC_suspend_handler (sig=30) at 
> pthread_stop_world.c:148
> #4  <signal handler called>
> #5  0xffffe002 in ?? ()
> #6  0x420ac5b6 in nanosleep () from /lib/tls/libc.so.6
> #7  0x00000000 in ?? ()
> 
> Thread 4
> #0  0xffffe002 in ?? ()
> #1  0x420277b1 in sigsuspend () from /lib/tls/libc.so.6
> #2  0x4003798b in GC_suspend_handler_inner (
>      sig_arg=0x1e <Address 0x1e out of bounds>) at 
> pthread_stop_world.c:212
> #3  0x40037a2e in GC_suspend_handler (sig=30) at 
> pthread_stop_world.c:148
> #4  <signal handler called>
> #5  0xffffe002 in ?? ()
> #6  0x420ac5b6 in nanosleep () from /lib/tls/libc.so.6
> #7  0x00000000 in ?? ()
> 
> Thread 5
> #0  0xffffe002 in ?? ()
> #1  0x420277b1 in sigsuspend () from /lib/tls/libc.so.6
> #2  0x4003798b in GC_suspend_handler_inner (
>      sig_arg=0x1e <Address 0x1e out of bounds>) at 
> pthread_stop_world.c:212
> #3  0x40037a2e in GC_suspend_handler (sig=30) at 
> pthread_stop_world.c:148
> #4  <signal handler called>
> #5  0xffffe002 in ?? ()
> #6  0x40111504 in pthread_cond_timedwait@@GLIBC_2.3.2 ()
>     from /lib/tls/libpthread.so.0
> #7  0x2a4a1878 in ?? ()
> 
> 
> 


