[Gc] Hang/SIGSEGV Problems on Solaris

Boehm, Hans hans.boehm at hp.com
Mon Nov 28 15:28:57 PST 2005


Speaking of oversights, I had missed your comment about there being only
one thread.

That seems rather mysterious, since it's trying to do a
syscall(SYS_sigprocmask, ...), which I don't think should ever block.

This code is trying to do a direct system call, bypassing the thread
library in order to prevent the thread library from doing any context
switches within an lwp.  It's conceivable that this stopped working.  (I
know this is nasty.  That's one reason gc7.0 does things a different
way.  When this code was first written about 10 years ago, the other
approach wasn't an option.)
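
For readers unfamiliar with the technique, here is a minimal sketch of what
"a direct system call, bypassing the thread library" looks like. It is not
the code from solaris_threads.c. It assumes Linux, where the raw call is
SYS_rt_sigprocmask and the kernel signal mask is 8 bytes wide; Solaris uses
SYS_sigprocmask with a different argument list. The helper names
raw_block_all and raw_restore are hypothetical.

```c
#define _DEFAULT_SOURCE          /* assumption: glibc; exposes syscall() */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: the Linux kernel's signal mask is 64 bits (8 bytes) wide.
 * This is NOT the size of the C library's sigset_t. */
#define KERNEL_SIGSET_BYTES 8

/* Hypothetical helper: block every signal without going through the C
 * library's sigprocmask() wrapper. The previous mask is stored in *old.
 * Returns 0 on success, -1 on failure. */
static int raw_block_all(uint64_t *old)
{
    uint64_t all = ~(uint64_t)0;

    assert(old != NULL);
    return (int)syscall(SYS_rt_sigprocmask, SIG_BLOCK, &all, old,
                        (size_t)KERNEL_SIGSET_BYTES);
}

/* Hypothetical helper: restore a mask saved by raw_block_all(). */
static int raw_restore(const uint64_t *old)
{
    assert(old != NULL);
    return (int)syscall(SYS_rt_sigprocmask, SIG_SETMASK, old, NULL,
                        (size_t)KERNEL_SIGSET_BYTES);
}
```

A caller would bracket a critical region with raw_block_all(&saved) and
raw_restore(&saved). The point of the raw call is that no user-level thread
library code runs in between, which is also why it is fragile: it depends on
the kernel interface rather than on the documented library one.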

Hans

-----Original Message-----
From: gc-bounces at napali.hpl.hp.com [mailto:gc-bounces at napali.hpl.hp.com]
On Behalf Of jim marshall
Sent: Monday, November 28, 2005 3:03 PM
Cc: gc at napali.hpl.hp.com
Subject: Re: [Gc] Hang/SIGSEGV Problems on Solaris


Thank you Shiro and Hans. I feel kind of stupid for not knowing that...

Jim Marshall

Boehm, Hans wrote: 
I somehow mistakenly thought I had replied to that ...

The SIGSEGV is expected and not interesting.  It should be caught.
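
To illustrate why a SIGSEGV can be expected: the fault in the trace is
raised inside GC_find_limit, which probes memory until it hits an unmapped
page. The sketch below shows the general probe-and-catch idea only. It is
not the collector's code, the names probe_limit_up and probe_selfcheck are
hypothetical, and it assumes a POSIX system with mmap and MAP_ANONYMOUS.

```c
#define _DEFAULT_SOURCE          /* assumption: glibc; exposes MAP_ANONYMOUS */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf probe_env;     /* where the fault handler jumps back to */

static void probe_fault(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);    /* abandon the faulting read */
}

/* Hypothetical helper: starting at the page containing p, read one byte
 * per page while moving upward. Returns the address of the first page
 * that cannot be read. page must be a power of two. */
static char *probe_limit_up(char *p, size_t page)
{
    struct sigaction sa, old_segv, old_bus;
    char *volatile cur;          /* volatile: must survive siglongjmp */

    assert(page > 0 && (page & (page - 1)) == 0);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = probe_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    cur = (char *)((uintptr_t)p & ~(uintptr_t)(page - 1));
    if (sigsetjmp(probe_env, 1) == 0) {
        for (;;) {
            (void)*(volatile char *)cur;   /* faults on an unreadable page */
            cur += page;
        }
    }
    /* Reached through siglongjmp: cur is the first unreadable page. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return cur;
}

/* Self-check: map two pages, make the second unreadable, and confirm the
 * probe stops exactly there. Returns 1 on success, 0 on failure. */
static int probe_selfcheck(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *base = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    char *limit;
    int ok;

    if (base == MAP_FAILED)
        return 0;
    if (mprotect(base + page, page, PROT_NONE) != 0) {
        munmap(base, 2 * page);
        return 0;
    }
    limit = probe_limit_up(base + 16, page);
    ok = (limit == base + page);
    munmap(base, 2 * page);
    return ok;
}
```

A debugger sees the signal before the program's handler does, so gdb stops
on a fault like this by default even though the program would recover. In
gdb, "handle SIGSEGV nostop noprint pass" passes the signal straight to the
program instead.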

I also cannot make much out of the first backtrace.  Can you send
backtraces for all the threads running at that point?

If this turns out to be a nontrivial issue with the Solaris threads
port, I would prefer to solidify 7.0 instead.  The 7.0 Solaris threads
code is shared with Linux, etc. and uses a different approach.  See

http://www.hpl.hp.com/personal/Hans_Boehm/gc/

for instructions to retrieve it.

(I am not aware of any major issues with the Solaris SPARC port, though
there are known problems with Solaris X86.)

Hans

  
-----Original Message-----
From: gc-bounces at napali.hpl.hp.com 
[mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of jim marshall
Sent: Monday, November 28, 2005 1:27 PM
To: gc at napali.hpl.hp.com
Subject: Re: [Gc] Hang/SIGSEGV Problems on Solaris


Can anyone suggest anything? I'm at a loss as to how to figure out what
the problem is as I can not debug with GDB (see bottom part of message).

Thank you
Jim Marshall

jim marshall wrote:

    
Hello,
I am porting an application from Linux (Red Hat 9.0) to Solaris 2.9.
The program runs fine on Linux (and Windows), and it also builds fine on
Solaris. But when I run it on Solaris it seems to get "stuck" in the
GC. If I run the program and attach GDB to it, this is the stack trace
I get (NOTE: there is only one thread):

Loaded symbols for /usr/lib/libnsl.so.1
Reading symbols from /usr/lib/libmp.so.2...done.
Loaded symbols for /usr/lib/libmp.so.2
Reading symbols from /usr/platform/SUNW,Ultra-30/lib/libc_psr.so.1...done.
Loaded symbols for /usr/platform/SUNW,Ultra-30/lib/libc_psr.so.1
Reading symbols from /usr/lib/libthread.so.1...done.
warning: sol_thread_new_objfile: td_ta_new: Debugger service failed
Loaded symbols for /usr/lib/libthread.so.1
Retry #1:
Retry #2:
Retry #3:
Retry #4:
[New LWP 1]
Symbols already loaded for /usr/lib/libc.so.1
Symbols already loaded for /files/wbem/wsi/cserver/cserver/lib/libgc.so.1
Symbols already loaded for /usr/lib/libdl.so.1
Symbols already loaded for /files/wbem/wsi/output/common/lib/debug/libwsilib.so
Symbols already loaded for /files/wbem/wsi/cserver/cserver/lib/libcimom.so
Symbols already loaded for /files/wbem/wsi/cserver/cserver/lib/libcwbemapi.so
Symbols already loaded for /files/wbem/wsi/cserver/cserver/lib/libinireader.so
Symbols already loaded for /files/wbem/wsi/cserver/cserver/lib/libutils.so
Symbols already loaded for /files/wbem/wsi/cserver/cserver/lib/libWSIaf.so
Symbols already loaded for /files/wbem/wsi/cserver/cserver/lib/libWSIsf.so
Symbols already loaded for /usr/lib/libpthread.so.1
Symbols already loaded for /usr/local/lib/libgcc_s.so.1
Symbols already loaded for /usr/lib/libsocket.so.1
Symbols already loaded for /usr/lib/libnsl.so.1
Symbols already loaded for /usr/lib/libmp.so.2
Symbols already loaded for /usr/platform/SUNW,Ultra-30/lib/libc_psr.so.1
Symbols already loaded for /usr/lib/libthread.so.1
0xff31ca10 in syscall () from /usr/lib/libc.so.1
(gdb) bt
#0  0xff31ca10 in syscall () from /usr/lib/libc.so.1
#1  0xff36964c in preempt_off () at solaris_threads.c:101
#2  0xff36a36c in GC_stop_world () at solaris_threads.c:403

If I run the program directly in gdb I get a sigsegv:
(gdb) r
Starting program: /files/wbem/wsi/cserver/cserver/bin/wsicimom ./wsicimom.ini
[New LWP 1]
[New LWP 2]

Program received signal SIGSEGV, Segmentation fault.
0xff3671d8 in GC_find_limit (p=0xffbff594 "", up=1) at os_dep.c:808
808                     GC_noop1((word)(*result));
(gdb) bt
#0  0xff3671d8 in GC_find_limit (p=0xffbff594 "", up=1) at os_dep.c:808
#1  0xff36724c in GC_get_stack_base () at os_dep.c:1043
#2  0xff365f34 in GC_init_inner () at misc.c:676
(gdb) thread 2
[Switching to thread 2 (LWP    2        )]0xfef65b08 in _thr_setup ()
  from /usr/lib/libthread.so.1
(gdb) bt
#0  0xfef65b08 in _thr_setup () from /usr/lib/libthread.so.1


About my program and system
We are on a Sparc Ultra-30
Running Solaris 2.9 (uname -a = SunOS wbem-sparc 5.9 Generic_112233-01
sun4u sparc SUNW,Ultra-30)
I do not believe we have any patches installed.
Program does use Posix threads
using GC 6.6 built in release, 'make check' succeeds saying "all 1 
tests passed"
GC was built with:
./configure --prefix=~/gcrel --exec-prefix=~/gcrel \
    --enable-gc-assertions --enable-full-debug --enable-static=no \
    --enable-threads=posix

Any thoughts or suggestions on what might be happening?

Thank you
Jim Marshall







_______________________________________________
Gc mailing list
Gc at linux.hpl.hp.com 
http://www.hpl.hp.com/hosted/linux/mail-archives/gc/



      