[Gc] RE: GC ported to AIX pthreads

Boehm, Hans hans_boehm@hp.com
Mon, 2 Jun 2003 11:52:00 -0700

> Here's apparently the real place where it's dying on IRIX:
> #0  0xfa479b0 in p_str () at regcomp.c:748
> #1  0x10017640 in GC_free (p=0x10178cd8) at ./../../gc/malloc.c:414
> #2  0x100074cc in reverse_test () at ./../../gc/tests/test.c:669
> #3  0x10008dd0 in run_one_test () at ./../../gc/tests/test.c:1282
> #4  0x1000948c in main () at ./../../gc/tests/test.c:1471
> It appears to be crashing in the call to BZERO, although I'm not sure why.
> gdb shows it as a SIGSEGV, but when I run it outside the debugger I get the
> message "Killed" rather than the usual "Segmentation fault", and additionally
> it fails to drop core as a normal segfault should (suggesting corruption of
> the C runtime library).
> The arguments to BZERO seem to be fine in the debugger (i.e. I can access all
> the affected data using gdb), so I suspect corruption of the libc memory pages
> implementing the library call (perhaps we fiddled some of the protection bits
> on the code pages?).
> In any case, it doesn't appear to be a trivial or obvious problem (so we
> should probably go ahead disable MPROTECT_VDB on IRIX 5 for now). Of course,
> I'd encourage someone with experience on the VDB implementation to take a look
> at it..
My guess would be that:

a) We're missing some frames in the stack trace.
b) GC_free is causing a write-protect fault, which is probably OK.
c) (I'm really guessing here.) GC_write_fault_handler() is causing
another SIGSEGV, possibly due to writing to another write-protected
location.  At that point, SIGSEGV is blocked, generating the "Killed"
message.
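
To make (b) and (c) concrete, here is a minimal sketch of the general
mprotect-based dirty-page technique. It is not the collector's actual code:
tracked_page, write_fault_handler and run_write_fault_demo are hypothetical
names, and it assumes a Linux-like POSIX system that reports a write to a
read-only page as SIGSEGV with si_addr set to the faulting address.

```c
#define _GNU_SOURCE             /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical state for one tracked page; a real collector tracks many. */
static char *tracked_page;
static size_t page_size;
static volatile sig_atomic_t page_dirty;    /* set on first write to the page */
static volatile sig_atomic_t fault_count;   /* number of faults we handled */

static void write_fault_handler(int sig, siginfo_t *info, void *ctx)
{
    char *addr = (char *)info->si_addr;
    (void)sig;
    (void)ctx;

    /* A fault outside the tracked page is a genuine bug: restore the
     * default action and return, so the retried access kills the process
     * in the usual way. */
    if (addr < tracked_page || addr >= tracked_page + page_size) {
        signal(SIGSEGV, SIG_DFL);
        return;
    }

    /* Everything written here must itself live in writable memory.
     * SIGSEGV is blocked while this handler runs, so a second fault
     * from inside it cannot be handled by us. */
    page_dirty = 1;
    fault_count++;

    /* Unprotect the page so the faulting store succeeds when retried.
     * (mprotect is not on the POSIX async-signal-safe list; using it
     * here is a common practice, not a guarantee.) */
    mprotect(tracked_page, page_size, PROT_READ | PROT_WRITE);
}

/* Returns the number of write faults taken for two stores to one page. */
int run_write_fault_demo(void)
{
    struct sigaction sa, old;
    volatile char *p;
    int faults;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    tracked_page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    assert(tracked_page != MAP_FAILED);

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = write_fault_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    assert(sigaction(SIGSEGV, &sa, &old) == 0);

    page_dirty = 0;
    fault_count = 0;
    assert(mprotect(tracked_page, page_size, PROT_READ) == 0);

    p = tracked_page;
    p[0] = 'x';     /* first store: faults, handler unprotects the page */
    p[1] = 'y';     /* second store: page is writable, no fault */

    faults = (int)fault_count;
    sigaction(SIGSEGV, &old, NULL);
    munmap(tracked_page, page_size);
    return faults;
}
```

On a system matching those assumptions, run_write_fault_demo() should
return 1: the first store faults and is recovered, the second does not
fault. If the handler itself touched a write-protected location, the nested
SIGSEGV would arrive while the signal is blocked, and the kernel would
typically just terminate the process, which would fit the "Killed" symptom
in (c).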

You might be able to get some more information out of the system log.

In any case, I've disabled MPROTECT_VDB on IRIX for now, since I
can't debug this.