[Gc] Hang on Red Hat Enterprise Linux?

Boehm, Hans hans.boehm at hp.com
Mon Apr 26 17:48:28 PDT 2004


I've seen similar behavior before, usually with a slightly buggy glibc.
But I also generally don't test on anything bigger than a 4-way machine.
I do test occasionally on an ancient 4-way X86 machine with RedHat 9.
It seemed OK after I updated glibc and the kernel, though my versions are
still older than yours.

I agree that it would be good to track this down.

If you can get stack traces out of gdb for all the threads after it
hangs (e.g. with "thread apply all bt"), and if the state is roughly
repeatable, that might tell us something.  The parallel mark code uses
pthread condition variables.
It would be nice to know if it looks like we temporarily lose a
notification there.
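
To make clear what a lost notification would look like in those traces,
here is a minimal sketch of the usual condition variable handoff.  It is
not the collector's mark code; the struct, field, and function names
(handoff, work_ready, helper, run_handoff) are made up for illustration.
If every helper is blocked in pthread_cond_wait while the predicate it
waits on is already true, and the thread that set it has moved on, a
wakeup has probably gone missing somewhere.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_HELPERS 16

/* Hypothetical state shared between one notifier and several helpers. */
struct handoff {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    int work_ready;   /* predicate, only read or written under lock */
    int woken;        /* number of helpers that observed work_ready */
};

static void *helper(void *arg)
{
    struct handoff *h = arg;

    pthread_mutex_lock(&h->lock);
    /* Re-check the predicate in a loop.  The broadcast may already have
     * happened before this thread got here, and a wait is allowed to
     * return spuriously. */
    while (!h->work_ready)
        pthread_cond_wait(&h->cv, &h->lock);
    h->woken++;
    pthread_mutex_unlock(&h->lock);
    return NULL;
}

/* Start nhelpers threads, publish the work, and return how many helpers
 * saw it.  Returns -1 if nhelpers is out of range. */
int run_handoff(int nhelpers)
{
    struct handoff h;
    pthread_t tid[MAX_HELPERS];
    int i, rc, woken;

    if (nhelpers < 0 || nhelpers > MAX_HELPERS)
        return -1;

    pthread_mutex_init(&h.lock, NULL);
    pthread_cond_init(&h.cv, NULL);
    h.work_ready = 0;
    h.woken = 0;

    for (i = 0; i < nhelpers; i++) {
        rc = pthread_create(&tid[i], NULL, helper, &h);
        assert(rc == 0);
        (void)rc;
    }

    /* Set the predicate and notify while holding the lock, so that no
     * helper can test work_ready and then miss the broadcast. */
    pthread_mutex_lock(&h.lock);
    h.work_ready = 1;
    pthread_cond_broadcast(&h.cv);
    pthread_mutex_unlock(&h.lock);

    for (i = 0; i < nhelpers; i++)
        pthread_join(tid[i], NULL);

    woken = h.woken;
    pthread_cond_destroy(&h.cv);
    pthread_mutex_destroy(&h.lock);
    return woken;
}
```

Built with -pthread, run_handoff(8) should return 8 on a correct
threads implementation.  A call that never returns, with the helpers
sitting in pthread_cond_wait after work_ready has been set, is the kind
of state that would be worth seeing in the stack traces.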

I can't see immediately why 6.3alpha5 would help, but it's
worth trying it anyway.

Hans

> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com
> [mailto:gc-bounces at napali.hpl.hp.com]On Behalf Of Kenneth C. Schalk
> Sent: Monday, April 26, 2004 4:50 PM
> To: gc at napali.hpl.hp.com
> Subject: [Gc] Hang on Red Hat Enterprise Linux?
> 
> 
> I'm working on getting some code that makes pretty significant use of
> the garbage collector (Vesta) working on a machine running:
> 
>     Red Hat Enterprise Linux AS release 3 (Taroon Update 1)
> 
> This distribution is significantly different from what I've done most
> of my testing on so far in that it uses NPTL rather than LinuxThreads.
> 
> The problem I'm experiencing is that programs linked with the garbage
> collector seem to periodically hang (go to sleep and not accumulate
> any more CPU time).  When I attach to such a program with the
> debugger, I don't see anything obviously wrong.  When I detach the
> debugger, the program starts going again (accumulating more CPU time).
> Sometimes the same program will hang more than once, and attaching and
> detaching the debugger will "un-stick" it again.  This even happens
> with gctest.
> 
> I'm using version 6.2 of the garbage collector.  (I haven't gotten to
> trying 6.3alpha5 yet, maybe tomorrow.)  The garbage collector
> configuration I'm using is:
> 
>     -DATOMIC_UNCOLLECTABLE
>     -DNO_SIGNALS
>     -DNO_EXECUTE_PERMISSION
>     -DSILENT
>     -DALL_INTERIOR_POINTERS
>     -DLARGE_CONFIG
>     -DUSE_MMAP
>     -DGC_LINUX_THREADS
>     -D_REENTRANT
>     -DTHREAD_LOCAL_ALLOC
>     -DPARALLEL_MARK
> 
> I built gctest with the classic Makefile, putting all the above in the
> "CFLAGS" definition line.
> 
> The hardware I'm seeing these problems on is a 4-way Xeon 2.80GHz
> machine with HyperThreading enabled, which looks like an 8-CPU machine
> under Linux.
> 
> I've found experimentally that the hang is less likely to occur with
> gctest when I set the GC_MARKERS environment variable to a number
> smaller than the default number of mark threads (8).  So far I haven't
> seen the hang with GC_MARKERS set to 5 or less, but I have with 6-8.
> (I haven't yet tested this with the application I'm trying to get
> working.)
> 
> It seems pretty clear to me that just attaching and detaching the
> debugger shouldn't have an effect on the program's execution, and I'm
> a little suspicious that this could be a bug in either NPTL or the
> Linux kernel.  It also seems like disabling PARALLEL_MARK would make
> the problem go away, although I'd rather not do that.  If I find that
> setting GC_MARKERS is an effective work-around I may just stick with
> that, but I'd rather fix the problem, and it definitely seemed worth
> bringing up on the list.
> 
> --Ken
> 
> P.S.  A little more version information, in case it turns out to be
> relevant:
> 
> % uname -a
> Linux mmdcs100 2.4.21-9.ELsmp #1 SMP Thu Jan 8 17:08:56 EST 2004 i686 i686 i386 GNU/Linux
> % rpm -q glibc
> glibc-2.3.2-95.6