[Gc] Race condition between thread termination and garbage collection under Solaris 10/x86

Burkhard Linke blinke at cebitec.uni-bielefeld.de
Wed Mar 3 04:39:42 PST 2010


On Wednesday 03 March 2010, Boehm, Hans wrote:
> Yucch.


> As far as the Solaris issue is concerned, aside from possibly filing a bug
> with Sun, maybe deferring GC somehow is still the best option.  It would
> probably at least substantially reduce the deadlock frequency. 
> Unfortunately, a hung exiting thread is likely to cause the process to run
> out of memory.  We may want to defer for only so long.  And I think we want
> to do this only on Solaris.

I've filed a bug report about this on bugs.opensolaris.org. According to the 
manpages, only Solaris/x86 is affected: the Solaris/SPARC pthread_exit() 
manpage does not contain the paragraph about blocked signals. I have not run 
the test on a SPARC CPU yet, though.

I'm currently trying to build a workaround, since a pending suspend signal can 
easily be detected in the exit handler. This requires adding an 'exiting' 
state to the internal GC thread struct, which indicates that a thread is 
currently exiting and will block while acquiring the GC lock in 
GC_unregister_my_thread(). The function also has to record the stack pointer 
in a way similar to the suspend signal handler. GC_stop_world() does not need 
to wait for threads in this state.
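
A minimal sketch of the idea, not the actual patch. All names here 
(sketch_thread, SKETCH_SIG_SUSPEND, the sketch_* functions) are hypothetical 
and are not libgc's internals, and using SIGUSR1 as the suspend signal is an 
assumption. It only shows the three pieces: checking for a pending suspend 
signal with sigpending(), marking a thread as exiting after saving an 
approximate stack pointer, and skipping such threads when counting who has to 
acknowledge a stop request. Real code would also need proper synchronization 
around the flag, which is left out here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: stand-in for whatever signal the collector uses to suspend. */
#define SKETCH_SIG_SUSPEND SIGUSR1

/* Hypothetical, heavily reduced per-thread record. */
typedef struct sketch_thread {
    bool  exiting;    /* thread is in its exit handler, signals are blocked */
    void *stack_ptr;  /* approximate stack pointer saved for stack scanning */
} sketch_thread;

/* Returns true if the suspend signal is blocked and pending for the caller. */
static bool sketch_suspend_pending(void)
{
    sigset_t pending;
    if (sigpending(&pending) != 0)
        return false;
    return sigismember(&pending, SKETCH_SIG_SUSPEND) == 1;
}

/* Called by the exiting thread before it tries to take the collector lock.
   approx_sp is the address of a local variable in the caller's frame. */
static void sketch_mark_exiting(sketch_thread *t, void *approx_sp)
{
    assert(t != NULL);
    t->stack_ptr = approx_sp;  /* save first, so the flag implies a valid sp */
    t->exiting = true;
}

/* Number of threads a stop-the-world request still has to wait for. */
static size_t sketch_threads_to_wait_for(const sketch_thread *threads, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (!threads[i].exiting)
            count++;
    }
    return count;
}
```

An exit handler would call sketch_suspend_pending() and, if it returns true, 
sketch_mark_exiting() before blocking on the lock, so that the collector does 
not wait for an acknowledgement that can never arrive.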

I'll send a patch as soon as this works for me.

Nonetheless, the patch will not solve other problems caused by blocked signals 
in cancellation handlers. According to the Solaris manpage, these handlers may 
only call functions that are marked as Cancel-Safe, so this is outside the 
scope of libgc.

