[Gc] Race condition between thread termination and garbage
collection under Solaris 10/x86
blinke at cebitec.uni-bielefeld.de
Wed Mar 3 04:39:42 PST 2010
On Wednesday 03 March 2010, Boehm, Hans wrote:
> As far as the Solaris issue is concerned, aside from possibly filing a bug
> with Sun, maybe deferring GC somehow is still the best option. It would
> probably at least substantially reduce the deadlock frequency.
> Unfortunately, a hung exiting thread is likely to cause the process to run
> out of memory. We may want to defer for only so long. And I think we want
> to do this only on Solaris.
I've filed a bug report about this on bugs.opensolaris.org. According to the
manpages, only Solaris/x86 is affected: the Solaris/SPARC pthread_exit()
manpage does not contain the paragraph about blocked signals. I have not run
the test on a SPARC CPU yet, though.
I'm currently trying to build a workaround, since a pending suspend signal can
easily be detected in the exit handler. This requires adding an 'exiting'
state to the internal GC thread struct, which indicates that a thread is
currently exiting and will block when it tries to acquire the GC lock in
GC_unregister_my_thread(). The function also has to record the stack pointer
in a way similar to the suspend signal handler. Threads in this state do not
need to be waited for in GC_stop_world().
I'll send a patch as soon as this works for me.
Nonetheless, the patch will not solve other problems caused by blocked signals
in cancellation handlers. According to the Solaris manpage, these handlers may
only call functions that are marked as Cancel-Safe, so this is out of scope.