Re: [Gc] FW: GC: Time for GC final release? (draft patch for cancellation)
ivmai at mail.ru
Thu Aug 12 13:46:38 PDT 2010
I've attached the updated version of the patch.
Fri, 6 Aug 2010 09:13:47 -0700 (PDT) Hans Boehm <Hans.Boehm at hp.com>:
> On Wed, 4 Aug 2010, Ivan Maidanski wrote:
> > Hello, Hans!
> > Could you review my next version? (I haven't even compiled it - not
> > enough time these days).
> I know the problem ...
> It generally looks very good to me, based on inspection. I did not
> yet test. Thank you.
> > Also, the Q: is it ok not to check for me!=0 in pthread_cancel()?
> "Me" is not a good name for the variable, since it refers to the...
me ;) (I've copied the code from pthread_exit but forgot to change the name.)
> ... thread being cancelled, not the thread cancelling it. I'd
> suggest "target". I think you do need to check for null, and just
> return ESRCH, or at least not access the flags field, if it is null.
I chose the latter, letting the original function handle/return the error.
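To make that choice concrete, here is a minimal single-threaded sketch in C. It is not the attached patch: `sketch_thread`, `sketch_lookup`, `sketch_real_cancel`, `SKETCH_DISABLED_GC` and the integer thread ids are hypothetical stand-ins for the collector's thread table and for the real pthread_cancel, and all locking is left out. It only shows the shape of the check: the flags field is touched only when `target` is non-null, and the error for an unknown thread comes from the underlying call.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_DISABLED_GC 0x1u   /* hypothetical flag bit */

typedef struct sketch_thread {
    int id;            /* stand-in for pthread_t; 0 marks an unused slot */
    unsigned flags;
} sketch_thread;

/* Hypothetical thread table with two registered threads (ids 1 and 2). */
sketch_thread sketch_threads[4] = { {1, 0u}, {2, 0u} };
int sketch_dont_gc = 0;           /* > 0 means collections are disabled */

sketch_thread *sketch_lookup(int id)
{
    size_t i;
    for (i = 0; i < sizeof(sketch_threads) / sizeof(sketch_threads[0]); i++) {
        if (id != 0 && sketch_threads[i].id == id)
            return &sketch_threads[i];
    }
    return NULL;
}

/* Stand-in for the real pthread_cancel: ESRCH for an unknown thread. */
int sketch_real_cancel(int id)
{
    return sketch_lookup(id) != NULL ? 0 : ESRCH;
}

/* Intercepting wrapper: "target" is the thread being cancelled. */
int sketch_cancel(int id)
{
    sketch_thread *target = sketch_lookup(id);

    /* Do not access the flags field if the lookup failed. */
    if (target != NULL && (target->flags & SKETCH_DISABLED_GC) == 0) {
        target->flags |= SKETCH_DISABLED_GC;
        sketch_dont_gc++;         /* disable GC once per exiting thread */
    }
    /* The underlying function handles and returns any error. */
    return sketch_real_cancel(id);
}
```

Testing the flag before setting it keeps the disable count at one per thread even if the same thread is cancelled twice.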
> pthread_exit looks fine as is.
> > Q2: What are the conds of setting GC_INTERCEPT_PTHREAD_EXIT in gcconfig.h?
> We probably shouldn't bother intercepting pthread_cancel with NO_CANCEL_SAFE.
Since [NO_]CANCEL_SAFE is a build-time macro, it's problematic to conditionally intercept pthread_cancel. So our pthread_cancel simply passes through to the original function if CANCEL_SAFE is not defined.
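A rough sketch of that pass-through, assuming the wrapper is always compiled in and only its bookkeeping is guarded by the build-time macro. `sketch_gc_disable`, `sketch_real_cancel` and the two counters are hypothetical stand-ins used here so the fragment can be compiled on its own; only the CANCEL_SAFE name comes from the discussion.

```c
int sketch_disable_calls = 0;      /* counts stand-in GC-disable calls */
int sketch_real_cancel_calls = 0;  /* counts calls to the stand-in original */

/* Stand-in for disabling the collector. */
void sketch_gc_disable(void)
{
    sketch_disable_calls++;
}

/* Stand-in for the original pthread_cancel. */
int sketch_real_cancel(int id)
{
    (void)id;
    sketch_real_cancel_calls++;
    return 0;
}

/* The wrapper exists in every build; only its body depends on the macro. */
int sketch_cancel(int id)
{
#ifdef CANCEL_SAFE
    sketch_gc_disable();           /* bookkeeping only in CANCEL_SAFE builds */
#endif
    return sketch_real_cancel(id); /* always forward to the original */
}
```

Built without -DCANCEL_SAFE, the wrapper reduces to a plain forwarding call, so client code that was compiled against the redirect still links and behaves like an unwrapped pthread_cancel.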
> Pthread_exit should probably still be intercepted.
> My guess is that we should always intercept pthread_exit on Posix
> systems. Since we seem to have issues on both Solaris and Linux,
> it wouldn't surprise me if this problem were fairly pervasive.
> I'm not sure we need the macro. I think pthread_exit needs to be
> treated just like the other intercepted pthread calls.
Ok. Now I intercept them unconditionally on Linux and Solaris. When they are intercepted, I define GC_PTHREAD_EXIT_ATTRIBUTE (in gc_pthread_redirects), which is tested (and used) in pthread_support.c (this looks a bit ugly but is probably better than introducing one more macro).
> > Thu, 29 Jul 2010 22:24:18 +0000 "Boehm, Hans" <hans.boehm at hp.com>:
> >> Sorry for being so slow. I'm usually better with easier questions :-)
> >>> -----Original Message-----
> >>> From: Ivan Maidanski [mailto:ivmai at mail.ru]
> >>> Sent: Tuesday, July 13, 2010 4:28 AM
> >>> To: Boehm, Hans
> >>> Cc: gc at linux.hpl.hp.com
> >>> Subject: Re: [Gc] FW: GC: Time for GC final release? (draft patch
> >>> for cancellation)
> >>> Hello!
> >>> This is a draft/incomplete (and NOT working) patch for case 1.
> >>> To test it, use -D GC_INTERCEPT_PTHREAD_EXIT.
> >>> It is not working because: I don't know how to do GC_enable for
> >>> pthread_exit (since it is a no-return function). In
> >>> GC_thread_exit_proc? Any ideas?
> >> I think it has to be reenabled in GC_thread_exit_proc, right after calling GC_unregister_my_thread. We probably need to add a GC_DISABLED or EXITING flag to the flags field in the thread structure, so that GC_thread_exit_proc can tell whether it needs to reenable the GC.
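The flag-based re-enable suggested above can be sketched roughly as follows. This is a single-threaded simulation, not the collector's code: `sketch_thread`, `sketch_before_exit`, `sketch_dont_gc` and `SKETCH_DISABLED_GC` are hypothetical names, clearing `registered` stands in for GC_unregister_my_thread, and locking is omitted.

```c
#define SKETCH_DISABLED_GC 0x1u   /* hypothetical flag bit */

typedef struct sketch_thread {
    unsigned flags;
    int registered;
} sketch_thread;

int sketch_dont_gc = 0;           /* > 0 means collections are disabled */

/* A collection may start only while nothing has disabled the GC. */
int sketch_collection_allowed(void)
{
    return sketch_dont_gc == 0;
}

/* What an intercepted pthread_exit would do before calling the real,  */
/* no-return function: disable GC and record that fact in the thread.  */
void sketch_before_exit(sketch_thread *me)
{
    if ((me->flags & SKETCH_DISABLED_GC) == 0) {
        me->flags |= SKETCH_DISABLED_GC;
        sketch_dont_gc++;
    }
}

/* Exit handler: unregister first, then re-enable GC only if this      */
/* thread was the one that disabled it.                                */
void sketch_thread_exit_proc(sketch_thread *me)
{
    me->registered = 0;           /* stand-in for GC_unregister_my_thread */
    if ((me->flags & SKETCH_DISABLED_GC) != 0) {
        me->flags &= ~SKETCH_DISABLED_GC;
        sketch_dont_gc--;
    }
}
```

A thread that simply returns from its start routine never sets the flag, so its exit handler leaves the disable count alone.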
> >>> Q: Will pthread_cancel() interception really help us, since it is just
> >>> a send-signal routine (I mean it does not wait for the thread exiting,
> >>> AFAIK)?
> >> Yes. The problem is that if a GC is started between pthread_cancel and thread exit, GC_stop_world will block with the GC lock held, waiting for the exiting thread to respond to the signal, which it won't if it's also trying to grab the GC lock. By not starting the GC during that time, GC_stop_world() shouldn't get called in this interval, with a damaged thread around, and we should be OK.
> >>> Another question: is it enough for pthread_cancel/exit (and also
> >>> GC_start_routine) to use GC_disable, or do we need something close to
> >>> disable_gc_for_dlopen?
> >> After staring at this for a while, I think GC_disable() is sufficient. In the dlopen case, an ongoing mark process might break, because the root set is changing. Thus we need to make sure that no GC mark phase is in progress. I think in the exiting case that shouldn't matter. If I'm wrong, we'll see more deadlocks.
> >> Hans
> >>> Fri, 9 Jul 2010 01:19:06 +0000 "Boehm, Hans" <hans.boehm at hp.com>:
> >>>> It seems to me that we really want two patches related to
> >>> cancellation that aren't yet in the tree:
> >>>> 1) We should deal with the fact that apparently on Solaris and
> >>> probably on Linux we can't collect while a thread is exiting, since
> >>> signals aren't handled properly. This currently gives rise to
> >>> deadlocks. I think the only workaround is to also intercept
> >>> pthread_cancel and pthread_exit and disable GC until the thread exit
> >>> handler is called. That's ugly, because we risk growing the heap
> >>> unnecessarily, and possibly repeatedly. But it seems that we don't
> >>> really have an option in that the process is not in a fully functional
> >>> state while a thread is exiting.
> >>>> 2) ...
> >>>> I was hoping to find some time to work on this this week. But so far,
> >>> it looks like I failed.
> >>>> These are both a bit frustrating, because I think they're really
> >>>> problems in the underlying Posix layers that are likely to also affect
> >>>> other things. And they don't seem to admit good solutions.
> >>>> Hans
> >>>>> -----Original Message-----
> >>>>> From: Ivan Maidanski [mailto:ivmai at mail.ru] ...
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Size: 10188 bytes
Desc: not available
Url : http://napali.hpl.hp.com/pipermail/gc/attachments/20100813/f1dbc817/attachment.obj