[Gc] Threaded GC questions

Boehm, Hans hans.boehm at hp.com
Tue Aug 16 17:27:43 PDT 2005



> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com 
> [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Travis Griggs
> Sent: Tuesday, August 16, 2005 12:07 AM
> To: gc at napali.hpl.hp.com
> Subject: [Gc] Threaded GC questions
> 
> 
> We have a soft real-time program that uses pthreads and libgc. For the
> most part, we've been really pleased with the performance so far. But
> of late, we've noticed that some of our threads just seem to deadlock.
> I haven't had a chance to use top/ps/etc. to really check out what
> exactly the thread states are yet... After I read
> <http://www.hpl.hp.com/personal/Hans_Boehm/gc/gcdescr.html> and
> <http://www.hpl.hp.com/personal/Hans_Boehm/gc/scale.html>, though, I
> ended up with a number of things I was hoping to get clarified:
> 
> 1) Is it sufficient to use -DGC_THREADS to get the right semantics
> for libgc and pthreads? Or does one still need to worry about having
> the one header file precede the other? I use Debian Linux.
GC_THREADS should be defined before gc.h is included.  And gc.h
must usually be included in files that create threads, so that
pthread_create can be intercepted and threads registered with the GC.
(If you notice mistakes/anachronisms in the documentation files,
patches are always appreciated.)
> 
> 2) Mutator threads need to periodically suspend all other threads. It
> sounds like this is done via a program-wide signal and specialized
> signal handlers which yield/spin/whatever so that the mutator is not
> disturbed during collection. I'm sure this is fine for normal Linux
> scheduling semantics, but what if one is using pthread_sched() to make
> some of the threads sched_rr and/or sched_fifo at higher levels? Is
> there a possibility of priority inversion in this case? I'm curious
> what other signal interactions one can expect as well...
Suspended threads normally wait for a second signal.  In rare cases,
they might sleep.  I don't immediately see any priority inversion
issues.
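
To make "wait for a second signal" concrete, here is a simplified
sketch of the general two-signal pattern.  It is not the collector's
actual code: the signal numbers, the single global flag (good for one
suspended thread only) and all function names are assumptions made for
the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <time.h>

#define SIG_SUSPEND SIGUSR1     /* assumed signal choice */
#define SIG_RESTART SIGUSR2     /* assumed signal choice */

static sem_t suspend_ack;                   /* posted once the thread is parked */
static volatile sig_atomic_t restart_seen;  /* set by the restart handler */
static atomic_int stop_flag;
static atomic_long progress;

static void restart_handler(int sig) { (void)sig; restart_seen = 1; }

static void suspend_handler(int sig)
{
    sigset_t wait_mask;

    (void)sig;
    sigfillset(&wait_mask);
    sigdelset(&wait_mask, SIG_RESTART);  /* only the restart signal wakes us */
    restart_seen = 0;
    sem_post(&suspend_ack);              /* async-signal-safe acknowledgement */
    while (!restart_seen)
        sigsuspend(&wait_mask);          /* park until the second signal */
}

static void *mutator(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_flag))
        atomic_fetch_add(&progress, 1);
    return NULL;
}

/* Returns 1 if the mutator made no progress while suspended. */
int suspend_resume_demo(void)
{
    const struct timespec pause = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    struct sigaction sa;
    pthread_t t;
    long before, after;

    atomic_store(&stop_flag, 0);
    sem_init(&suspend_ack, 0, 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_flags = SA_RESTART;
    sigfillset(&sa.sa_mask);    /* keep SIG_RESTART blocked inside the handlers */
    sa.sa_handler = suspend_handler;
    sigaction(SIG_SUSPEND, &sa, NULL);
    sa.sa_handler = restart_handler;
    sigaction(SIG_RESTART, &sa, NULL);

    if (pthread_create(&t, NULL, mutator, NULL) != 0)
        return 0;
    pthread_kill(t, SIG_SUSPEND);
    while (sem_wait(&suspend_ack) != 0)
        ;                                /* retry if interrupted */
    before = atomic_load(&progress);
    nanosleep(&pause, NULL);             /* a collection would run here */
    after = atomic_load(&progress);
    pthread_kill(t, SIG_RESTART);

    atomic_store(&stop_flag, 1);
    pthread_join(t, NULL);
    sem_destroy(&suspend_ack);
    return before == after;
}
```

The parked thread sits in sigsuspend rather than spinning, so it does
not compete for the CPU with whichever thread is doing the collection.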

The default locking strategy without thread-local allocation does
use custom locks that spin/yield/sleep.  I think that should be
avoidable by defining USE_PTHREAD_LOCKS (a rarely tested option).
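
For readers unfamiliar with the pattern, a spin/yield/sleep lock looks
roughly like the sketch below.  This is a generic illustration using
C11 atomics, not the collector's lock; the limits, the 1 ms nap and
every name in it are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <time.h>

#define SPIN_LIMIT   100    /* illustrative tuning values only */
#define YIELD_LIMIT  10
#define N_THREADS    4
#define N_ITERS      10000

typedef struct { atomic_flag held; } sketch_lock;

static sketch_lock demo_lock = { ATOMIC_FLAG_INIT };
static long shared_counter;          /* protected by demo_lock */

static int try_acquire(sketch_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void sketch_acquire(sketch_lock *l)
{
    const struct timespec nap = { 0, 1000000 };   /* 1 ms */
    int i;

    for (i = 0; i < SPIN_LIMIT; i++)     /* phase 1: busy-wait briefly */
        if (try_acquire(l))
            return;
    for (i = 0; i < YIELD_LIMIT; i++) {  /* phase 2: give up the CPU */
        sched_yield();
        if (try_acquire(l))
            return;
    }
    while (!try_acquire(l))              /* phase 3: sleep between tries */
        nanosleep(&nap, NULL);
}

static void sketch_release(sketch_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static void *bump(void *arg)
{
    int i;

    (void)arg;
    for (i = 0; i < N_ITERS; i++) {
        sketch_acquire(&demo_lock);
        shared_counter++;
        sketch_release(&demo_lock);
    }
    return NULL;
}

/* Returns the final counter value, or -1 if a thread could not be created. */
long lock_demo(void)
{
    pthread_t t[N_THREADS];
    int i, created = 0;

    shared_counter = 0;
    for (i = 0; i < N_THREADS; i++) {
        if (pthread_create(&t[i], NULL, bump, NULL) != 0)
            break;
        created++;
    }
    for (i = 0; i < created; i++)
        pthread_join(t[i], NULL);
    return created == N_THREADS ? shared_counter : -1;
}
```

With real-time scheduling classes, the spin and yield phases of a lock
like this may keep a high-priority waiter on the CPU while a
lower-priority holder cannot run, which is presumably why a plain
pthread mutex is worth trying in that setting.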
> 
> 3) The mentioned page talks about the thread-local allocation
> strategy. Am I right in understanding that when using this, the degree
> of inter-thread synchronization is reduced (or is it removed
> completely?) at the expense of slightly higher allocation times? Do I
> have to use both -DTHREAD_LOCAL_ALLOC as well as include
> gc_local_alloc.h? Or is the first enough? Or do I have to actually
> rebuild the gc lib? Can thread-local allocation be mixed with
> non-thread-local?
This is changing with 7.0.  Under 6.x, you have to build the GC with
THREAD_LOCAL_ALLOC defined, and then call the custom allocation functions
from gc_local_alloc.h (or include that file after defining
GC_REDIRECT_TO_LOCAL, and then use the uppercase names).

In (experimental!) 7.0alpha4, if you build with THREAD_LOCAL_ALLOC
defined, GC_malloc gets you thread-local allocation.

Thread-local allocation is usually a good thing, both for allocation
times and processor scalability.  It should really be the default if
you need thread support.  It may cost you a bit of space.
> 
> I'd appreciate any insights/suggestions on what the best possible way 
> to mate libgc with our program is:
> 
> An example instance of our program might be running as many as 10+ 
> threads at 4 different priority levels. The lowest level (normal 
> pthread scheduling semantics) is where the socket server runs at, 
> accepts new connections (creates a thread per) and services outside 
> requests. These connections rarely allocate memory, usually a large 
> structure or two. Mostly they just facilitate queries against the run 
> state.
Without parallel marking, the collector just runs inside the thread
that triggered the GC.  This may not be a good thing in your
environment, since its priority will be unpredictable.  However, once it
acquires the allocation lock, I'd expect it to run as everything
else blocks waiting on the lock.  I'm not sure how this would
account for a deadlock.

It would be nice to see (the relevant pieces of) the thread stacks
after a deadlock.  Usually it's fairly easy to tell what went wrong.

Note that depending on your gdb and gc version, gdb itself may induce
deadlocks.  The most recent gc versions work around one of the
problems there, but there may be others.
> 
> The next thread level, the first "real time" priority, is where 4
> threads are running. This is where the high-frequency allocation of
> short-lived memory is taking place. Lots of little (< 64 bytes per)
> structures. Each thread is independent of the others. IOW, the
> allocated structures are not shared between them at all (this is what
> leads me to believe that the thread-local stuff would be a bonus).
> These threads use pthread_cond_vars to synchronize with the higher
> level threads (there are no known issues with some of the pthread
> synchronization mechanisms and libgc, are there?).
> 
> Five high level threads run at priorities above even these. They
> basically are user space drivers which interact with data-acquisition
> hardware. These threads allocate nothing. They check for the existence
> of some of the structures allocated dynamically by the interaction
> threads to decide whether to keep running. Other than that, they have
> no interaction with anything GC-ish. They use ring buffers to make
> their raw data available to the GC-intense processing threads, and
> they need to run pretty much uninterrupted and with low latency. We
> haven't actually had any problems (at least I don't think) with
> latency, or anything like that. I have toyed with hoisting these
> processes out of the program into separate programs and then placing
> the ring buffers in shared memory so there is no GC effect on them at
> all (does shared memory create problems for libgc?).
> 
> As I said, any thoughts/suggestions/hints are greatly appreciated. I
> really do love having libgc available. It's made a semi-complicated
> problem much easier to deal with. When I'm not working on soft
> real-time C programs, I'm working in Smalltalk, and I think I've
> discovered that if I had to pick between bolting on objects to C vs.
> bolting on GC to C, I find the latter is worth more (not that I mind
> the other part). Thanks to all who've made libgc available.
> 
> --
> Travis Griggs
> Objologist
> One man's blue plane is another man's pink plane.
> 
> 


