[Gc] Threaded GC questions

Travis Griggs tgriggs at key.net
Wed Aug 17 10:31:07 PDT 2005


On Aug 16, 2005, at 17:27, Boehm, Hans wrote:

>>
>> 1) Is it sufficient to use -DGC_THREADS to get the right semantics
>> for libgc and pthreads? Or does one still need to worry about having
>> the one header file precede the other? I use Debian Linux.
> GC_THREADS should be defined before gc.h is included.  And gc.h
> must usually be included in files that create threads, so that
> pthread_create
> can be intercepted and threads registered with the GC.  (If you notice
> mistakes/anachronisms in the documentation files, patches are always
> appreciated.)

OK, thanks for this.
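Just to make sure I've got it, I take that to mean something like this
near the top of any file that spawns threads. The worker function below
is just a made-up placeholder, and I'm assuming the header installs as
plain gc.h here:

    #define GC_THREADS              /* must be defined before gc.h */
    #include <pthread.h>
    #include "gc.h"                 /* pthread_create is now redirected
                                       to the GC's wrapper */

    static void *worker(void *arg)  /* hypothetical thread body */
    {
        (void) arg;
        return GC_MALLOC(64);       /* a collected allocation */
    }

    int main(void)
    {
        pthread_t t;
        GC_INIT();
        pthread_create(&t, NULL, worker, NULL);  /* thread registered
                                                    with the GC */
        pthread_join(t, NULL);
        return 0;
    }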

>>
>> 2) Mutator threads need to periodically suspend all other threads; it
>> sounds like this is done via a program-wide signal and specialized
>> signal handlers which yield/spin/whatever so that the mutator is not
>> disturbed during collection. I'm sure this is fine for normal Linux
>> scheduling semantics, but what if one is using pthread_setschedparam()
>> to make some of the threads SCHED_RR and/or SCHED_FIFO at higher
>> levels? Is there a possibility for priority inversion in this case?
>> I'm curious what other signal interactions one can expect as well...
> Suspended threads normally wait for a second signal.  In rare cases,
> they might sleep.  I don't immediately see any priority inversion
> issues.
>
> The default locking strategy without thread-local allocation does
> use custom locks that spin/yield/sleep.  I think that should be
> avoidable by defining USE_PTHREAD_LOCKS (a rarely tested option).

Is this something the client program needs to define when building
against libgc? Or does libgc itself have to be built with this option?
I think I'd like to look at using this. Not because I don't trust your
locks :), but since I already have to worry about the pthread stuff,
I'd like to keep worrying about just that and nothing else.

>>
>> 3) The mentioned page talks about the thread-local allocation
>> strategy. Am I right in understanding that when using this, the
>> degree of inter-thread synchronization is reduced (or is it removed
>> completely?) at the expense of slightly higher allocation times? Do I
>> have to use both -DTHREAD_LOCAL_ALLOC as well as include
>> gc_local_alloc.h? Or is the first enough? Or do I have to actually
>> rebuild the gc lib? Can thread-local allocation be mixed with
>> non-local allocation?
> This is changing with 7.0.  Under 6.x, you have to build the GC with
> THREAD_LOCAL_ALLOC defined, and then call the custom allocation
> functions from gc_local_alloc.h (or include that file after defining
> GC_REDIRECT_TO_LOCAL, and then use the uppercase names).
>
> In (experimental!) 7.0alpha4, if you build with THREAD_LOCAL_ALLOC
> defined, GC_malloc gets you thread-local allocation.

Build the collector? Or the client program? If all I have to do is add
-DTHREAD_LOCAL_ALLOC to my Makefile, that's a desirable thing. :)
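Just so I'm reading the 6.x route right, I'm picturing the client side
looking roughly like this. This is guesswork based on your description
(the defines and header name come from it, the helper function is
mine), and it assumes the collector itself was built with
THREAD_LOCAL_ALLOC and thread support:

    #define GC_THREADS
    #define GC_REDIRECT_TO_LOCAL    /* make the uppercase macros use
                                       the local allocator */
    #include "gc.h"
    #include "gc_local_alloc.h"

    void *alloc_node(size_t n)      /* hypothetical helper */
    {
        return GC_MALLOC(n);        /* should now be served from this
                                       thread's local buffers */
    }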

>
> Thread-local allocation is usually a good thing, both for allocation
> times and processor scalability.  It should really be the default if
> you need thread support.  It may cost you a bit of space.

OK, I'm interested. Is this site
(<http://www.hpl.hp.com/personal/Hans_Boehm/gc/gc_source/experimental/>)
the place to get it? How close to a solid release is this?

Does thread-local allocation remove the need to pause all other
threads? Space is something I have to spare.

>>
>> I'd appreciate any insights/suggestions on the best possible way to
>> mate libgc with our program:
>>
>> An example instance of our program might be running as many as 10+
>> threads at 4 different priority levels. The lowest level (normal
>> pthread scheduling semantics) is where the socket server runs; it
>> accepts new connections (creating a thread per connection) and
>> services outside requests. These connections rarely allocate memory,
>> usually a large structure or two. Mostly they just facilitate queries
>> against the run state.
> Without parallel marking, the collector just runs inside the thread
> that triggered the GC.  This may not be a good thing in your
> environment,
> since its priority will be unpredictable.  However, once it
> acquires the allocation lock, I'd expect it to run as everything
> else blocks waiting on the lock.  I'm not sure how this would
> account for a deadlock.

So using parallel marking sounds like it would probably be a good thing
for us. I'll look around; same questions here, I guess: is it an aspect
of the way the library is built, or of the client?

What about incremental? One of the web pages I read said something
about "most applications are fine as is". I decided for the time being
to run that way, but wondered if I'd want that long term.
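If we do try it, my reading of the docs is that it's just a single call
at startup, something like this (assuming the platform supports it):

    #include "gc.h"

    int main(void)
    {
        GC_INIT();
        GC_enable_incremental();   /* opt in to incremental/generational
                                      collection */
        /* ... rest of startup ... */
        return 0;
    }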

One of the things working with the Smalltalk garbage collector has
taught me is that simple things are simple to tune, but for anything
complex you really want some way of profiling what's going on in order
to tune it. What (if any) tools or techniques can one use to determine
how well the collector is working with one's environment? For example,
one of the things I've wondered: how often is it occurring, and how
long is it taking? Also, we've seen some processor-time numbers lately
that showed our CPU usage climbing over time. We wondered, "is this a
heap fragmentation thing?" Is there a way to prove/disprove this?
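For instance, I could imagine periodically dumping a few of the
counters gc.h exports, along these lines (assuming GC_gc_no,
GC_get_heap_size() and GC_get_free_bytes() are fair game for client
code), but I don't know whether that tells the whole story:

    #include <stdio.h>
    #include "gc.h"

    /* Crude idea: call this from a housekeeping timer/thread and watch
       how the numbers move over a long run. */
    void report_gc(void)
    {
        fprintf(stderr, "GCs: %lu  heap: %lu bytes  free: %lu bytes\n",
                (unsigned long) GC_gc_no,
                (unsigned long) GC_get_heap_size(),
                (unsigned long) GC_get_free_bytes());
    }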

>
> It would be nice to see the (relevant pieces of) the thread stacks
> after a deadlock.  Usually it's fairly easy to tell what went wrong.

I'll see if I can grab these.

> Note that depending on your gdb and gc version, gdb itself may induce
> deadlocks.  The most recent gc versions work around one of the
> problems there, but there may be others.

Thanks for the reply, Hans.

--
Travis Griggs
Objologist
"It had better be a pretty good meeting, to be better than no meeting
at all" -- Boyd K Packer




