[Gc] RE: A few higher level collector questions.

Talbot, George Gtalbot at locuspharma.com
Mon May 18 07:55:20 PDT 2009

> -----Original Message-----
> From: Boehm, Hans [mailto:hans.boehm at hp.com]
> > -----Original Message-----
> > From: gc-bounces at napali.hpl.hp.com
> > [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Talbot, George
> >
> > * How does this collector figure into the eventual addition
> > of GC to a future C++ language standard?  Is anyone working
> > on that right now?
> The C++0x draft goes out of its way to allow, but not require, a garbage
> collected implementation, at least for objects allocated with ::new.  The
> discussion about doing more than that was essentially tabled until the
> committee is done with C++0x.
> The C++0x draft requires some library functionality that's a bit different
> from what's currently in the collector.  I have initial implementations
> that I would like to check in, but I would really like to get a stable 7.2
> out before that.  Checking these in is not likely to improve stability,
> since the changes touch some existing data structures.

I'd be curious to hear about this, but just because I'm a big nerd and find the topic really interesting.  :)

> > * Does anyone have any interest or experience in using what
> > will be the eventual GCC plugin architecture to emit the
> > necessary type information for the collector to use the type
> > interface automatically for C++ programs?
> That would be interesting.  It would probably require language extensions
> to C++0x (possibly similar to what the committee considered before the
> current compromise solution), since C++0x still allows undisguised
> pointers to be stored in things like char arrays.  You probably want a
> static annotation to tell the compiler you're not doing that.  (C++0x
> provides a dynamic declare_no_pointers call, which constitutes most of the
> implementation challenge.  Mike Spertus and I have an upcoming ISMM 09
> paper on this.)

Will you be announcing the paper to the list when it comes out?  (Again, I'm curious in the nerd sense.)

> > * I like the GC_generate_random_backtrace() a lot.  One of
> > the things I thought might be interesting would be a version
> > that generates a large number of backtraces and can output
> > perhaps a histogram of memory usage based on call site in the
> > calling program and possibly some sort of aggregate
> > dependency graphs of how particular call sites' memory is
> > referenced.  Has anyone already done something like this,
> > before I go and try to do it?
> There is a fairly substantial research literature in this area.  You might
> look at the Cork paper by Jump and McKinley in POPL 07 and start looking
> at references from there.  Manuel Serrano and I had a closely related
> paper in ICFP 2000.

I guess the question I'm asking is slightly different:  Has anybody written a tool that I could use _right_now_ to evaluate my program and look for memory leaks?  If not, this is interesting and useful enough for me to write one, but I'd rather not if one of you folks who are 10x smarter than I am has already done so.
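
For concreteness, the aggregation step of such a tool might look like the sketch below.  It is only an illustration under stated assumptions: the Sample record and the histogram_by_call_site function are hypothetical names, and how the samples would actually be gathered from the collector (for example by post-processing backtrace output) is left out entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: one sampled heap object, attributed to the call
// site that allocated it. Not a type provided by the collector.
struct Sample {
    std::string call_site;  // e.g. "parser.cc:120"
    std::size_t bytes;      // size of the sampled object
};

using Row = std::pair<std::string, std::size_t>;

// Sum the sampled bytes per call site and return the rows ordered from
// the largest total to the smallest. Ties keep alphabetical order.
std::vector<Row> histogram_by_call_site(const std::vector<Sample>& samples) {
    std::map<std::string, std::size_t> totals;
    for (const Sample& s : samples) {
        totals[s.call_site] += s.bytes;
    }
    std::vector<Row> rows(totals.begin(), totals.end());
    std::stable_sort(rows.begin(), rows.end(),
                     [](const Row& a, const Row& b) {
                         return a.second > b.second;
                     });
    return rows;
}
```

A caller would fill a std::vector<Sample> from whatever sampling mechanism is available and print the returned rows.  Call sites whose totals keep growing between runs would be the first leak suspects.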

I mean that in all seriousness--I am not a GC expert.  My interest in the collector is purely practical.  I'm trying to write the most CPU-efficient, memory-efficient and correct program that I can, and I've turned to GC because it enables some techniques (compare-and-swap for some large composite data structures in particular) that are not so practical under an explicitly memory-managed model.
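
To illustrate the compare-and-swap pattern mentioned above, here is a minimal sketch.  It is not the code from the actual program: the Snapshot type and the increment and read_counter functions are hypothetical, and the sketch uses plain new so that it stands alone without the collector.  The point to notice is the old snapshot, which cannot safely be deleted by the writer because another thread may still be reading it.  Under a garbage collector it is reclaimed once unreachable; here it simply leaks.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <string>

// Hypothetical composite structure, treated as immutable once published.
struct Snapshot {
    std::map<std::string, int> counters;
};

// Pointer to the currently published snapshot.
std::atomic<const Snapshot*> g_current{new Snapshot{}};

// Copy the current snapshot, modify the private copy, and publish it
// with compare-and-swap. Retry if another thread published first.
void increment(const std::string& key) {
    for (;;) {
        const Snapshot* old_snap = g_current.load(std::memory_order_acquire);
        Snapshot* fresh = new Snapshot(*old_snap);
        ++fresh->counters[key];
        if (g_current.compare_exchange_weak(old_snap, fresh,
                                            std::memory_order_acq_rel,
                                            std::memory_order_acquire)) {
            // old_snap is deliberately not deleted: a reader may still
            // hold it. A collector would reclaim it; this sketch leaks it.
            return;
        }
        delete fresh;  // never published, so no other thread can see it
    }
}

// Readers take one atomic load and then work on a stable snapshot.
int read_counter(const std::string& key) {
    const Snapshot* snap = g_current.load(std::memory_order_acquire);
    auto it = snap->counters.find(key);
    return it == snap->counters.end() ? 0 : it->second;
}
```

Without a collector, the commented line is where hazard pointers or a similar reclamation scheme would have to come in.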

I was originally going to try to implement hazard pointers from the paper for my key data structures, but I came to the realization that hazard pointers are a sort of "GC-lite", and that it would be harder to make my program correct using hazard pointers than it would be to just put a garbage collector in it.  That's what drove me to the BDW GC, as it appears from some of the research I did online that the BDW GC is the most widespread collector in use for C and C++ programs.

I've been asking the questions about the standard support, etc., because at this point:

* I'm pretty committed to GC in my program; it's solved a lot of problems.
* I'm not likely to rewrite my program in something like Java as I can't
  afford the time/effort involved and I don't want to throw away years
  of debugging time already spent.
* I would love to eventually not have to build the collector into my
  program and have the compiler do it.
* I'm finding through experience that any C++ library code that is linked
  into (not compiled into) my program can be a source of memory leaks, as
  it's not "GC-aware".  (Library code that's compiled in from headers works
  out OK if I include gc_cpp.h first.)

Thanks for spending the time to respond to me.  Let me know if there's any help I can give.  Again, I'd be perfectly willing to work on a memory profiling tool for the collector if one doesn't exist already.

George T. Talbot
<gtalbot at locuspharma.com>
