[Gc] workaround for C++ exceptions problem

Filip Pizlo pizlo at mac.com
Tue Jan 24 16:21:09 PST 2006


Hans,

Thanks for the reply.  This is a somewhat long mail, mainly because  
of the many possible approaches.  Hope you don't mind.

> My concern is that this is incredibly brittle.  I've gotten into trouble
> before with accessing libc symbols, which someone then decided to stop
> exporting.  In this case, the intercepted symbol probably needs to be
> exported, but the collector otherwise has to know a lot of stuff that
> might change with libstdc++ changes.  (Maybe you should try posting your
> patch to the libstdc++.  That might scare them into adding a hook :-) )

Agreed.  This is brittle.  I'll post this to libstdc++'s list.

I'll also post this on my site for others to use, as it is a fairly  
decent stop-gap measure.  I actually don't think it is so bad, mainly  
because we are dealing with functions in the ABI, which is supposed
to be standardized (there's that multi-vendor C++ ABI thing that GCC
is trying to comply with).  In particular, __cxa_allocate_exception()
is something that code generated by GCC must call, and presumably,
other compilers also have to emit calls to this same function.
Further, the semantics of this function are likely to be fixed, or
else interoperability would become problematic.

> Could we get away with instead defining a GC_THROW macro that copies the
> object to a GC-visible location (probably just with a bit-wise copy, to
> make sure it's a shallow copy), before it does the actual throw?

I don't mind using macros provided by the GC to throw exceptions.
However, there are some nasty problems.  Take a look, maybe you can
come up with some better ideas than what I've got.

1) GC_THROW will have to bind the exception into a variable, for  
example like this:

template< typename T >
void GC_THROW(const T &exn) {
     // make copy of &exn
     throw exn;
}

In GCC, if the argument to throw is a variable then the object is
copy constructed into the memory provided by
__cxa_allocate_exception().  This implies calling the object's copy
constructor.  This constructor might allocate new memory.  So you run
the risk of having the GC's copy and the C++ runtime's copy of the
exception point at different objects.

This'll happen in my code for Vectors.  Copy construction of a Vector
means allocating a new chunk of memory.  Luckily, I've never written
an exception class that used Vectors, but that's not to say that I
won't at some point in the future.
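
To make the hazard concrete, here is a minimal sketch, assuming a
C++11 compiler.  DeepExn, naive_gc_throw and snapshot_diverges are
hypothetical names made up for this illustration; DeepExn stands in
for an exception class that holds something like a Vector.  The
function takes a bit-wise snapshot of its argument, throws, and then
checks whether the snapshot and the in-flight exception still point at
the same buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical exception type whose copy constructor performs a deep copy.
struct DeepExn {
    char *buf;
    explicit DeepExn(const char *s) : buf(new char[std::strlen(s) + 1]) {
        std::strcpy(buf, s);
    }
    DeepExn(const DeepExn &o) : buf(new char[std::strlen(o.buf) + 1]) {
        std::strcpy(buf, o.buf);
    }
    DeepExn &operator=(const DeepExn &) = delete;
    ~DeepExn() { delete[] buf; }
};

// Naive GC_THROW stand-in: shallow snapshot of the argument, then throw.
template <typename T>
[[noreturn]] void naive_gc_throw(const T &exn, unsigned char *snapshot) {
    std::memcpy(snapshot, static_cast<const void *>(&exn), sizeof(T));
    throw exn;  // copy-constructs a separate exception object
}

// Returns true when the snapshot and the in-flight exception refer to
// different buffers.
bool snapshot_diverges() {
    DeepExn original("oops");
    unsigned char snapshot[sizeof(DeepExn)];
    try {
        naive_gc_throw(original, snapshot);
    } catch (const DeepExn &caught) {
        char *saved;
        std::memcpy(&saved, snapshot + offsetof(DeepExn, buf), sizeof saved);
        assert(saved == original.buf);  // snapshot tracks the argument...
        return saved != caught.buf;     // ...not the object being thrown
    }
    return false;
}
```

The snapshot keeps the argument's buffer visible, while the buffer
owned by the exception object in flight is a different allocation that
the snapshot never mentions.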

You might be able to solve this with the following hackery:

template< typename T >
void GC_THROW(const T &exn) {
    try {
       throw exn;
    } catch (const T &exn) {
       const void *ptr=&exn;   // address of the in-flight exception object
       size_t size=sizeof(T);
       // make copy of the size bytes at ptr
       throw;
    }
}

But I haven't convinced myself yet that this'll work.
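
One piece of this can be checked in isolation.  The C++ standard says
that a bare "throw;" reactivates the existing exception object rather
than creating a new one, so the address seen in the inner handler
should be the address every later handler sees.  The sketch below
tests only that property; Seen, Payload, throw_and_record and
rethrow_keeps_address are hypothetical names, and nothing here talks
to the collector.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record of what the inner handler observed; a real GC_THROW
// would hand these values to the collector instead.
struct Seen {
    const void *ptr;
    std::size_t size;
};

template <typename T>
void throw_and_record(const T &exn, Seen &seen) {
    try {
        throw exn;
    } catch (const T &inflight) {
        seen.ptr = &inflight;   // address inside the runtime's exception storage
        seen.size = sizeof(T);
        throw;                  // rethrow the same exception object
    }
}

struct Payload {
    int a;
    int b;
};

// Returns true if the outer handler sees the object at the recorded address.
bool rethrow_keeps_address() {
    Seen seen = {nullptr, 0};
    Payload p = {1, 2};
    try {
        throw_and_record(p, seen);
    } catch (const Payload &outer) {
        assert(seen.ptr != nullptr);  // the inner handler ran first
        return seen.ptr == &outer && seen.size == sizeof(Payload);
    }
    return false;
}
```

This only shows that the recorded address stays valid across the
rethrow.  It says nothing about when the GC's copy can be discarded,
which is the harder part.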

2) The exception object might be modified after being thrown in a way  
that GC_THROW cannot track.  Consider the code:

try {
    GC_THROW(new Foo());
} catch (Foo *&x) {
    x=new Foo(); // BUG!
    throw;
}

Now, the problem here is that the statement labeled 'BUG' will modify
the contents of the exception slot.  The GC's copy won't see this,
possibly leading to the new Foo getting reclaimed prematurely.

You might be able to get around this by continuing the hack that I  
started with my GC_THROW().  If GC_THROW() saves the size and  
something akin to the type of the original thrown exception, then you  
could code a GC_RETHROW() routine that updates the GC's copy of the  
exception.  The above code could be rewritten as:

try {
    GC_THROW(new Foo());
} catch (Foo *&x) {
    x=new Foo();
    GC_RETHROW();
}

But this isn't quite enough.  Consider a slight perturbation on the  
above:

try {
    GC_THROW(new Foo());
} catch (Foo *&x) {
    x=new Foo(); // BUG!
    ... // some code that forces GC
    GC_RETHROW();
}

Now, when the GC is forced, the Foo will get reclaimed prematurely.

Of course, we could go back to the GC_THROW() and be really crude, as  
follows:

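template< typename T >
void GC_THROW(const T &exn) {
    try {
       throw exn;
    } catch (T &exn) {
       // register the in-flight exception object itself as a root range;
       // char* is used because arithmetic on void* is not valid C++
       char *ptr=reinterpret_cast<char*>(&exn);
       size_t size=sizeof(T);
       GC_add_roots(ptr,ptr+size);
       throw;
    }
}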

Yuck!

You might be disturbed to know that the workaround in my current code  
is something very close to this.  My exception classes have to stuff  
all pointers into the GC heap into a 'GCHolder' smart-pointer type  
thing, which uses GC_add_roots() and GC_remove_roots() in a strategic  
way.  It's pretty horrible, but it works.

3) As far as I understand, each thread must keep track of a stack of  
exceptions, and the only general way of knowing when an exception is  
removed from the stack is to intercept calls into the ABI routines.   
See below.

> Would it be much of a problem to restrict clients to one in-flight
> exception at a time, per thread?  I don't remember the detailed C++
> exception rules, but that seems to be the more tractable case anyway.
> And that might make it possible to dodge the issue of when to throw away
> the saved copy, without requiring changes to the handler.

Can't do that.  Consider the code:

try {
   throw 5;
} catch (...) {
    try {
       throw 6;
    } catch (int x) {
       // ... do something with x ...
    }
    throw;
}

When you do 'throw 6', the '5' has to be kept around.  The runtime  
has to manage a stack of these things.  The size that this stack  
grows to is only limited by some statically known constant times the  
size of the call stack.

A mechanism for exceptions that doesn't allow for this would  
seriously mess up my code.
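
For reference, here is the same nesting as a self-contained function
that reports what each handler saw (nested_throw_demo is a
hypothetical name).  The inner handler receives 6, and the bare
"throw;" afterwards still rethrows 5, so both exception objects have
to be alive at the same time.

```cpp
#include <cassert>

// Returns 10 * (value seen by the inner handler) + (value seen after rethrow).
int nested_throw_demo() {
    int inner_seen = 0;
    try {
        try {
            throw 5;
        } catch (...) {
            try {
                throw 6;        // second exception while 5 is still being handled
            } catch (int x) {
                inner_seen = x;
            }
            assert(inner_seen == 6);  // the inner exception is fully handled here
            throw;                    // rethrows the original 5
        }
    } catch (int outer) {
        return 10 * inner_seen + outer;
    }
    return -1;
}
```

Calling nested_throw_demo() yields 65.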

One solution to this might be the GC_THROW() template function that  
uses GC_add_roots(), and a baseclass for GC-aware exceptions that  
looks something like this:

struct gc_exception {
     virtual ~gc_exception() {
         GC_remove_roots_that_include(this);
     }
};

Where GC_remove_roots_that_include() does something like removing all  
root spans that include the given address.  The idea here is that  
whenever the exception gets blown away, it notifies the GC.  If that  
exception happens to be in the exception slot, then the GC knows to  
remove those roots.  Probably, instead of using GC_add_roots() in  
GC_THROW(), we'd want to use something specialized, so that  
GC_remove_roots_that_include() would know to only look in that  
limited set.

Then, any C++ exception class would either derive from gc_exception,
or would simply include a gc_exception field, as follows:

class MyException {
private:
     ... // stuff
     gc_exception _;

public:
     ... // more stuff
};
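
A rough sketch of how the pieces could fit together, assuming a C++11
compiler.  The root table here is a plain std::map standing in for
whatever specialized registry the GC would provide, so
add_exception_root, remove_roots_that_include, gc_throw and
roots_follow_exception_lifetime are all hypothetical names and none of
this calls the real collector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Stand-in for a collector-side table of exception roots (start -> end).
static std::map<std::uintptr_t, std::uintptr_t> exception_roots;

void add_exception_root(const void *lo, std::size_t size) {
    assert(size > 0);
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(lo);
    exception_roots[start] = start + size;
}

// Drops every registered span that contains addr.
void remove_roots_that_include(const void *addr) {
    std::uintptr_t p = reinterpret_cast<std::uintptr_t>(addr);
    for (auto it = exception_roots.begin(); it != exception_roots.end();) {
        if (it->first <= p && p < it->second) it = exception_roots.erase(it);
        else ++it;
    }
}

struct gc_exception {
    virtual ~gc_exception() { remove_roots_that_include(this); }
};

template <typename T>
void gc_throw(const T &exn) {
    try {
        throw exn;
    } catch (T &inflight) {
        add_exception_root(&inflight, sizeof(T));  // span of the in-flight object
        throw;
    }
}

struct MyException {
    void *payload;          // would point into the GC heap
    gc_exception notifier;  // tells the registry when this object dies
};

// Returns true if the span is registered inside the handler and gone after it.
bool roots_follow_exception_lifetime() {
    bool registered_in_handler = false;
    int dummy = 0;
    try {
        MyException e;
        e.payload = &dummy;
        gc_throw(e);
    } catch (MyException &) {
        registered_in_handler = exception_roots.size() == 1;
    }
    return registered_in_handler && exception_roots.empty();
}
```

Destructors of the local and temporary copies also call
remove_roots_that_include(), but their addresses fall outside the
registered span, so only the destruction of the in-flight object
removes it.  Note that this relies on the exception having a
destructor at all; a thrown raw pointer has none to hook.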

> I admit that solution is also ugly, and that it requires client code
> changes.  But the changes are only needed for exceptions that point to
> garbage collected memory.  Since this problem hasn't generated many
> complaints yet, I'm hoping such code is rare?

Well, three things are rare, at least in the C++ code I've written:

1) Dynamic uses of exceptions,

2) Exceptions that point to GC heap, and

3) Calls into the GC while an exception is in flight.

C++ exceptions can be quite slow, so you end up keeping the dynamic  
uses to a minimum.  The vast majority of my code that uses C++  
exceptions does very little while the exception is in flight; in  
fact, most exceptions just propagate out and cause the program to  
terminate or display a message.

In particular, a program that stuffed pointers to the GC heap into an  
exception would always run correctly provided that: (i) the code  
triggered during the throw didn't allocate any memory via the GC, and  
(ii) no other thread called into the GC while the exception was in  
flight.  This is likely to be a very large set of programs.

There might even be programs out there that have exceptions that  
point to the GC heap, and that allocate memory in catch blocks, but  
because exceptions are thrown rarely enough, you never see problems.   
This was the case in my code for some time.

Filip



>
> Hans
>
>> -----Original Message-----
>> From: gc-bounces at napali.hpl.hp.com
>> [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Filip Pizlo
>> Sent: Sunday, January 22, 2006 4:34 PM
>> To: gc at napali.hpl.hp.com
>> Subject: [Gc] workaround for C++ exceptions problem
>>
>>
>> Hello,
>>
>> Even if our request to add a hook to libstdc++ is approved, the
>> interaction between exceptions and GC will still be problematic on
>> older versions of gcc.  For this reason, I decided to take a crack at
>> developing a small hack that overrides the routines that allocate C++
>> exception storage.  I include a tarball of what I've done.
>>
>> The idea is to override the two routines from eh_alloc.cc and expose
>> two function pointers that can be set to point at
>> GC_malloc_uncollectable/GC_free.
>>
>> I tested this on my Mac with two different compilers (Apple GCC 3.3
>> and 4.0).
>>
>> Most of the action here happens in configure.ac.  I've put comments
>> in this file describing how it detects the relevant information.  It
>> isn't pretty, but it should just work.
>>
>> The idea would be to make this part of the GC.  That is, the relevant
>> bits of autoconf code could be integrated right into the GC's
>> autoconf script, and activated when the user asks for C++ support.
>> The exoverride.cpp file could easily be made part of the GC sources.
>>
>> -Filip
>>
>>
>>