[Gc] workaround for C++ exceptions problem

skaller skaller at users.sourceforge.net
Wed Jan 25 19:01:04 PST 2006

On Wed, 2006-01-25 at 16:20 -0500, Filip Pizlo wrote:

The point of what I'm saying is that whoever tries to modify
the GC/C++runtime to work with C++ exceptions in g++, must find out
what happens to 'the' exception when another one is legally raised,
and, in what way 'the' exception is restored when the current
one is no longer required.

This can happen in two completely distinct circumstances:

(a) the exception is caught, but in the catch block,
another exception is thrown. 

(b) the exception is NOT caught, but in some
destructor another exception is thrown.

In BOTH cases, there is a new 'top' exception which
must be stored in the 'top' exception slot, and in
both cases the old exception in that slot must be
saved on some kind of stack so it can be restored
when and if the new exception is handled.

In other words, there must be a stack of currently
in flight exceptions, by BOTH our definitions.
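Both circumstances can be reproduced in standard C++. What follows is a minimal sketch, assuming a C++17 compiler (needed for std::uncaught_exceptions); the Probe helper and the function names are made up for illustration. It only shows what the language guarantees, not where g++ stores anything.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: on destruction, records how many exceptions have
// been thrown but not yet caught at that moment.
struct Probe {
    int* out;
    ~Probe() { *out = std::uncaught_exceptions(); }
};

// Case (a): a second exception is thrown and handled inside the catch
// block of the first. Afterwards "first" is the current exception again.
std::string restored_after_nested_catch() {
    try {
        try {
            throw std::runtime_error("first");
        } catch (const std::runtime_error&) {
            try {
                throw std::runtime_error("second");
            } catch (const std::runtime_error&) {
                // "second" is the exception being handled here
            }
            throw;  // rethrows the exception now being handled: "first"
        }
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return "";
}

// Case (b): the first exception is still unwinding when a destructor
// throws a second one and handles it without letting it escape.
struct ThrowsWhileUnwinding {
    int* out;
    ~ThrowsWhileUnwinding() {
        try {
            Probe p{out};  // destroyed while "second" is unwinding
            throw std::runtime_error("second");
        } catch (const std::runtime_error&) {
        }
    }
};

int uncaught_count_in_destructor() {
    int count = 0;
    try {
        ThrowsWhileUnwinding t{&count};
        throw std::logic_error("first");
    } catch (const std::logic_error&) {
    }
    return count;
}

void check_nested_exceptions() {
    assert(restored_after_nested_catch() == "first");
    assert(uncaught_count_in_destructor() == 2);
}
```

In case (a) the runtime has to remember "first" while "second" is handled, and in case (b) two exceptions are uncaught at the same moment, so in either case one pointer is not enough.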

You know where the top of that stack is.
Where is the rest of it??

Are the exceptions pushed onto the machine stack?
Or are they chained together in a linked list?

Both implementations are viable. In both cases
the GC MIGHT be able to keep track of everything
just knowing the top slot, assuming it is in a fixed
place (which would be stupid, since it isn't thread safe
unless that place is in TLS).

Filip proposes to actually force the
exception allocation and deallocation to go thru
calls to the gc, and mark the objects as fresh roots
on allocation, and unroot them on deallocation
so they become collectable.

Hans' proposal is simpler: mark the region containing
the pointer to the object as a root, and allocate
the exception objects as non-roots.

BOTH proposals clearly require hooking the allocation
and deallocation functions used.
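The two hooking schemes can be sketched against a toy collector. This is an illustration only, under stated assumptions: ToyGc and the four hook functions are hypothetical names, not the Boehm GC API and not the g++ runtime; blocks are assumed to contain no pointers; and the deallocation hook for Hans' scheme is assumed to do nothing and leave reclamation to the collector. (In the Itanium C++ ABI, which g++ follows on many targets, the entry points one would hook are __cxa_allocate_exception and __cxa_free_exception; the sketch does not touch them.)

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Toy stand-in for a collector. A block is live if it is pinned as a root
// itself, or if some registered root slot currently points at it.
struct ToyGc {
    std::set<void*> blocks;       // every block the collector owns
    std::set<void*> root_blocks;  // blocks pinned as roots (Filip)
    std::set<void**> root_slots;  // pointer-sized root regions (Hans)

    void* alloc(std::size_t n) {
        void* p = ::operator new(n);
        blocks.insert(p);
        return p;
    }
    bool is_live(void* p) const {
        if (root_blocks.count(p)) return true;
        for (void** slot : root_slots)
            if (*slot == p) return true;
        return false;
    }
    // Frees every unreachable block and returns how many were freed.
    std::size_t collect() {
        std::size_t freed = 0;
        for (auto it = blocks.begin(); it != blocks.end();) {
            if (is_live(*it)) { ++it; continue; }
            ::operator delete(*it);
            it = blocks.erase(it);
            ++freed;
        }
        return freed;
    }
    ~ToyGc() { for (void* p : blocks) ::operator delete(p); }
};

// Filip's scheme: the allocation hook roots the exception object and the
// deallocation hook unroots it, so the next collection may reclaim it.
void* filip_alloc(ToyGc& gc, std::size_t n) {
    void* p = gc.alloc(n);
    gc.root_blocks.insert(p);
    return p;
}
void filip_free(ToyGc& gc, void* p) { gc.root_blocks.erase(p); }

// Hans' scheme: exception objects are ordinary collectable blocks; the slot
// holding the current exception pointer is registered once as a root.
void* hans_alloc(ToyGc& gc, std::size_t n) { return gc.alloc(n); }
void hans_free(ToyGc&, void*) { /* assumed no-op: reachability decides */ }

bool demo_both_schemes() {
    ToyGc gc;
    void* e1 = filip_alloc(gc, 32);
    bool ok = gc.collect() == 0;    // in flight: survives a collection
    filip_free(gc, e1);
    ok = ok && gc.collect() == 1;   // handled: reclaimed

    void* current = nullptr;        // stand-in for the runtime's slot
    gc.root_slots.insert(&current);
    current = hans_alloc(gc, 32);
    ok = ok && gc.collect() == 0;   // reachable through the slot
    hans_free(gc, current);
    current = nullptr;              // the runtime clears its slot
    ok = ok && gc.collect() == 1;   // no longer reachable: reclaimed
    gc.root_slots.erase(&current);
    return ok;
}

void check_hook_schemes() { assert(demo_both_schemes()); }
```

Note that Filip's hooks never look at the slot, and Hans' hooks never touch the root set; that difference is what the rest of the comparison turns on.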

Hans' proposal is sensitive to knowing how exceptions
are stacked, but insensitive to knowing anything
else about the C++ internals.

Filip's proposal doesn't care how the exceptions
are stacked, and doesn't need to know about the
region containing the top exception slot,
so it is more robust in that sense. However, it
relies on the in flight exceptions being allocated
and disposed to mark out their lifetimes in
a way that is compatible with reachability.

Which is better depends on what you expect from
the gcc implementation, which may be processor
dependent for all I know. The ABI document I have
describes the X86_64 ABI.

The important point here is to dispose of the misconception
that there is only one exception in flight at once, and 
thus necessarily only one pointer to keep track of.

The 'current' exception slot is only the top of a stack
of 'in flight' exceptions. When an exception is thrown,
the current top may have to be saved. Where is it saved
when this is required?

If it is 'saved on demand' the machine stack obviously
cannot be used because throwing the new exception is
going to start unwinding it. A heaped linked list
is indicated here.

If it is 'saved' and 'restored' whenever unwinding
recursively enters 'normal' mode -- which occurs
when either 

(a) a catch block is entered or
(b) a destructor is executed 

then the machine stack is the obvious stack to use.

The point is, I think Filip's technique of tracking
the exception allocations and deallocations, marking
the allocated exceptions as roots, and unmarking or actually
deallocating them in the deallocation routine, doesn't
need to know which method is used. It just
works provided the ABI allocates and deallocates
the exceptions 'properly'.

Hans' proposal, on the other hand, just allocates
and deallocates the exceptions as ordinary blocks,
and marks the region containing the top of stack
as a scannable but not collectable root.

Hans' technique may leak if gcc leaves garbage
in that slot. It will also fail catastrophically
if it doesn't account for the particular way the
ABI is managing the exception stack: it isn't
enough to track the top exception, the whole
stack of them must be traced.

If the machine stack is used for the exception
stack, tracing the top exception is enough, since
the gc already traces the machine stack.

If a linked list of heaped nodes is used,
then it isn't the top exception that needs to
be tracked -- it's the top NODE of that list,
which could be represented two ways:

(a) a single pointer to the heap node in the slot
(b) two slots, one for the current exception,
and another for the rest of the list on the heap
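The two representations might look like this. It is a hypothetical sketch: the struct and function names are invented here and are not taken from gcc or from any ABI document. The nodes are built as locals only to keep the example short; in the scenario described above they would live on the heap.

```cpp
#include <cassert>
#include <cstddef>

struct Exception { const char* what; };

// (a) One slot holding a pointer to the top node of the list.
struct NodeA { Exception* exc; NodeA* next; };
struct SlotA { NodeA* top; };

// (b) Two slots: the current exception, plus the rest of the list.
struct NodeB { Exception* exc; NodeB* next; };
struct SlotB { Exception* current; NodeB* rest; };

// Number of in-flight exceptions a tracer finds starting from the slot.
std::size_t reachable_exceptions(const SlotA& slot) {
    std::size_t n = 0;
    for (const NodeA* p = slot.top; p != nullptr; p = p->next) ++n;
    return n;
}
std::size_t reachable_exceptions(const SlotB& slot) {
    std::size_t n = slot.current != nullptr ? 1 : 0;
    for (const NodeB* p = slot.rest; p != nullptr; p = p->next) ++n;
    return n;
}

bool demo_representations() {
    Exception e1{"first"}, e2{"second"}, e3{"third"};  // e3 is the newest

    NodeA a1{&e1, nullptr}, a2{&e2, &a1}, a3{&e3, &a2};
    SlotA sa{&a3};

    NodeB b1{&e1, nullptr}, b2{&e2, &b1};
    SlotB sb{&e3, &b2};

    return reachable_exceptions(sa) == 3 && reachable_exceptions(sb) == 3;
}

void check_representations() { assert(demo_representations()); }
```

Either way the root region has to cover every word of the slot, and the nodes must live in memory the collector scans; with (b), rooting only the word that holds the current exception would miss the older ones.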

So Hans' proposal is quite sensitive to implementation
details, whereas Filip's is sensitive to ISO C++.

Thus, real user code may work with Hans' proposal,
where the address of the 'current' exception is stored
somewhere and remains reachable even when C++ says a
pointer to it would be invalid.

Filip's proposal may allow this too, provided
the deallocation technique is merely to unmark
the exception as a root, and not to actually
delete it.

Note again my analysis is based on what the C++ implementation
is REQUIRED to do. What gcc does on a particular processor
for a particular OS is another thing, which can be established
by consulting the docs and/or examining the code and/or testing.

My main point, I think, is that Filip's technique, whilst probably
less efficient than Hans', is independent of the rest of these
details -- it relies only on correct C++ implementation.

Also, it won't leak if the implementation deletes the current
exception but does NOT clear the pointer to it, whereas
with Hans' technique it will.
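The stale-pointer case can be modelled in a few lines. This is a rough sketch, not either proposal's real code: std::shared_ptr stands in for "the collector can reach it", std::weak_ptr observes whether the object has been reclaimed, and Exc and the function names are invented for the example.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <utility>

struct Exc { int payload = 0; };

// Hans-style: the runtime's slot is itself scanned as a root, so it is
// modelled as an owning pointer. The runtime is done with the exception
// but never clears the slot.
bool hans_retains_with_stale_slot() {
    std::shared_ptr<Exc> scanned_slot = std::make_shared<Exc>();
    std::weak_ptr<Exc> watch = scanned_slot;
    // Deallocation happens here, but nothing resets scanned_slot.
    return !watch.expired();  // still reachable, so it is retained
}

// Filip-style: only the hook-maintained root set keeps exceptions alive.
// The runtime's slot is not scanned, so it is modelled as a raw pointer.
bool filip_reclaims_with_stale_slot() {
    std::set<std::shared_ptr<Exc>> roots;
    auto e = std::make_shared<Exc>();
    std::weak_ptr<Exc> watch = e;
    Exc* unscanned_slot = e.get();
    roots.insert(std::move(e));  // allocation hook: root the exception

    // Deallocation hook: unroot it. The slot is left stale on purpose.
    for (auto it = roots.begin(); it != roots.end(); ++it) {
        if (it->get() == unscanned_slot) {
            roots.erase(it);
            break;
        }
    }
    return watch.expired();  // reclaimed despite the stale slot
}

void check_stale_slot() {
    assert(hans_retains_with_stale_slot());
    assert(filip_reclaims_with_stale_slot());
}
```

In this model the retained object under the Hans-style scheme lives only until the slot is next overwritten, so the leak is bounded by whatever the slot happens to hold.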

Hans' proposal requires knowing much more about the implementation.
BOTH proposals require hooking the allocations and deallocations.
So I'd vote for Filip's because it seems more robust -- but
of course I could easily be wrong, I often am, and there is
nothing like actually *knowing* what gcc does. Some care is needed
though, since it may be processor and OS dependent: gcc runs on many
more OSes than just Linux, and there are more processors around
than the x86 -- I run x86_64 for example, and there is an ABI
for each processor for each OS .. and of course they do change.

The Linux C++ ABI changed recently, and it created months of work
for maintainers of Linux distros such as Ubuntu and Debian.

The current gcc implementation of catch sucks: it doesn't
work properly across DLL boundaries -- so the ABI may
well change again :)

John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: https://felix.sf.net
