[Gc] Roots observed in GC stack with threads
hans.boehm at hp.com
Fri Nov 30 17:24:43 PST 2007
> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com
> [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Lincoln Quirk
> Sent: Tuesday, November 27, 2007 1:22 PM
> To: skaller
> Cc: gc at napali.hpl.hp.com
> Subject: Re: [Gc] Roots observed in GC stack with threads
> On Wed, Nov 28, 2007 at 05:56:01AM +1100, skaller wrote:
> > On Mon, 2007-11-26 at 21:48 -0500, Lincoln Quirk wrote:
> > > > The program is in a complex infinite loop which results in several
> > > > allocations each iteration. The essence of it is that it's
> > > > constructing a new node in a linked list on each iteration, but the
> > > > head is being advanced as fast as the tail. I (fairly strongly)
> > > > believe I'm updating any pointers to the old head, so that the old
> > > > elements in the list should not be retained and the collector should
> > > > properly collect it as execution continues.
> > Be good if you could give the actual data structure? You don't say
> > whether your linked list is doubly or singly linked.
> > > If it is doubly linked you probably forgot to NULL out the end node's
> > > 'next' pointer, leaving the tail still reachable.
> It's singly linked. The reason I didn't describe it in more
> depth is because it's a significantly more complex data
> structure than a simple singly linked list.
> My question is more about the particular root that the
> backtrace identified. Why is this address (that is inside the
> GC's thread-initialization stack frames) being considered a root?
The collector usually traces from essentially the entire stack. The start-up code either uses the address of a local in a very early frame, or the base of the mapping, as the stack base. Both are a bit too conservative here. Being more careful here would probably make the code significantly uglier. And I suspect your case is either an anomaly, or it could be fixed by clearing a particular location at the right time. That would involve a bit more debugging to try to understand exactly where that value came from.
Fixing this sort of thing generally isn't a big help. Petter's suggestion of clearing next pointers in this particular case is likely to be far more robust.
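To illustrate what clearing next pointers looks like, here is a minimal sketch in C. It is not the poster's actual data structure, which is described as considerably more complex. The struct node and struct queue types, the function names, and the ALLOC macro are all assumptions made for the example. With the collector, ALLOC would presumably be GC_MALLOC from <gc.h>; calloc is used here only so the sketch builds without libgc.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumption: with the collector this would be GC_MALLOC(n) from <gc.h>.
   calloc is a stand-in so the sketch compiles without libgc; in that
   configuration popped nodes are simply never freed. */
#define ALLOC(n) calloc(1, (n))

/* Hypothetical node type for illustration only. */
struct node {
    struct node *next;
    int value;
};

struct queue {
    struct node *head;  /* oldest element, removed first */
    struct node *tail;  /* newest element, appended last */
};

/* Append a new node at the tail.
   Returns 0 on success, -1 if the allocation fails. */
static int queue_push(struct queue *q, int value)
{
    struct node *n = ALLOC(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = NULL;
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    return 0;
}

/* Remove and return the head node, or NULL if the queue is empty.
   The removed node's next pointer is cleared, so a stray reference
   to it (for example a stale value in a conservatively scanned stack
   slot) can keep alive at most that one node, not the rest of the
   list that was allocated after it. */
static struct node *queue_pop(struct queue *q)
{
    struct node *old = q->head;
    if (old == NULL)
        return NULL;
    q->head = old->next;
    if (q->head == NULL)
        q->tail = NULL;
    old->next = NULL;   /* the important line */
    return old;
}
```

Without the `old->next = NULL` assignment, a single falsely retained old head would keep every later node reachable through the chain of next pointers, which is how one misidentified root can turn into unbounded growth in a loop like the one described.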
There have been attempts at trying to parse the stack. The OSX code still does that by default, I think. But past experience has generally been that it's not 100% solid for all applications. There will sometimes be an assembly frame somewhere that doesn't quite follow the calling conventions sufficiently.
This particular problem, and ways to identify it, are discussed in some detail in
Boehm, "Bounding Space Usage of Conservative Garbage Collectors"
There is code in the collector (-DMAKE_BACK_GRAPH) to detect this sort of problem even if there is no stray pointer in the address space that actually causes extra retention.