[Gc] Corrupting internal data structures

Bruce Hoult bruce at hoult.org
Sat Jun 4 21:54:53 PDT 2011

On Mon, May 30, 2011 at 4:56 AM, David Chisnall
<theraven at theravensnest.org> wrote:
> 1) Is it possible to tell the collector to allocate memory for internal use from a separate block?  This would make it much easier to track down errors in the client code.

That would be a really bad idea for production use. In many programs
the average object size is quite small, so adding another 4 or 8 bytes
per object somewhere else could be a major overhead. It would also
potentially add a lot more page faults or TLB/cache misses.

Note that objects are usually only on a free list for a fairly short
time, as only one memory page of objects at a time is put on the free
list for a given object size and (non)atomic/(un)collectable kind.
I suppose something like this *could* be added as a debugging aid, but
it would be new code, not just flicking a compile switch.

> 2) Is it possible for the collector to refrain from reallocating memory that has been recently freed for a while?  This would allow it to be left filled with some known value to make sure that nothing is modifying it.

It would be possible to add an option to fill unused objects with a
pattern immediately after the mark phase, but there is no such option
at present.
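
To make the idea concrete, here is a minimal sketch of what such a
debugging aid might look like. Nothing here is part of the collector:
poison_fill(), poison_check() and POISON_BYTE are hypothetical names
invented for illustration, and the pattern value is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POISON_BYTE 0xDB   /* arbitrary pattern, assumed for this sketch */

/* Overwrite a dead object with the known pattern. */
static void poison_fill(void *obj, size_t size)
{
    assert(obj != NULL);
    memset(obj, POISON_BYTE, size);
}

/* Return the offset of the first byte that no longer holds the
 * pattern, or `size` if the object is still intact. */
static size_t poison_check(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    assert(obj != NULL);
    for (size_t i = 0; i < size; i++) {
        if (p[i] != POISON_BYTE)
            return i;
    }
    return size;
}
```

The check would have to run just before the object is handed out
again; any result other than `size` means something wrote to the
object while it was dead.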

The collector goes to some trouble to avoid touching pages of dead
objects at ALL, until right at the moment that the first object in
that page is about to be allocated again. It is only at that point
that the objects in that page that have cleared mark bits are added to
the free list by chaining them together via pointers stored in the
first 4 or 8 bytes of each object.
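
As a rough illustration of that lazy scheme, here is a toy model (not
the collector's actual code). The types toy_page and toy_heap, the
function names, and the sizes are all assumptions made up for the
sketch; the real collector keeps mark bits and page headers quite
differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OBJ_WORDS     2   /* toy object size, in pointer-sized words */
#define OBJS_PER_PAGE 8   /* toy page capacity */

/* A toy "page": object storage plus one mark bit per object. */
struct toy_page {
    void *words[OBJS_PER_PAGE * OBJ_WORDS];
    bool  marked[OBJS_PER_PAGE];
};

/* A toy heap: one page waiting to be swept, and the current free list. */
struct toy_heap {
    struct toy_page *pending;
    void *free_list;
};

/* Chain every unmarked object in the page into a free list, using the
 * first word of each dead object as the link. Returns the list head. */
static void *toy_sweep_page(struct toy_page *pg)
{
    void *head = NULL;
    assert(pg != NULL);
    for (int i = OBJS_PER_PAGE - 1; i >= 0; i--) {
        if (!pg->marked[i]) {
            void **obj = &pg->words[i * OBJ_WORDS];
            obj[0] = head;   /* link stored inside the dead object */
            head = obj;
        }
    }
    return head;
}

/* Allocate one object. The pending page is touched only here, at the
 * moment the free list runs dry. */
static void *toy_alloc(struct toy_heap *h)
{
    assert(h != NULL);
    if (h->free_list == NULL && h->pending != NULL) {
        h->free_list = toy_sweep_page(h->pending);
        h->pending = NULL;
    }
    void **obj = h->free_list;
    if (obj == NULL)
        return NULL;
    h->free_list = obj[0];
    obj[0] = NULL;           /* clear the link before handing it out */
    return obj;
}

/* Count the objects currently on a free list. */
static int toy_free_list_length(void *head)
{
    int n = 0;
    for (void **p = head; p != NULL; p = (void **)*p)
        n++;
    return n;
}
```

Until toy_alloc() is first called, nothing in the page has been read
or written, which is the property described above.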

Only one page's worth of objects is ever on a free list at one time
(except if you use GC_free; see below).

Filling untraced objects with a pattern would, once again, cause a lot
of cache and potentially VM paging activity that does not happen at
the moment.

This care to not touch pages of dead objects is, in my opinion, one of
the big reasons that the collector performs well against other GCs and
mallocs, even those that are supposedly much more sophisticated.

Note that if you call GC_free() explicitly then that object is
immediately added to the start of the relevant free list and will be
THE NEXT object of that size to be allocated.

Use-after-explicit-free errors therefore corrupt the GC's data
structures very quickly.
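
A toy model shows why. This is only a sketch of a LIFO free list with
the link kept in the first word of each free object; lifo_free(),
lifo_alloc() and the slots array are hypothetical names for the
illustration and do not use the real GC_free() or GC_malloc().

```c
#include <assert.h>
#include <stddef.h>

static void *lifo_list = NULL;   /* head of the toy free list */
static void *slots[3][2];        /* three toy objects of two words each */

/* Push an object onto the front of the free list. */
static void lifo_free(void *obj)
{
    assert(obj != NULL);
    *(void **)obj = lifo_list;
    lifo_list = obj;
}

/* Pop the most recently freed object, or NULL if the list is empty. */
static void *lifo_alloc(void)
{
    void *obj = lifo_list;
    if (obj != NULL)
        lifo_list = *(void **)obj;
    return obj;
}

/* The object freed last is the next one handed out. */
static int lifo_demo_reuse(void)
{
    lifo_list = NULL;
    lifo_free(slots[0]);
    lifo_free(slots[1]);
    return lifo_alloc() == (void *)slots[1];
}

/* A write through a stale pointer overwrites the free-list link, so
 * the allocator later hands out slots[2], which was never freed. */
static int lifo_demo_corruption(void)
{
    lifo_list = NULL;
    lifo_free(slots[0]);
    lifo_free(slots[1]);              /* list: slots[1] -> slots[0] */

    void **stale = slots[1];          /* client kept a dangling pointer */
    stale[0] = (void *)slots[2];      /* use-after-free write hits the link */

    (void)lifo_alloc();               /* returns slots[1] */
    return lifo_alloc() == (void *)slots[2];  /* live object handed out */
}
```

In the second demo a single stray store is enough to make the
allocator return an object that is still in use, and slots[0] is lost
from the list entirely.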
