[Gc] Corrupting internal data structures
theraven at theravensnest.org
Tue Jun 7 05:18:51 PDT 2011
On 5 Jun 2011, at 05:54, Bruce Hoult wrote:
> On Mon, May 30, 2011 at 4:56 AM, David Chisnall
> <theraven at theravensnest.org> wrote:
>> 1) Is it possible to tell the collector to allocate memory for internal use from a separate block? This would make it much easier to track down errors in the client code.
> That would be a really bad idea for production use. In many programs
> the average object size is quite small, so adding another 4 or 8 bytes
> per object somewhere else could be major overhead. It would also
> potentially add a lot more pagefaults or TLB/cache misses.
I'm aware of this. I don't run my non-GC code under Valgrind for production either, but it's useful for debugging.
> Note that objects are usually only on a free list for a fairly short
> time, as only 1 memory page at a time of objects are put on the free
> list for a given object size and (non)atomic/(un)collectable
> I suppose something like this *could* be added as a debugging aid, but
> it would be new code, not just flicking a compile switch.
It would definitely be useful.
>> 2) Is it possible for the collector to refrain from reallocating memory that has been recently freed for a while? This would allow it to be left filled with some known value to make sure that nothing is modifying it.
> It would be possible to add an option to fill unused objects with a
> pattern immediately after the mark phase, but there is no such option
> at present.
Filling it is not the problem. I've added debugging code that fills freed memory with a known pattern during finalization. The problem is that the memory is immediately reallocated to something else, so by the time the use-after-free occurs, the object is already being used for something else and no longer holds the invalid pattern. The stale access therefore doesn't crash; it just writes something over the memory, which causes a crash later, when the code that has since been given the memory tries to interpret it as a pointer.
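What is being asked for in (2) amounts to a quarantine: a dead block is poisoned and held back from reuse for a while, and the poison is checked just before the block is finally released. The following is only a minimal sketch of that idea in plain C, with malloc()/free() standing in for the collector. Every name in it (debug_free, debug_flush, POISON_BYTE, QUARANTINE_SLOTS) is hypothetical; none of it is an existing collector option or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout: this is not part of the collector. */
#define POISON_BYTE      0xDE
#define QUARANTINE_SLOTS 4

struct quarantined {
    void  *ptr;
    size_t size;
};

static struct quarantined quarantine[QUARANTINE_SLOTS];
static size_t quarantine_next;  /* slot holding the oldest block */
static size_t corrupt_count;    /* blocks written to while quarantined */

/* Return 1 if every byte of the block still holds the poison pattern. */
int still_poisoned(const void *ptr, size_t size)
{
    const unsigned char *p = ptr;
    for (size_t i = 0; i < size; i++)
        if (p[i] != POISON_BYTE)
            return 0;
    return 1;
}

/* Poison a dead block and park it instead of releasing it at once.
 * The block it evicts is checked first and only then made reusable. */
void debug_free(void *ptr, size_t size)
{
    struct quarantined *slot = &quarantine[quarantine_next];
    if (slot->ptr != NULL) {
        if (!still_poisoned(slot->ptr, slot->size))
            corrupt_count++;
        free(slot->ptr);
    }
    memset(ptr, POISON_BYTE, size);
    slot->ptr = ptr;
    slot->size = size;
    quarantine_next = (quarantine_next + 1) % QUARANTINE_SLOTS;
}

/* Check and release everything still parked. Returns how many blocks
 * were modified after being freed, and resets the counter. */
size_t debug_flush(void)
{
    for (size_t i = 0; i < QUARANTINE_SLOTS; i++) {
        struct quarantined *slot = &quarantine[i];
        if (slot->ptr == NULL)
            continue;
        if (!still_poisoned(slot->ptr, slot->size))
            corrupt_count++;
        free(slot->ptr);
        slot->ptr = NULL;
    }
    size_t found = corrupt_count;
    corrupt_count = 0;
    return found;
}

/* Two blocks are freed, then one is written through a stale pointer. */
size_t quarantine_demo(void)
{
    char *a = malloc(16);
    char *b = malloc(16);
    assert(a != NULL && b != NULL);
    debug_free(a, 16);
    debug_free(b, 16);
    a[3] = 'x';           /* use-after-free: the block is still parked */
    return debug_flush(); /* the damaged poison is noticed here */
}
```

Because the block is still parked when the stale write happens, the write damages the poison rather than a live object, and the check can report it. The quarantine length trades memory for the window in which such writes are caught.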
> The collector goes to some trouble to avoid touching pages of dead
> objects at ALL, until right at the moment that the first object in
> that page is about to be allocated again. It is only at that point
> that the objects in that page that have cleared mark bits are added to
> the free list by chaining them together via pointers stored in the
> first 4 or 8 bytes of each object.
That's great for deployment, but it makes debugging very difficult.
> Only one page worth of objects is ever on a free list at one time.
> (except if you use GC_free – see below)
> Filling untraced objects with a pattern would, once again, cause a lot
> of cache and potentially VM paging activity that does not happen at
> the moment.
> This care to not touch pages of dead objects is, in my opinion, one of
> the big reasons that the collector performs well against other GCs and
> malloc's, even those with supposedly much more sophisticated
Totally irrelevant for debugging. Almost all debugging aids come with some run-time overhead. Something like Valgrind or debugger watchpoints can have a 90%+ performance hit. That doesn't mean that they're not useful.
> Note that if you call GC_free() explicitly then that object is
> immediately added to the start of the relevant free list and will be
> THE NEXT object of that size to be allocated.
I'm not using GC_free().
> Use-after-explicit-free errors therefore corrupt the GC's data
> structures very quickly.
Yes, this is the problem that I am encountering: trying to work out which objects are only referenced from GC-invisible memory. A crash somewhere in the mark code is not very informative.
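The free-list behaviour described in the quoted text can be illustrated with a toy model. This is a sketch under stated assumptions, not the collector's actual code: one static page of fixed-size objects, a byte array standing in for the mark bits, and hypothetical names (sweep_page, alloc_obj, free_obj). It shows the link being stored in the first word of each dead object, an explicitly freed object being the next one handed out, and a stale write overwriting the link.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model with hypothetical names; sizes are arbitrary assumptions. */
#define PAGE_BYTES    4096
#define OBJ_BYTES     32
#define OBJS_PER_PAGE (PAGE_BYTES / OBJ_BYTES)

static unsigned char page[PAGE_BYTES];          /* one page of objects */
static unsigned char mark_bits[OBJS_PER_PAGE];  /* 1 = object is live */
static void *free_list;                         /* head of the chain */

/* Chain every unmarked object onto the free list. The link is stored
 * in the first word of the dead object itself. Returns objects added. */
size_t sweep_page(void)
{
    size_t added = 0;
    for (size_t i = OBJS_PER_PAGE; i-- > 0; ) {
        if (mark_bits[i])
            continue;
        void *obj = page + i * OBJ_BYTES;
        memcpy(obj, &free_list, sizeof free_list);
        free_list = obj;
        added++;
    }
    return added;
}

/* Pop the head of the free list, or return NULL if it is empty. */
void *alloc_obj(void)
{
    void *obj = free_list;
    if (obj != NULL)
        memcpy(&free_list, obj, sizeof free_list);
    return obj;
}

/* Explicit free: push onto the head, so this object is reused next. */
void free_obj(void *obj)
{
    memcpy(obj, &free_list, sizeof free_list);
    free_list = obj;
}

/* An explicitly freed object is the very next one to be allocated. */
int freelist_demo(void)
{
    memset(mark_bits, 1, sizeof mark_bits);
    mark_bits[2] = 0;
    mark_bits[5] = 0;                 /* two dead objects */
    free_list = NULL;
    if (sweep_page() != 2)
        return 0;
    void *a = alloc_obj();            /* lowest dead object: index 2 */
    assert(a != NULL);
    free_obj(a);
    void *b = alloc_obj();            /* same address comes straight back */
    return a == b && a == (void *)(page + 2 * OBJ_BYTES);
}

/* A write through a dangling pointer lands on the free-list link. */
int stale_write_demo(void)
{
    memset(mark_bits, 1, sizeof mark_bits);
    mark_bits[5] = 0;                 /* one dead object */
    free_list = NULL;
    sweep_page();                     /* its link is NULL: end of chain */
    unsigned char *stale = page + 5 * OBJ_BYTES;
    memset(stale, 0x41, sizeof(void *));   /* client's stale write */
    void *expected = NULL;
    /* The stored link no longer matches what the sweep wrote. */
    return memcmp(stale, &expected, sizeof expected) != 0;
}
```

In this model the damage is only discovered when the allocator next follows the overwritten link, which is far from the code that performed the stale write. That is the same shape as a crash surfacing inside the mark code rather than at the faulty access.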