[PATCH] Re: Found? Re: [Gc] Assertion failure in GC_malloc_uncollectable

Jan Wielemaker J.Wielemaker at vu.nl
Tue May 15 01:53:20 PDT 2012


Hi Ivan,

On 05/15/2012 10:35 AM, Ivan Maidanski wrote:
> Hi Jan, Hans,
>
> Jan -
>
> I'd like to propose an alternative solution:
>
>          GC_ASSERT(hhdr -> hb_n_marks == 0);
>          hhdr -> hb_n_marks = 1;
>
> ->
>
> #       ifdef THREADS
>            hhdr->hb_n_marks++; /* It might already be updated concurrently. */

Who am I to think you are wrong?  Still, I'll try :-)  I think this is
wrong because another thread may already have done the mark between the
moment GC_generic_malloc() released the lock and GC_malloc_uncollectable()
re-acquired it, so hhdr->hb_n_marks is 1 by the time we get here.  Surely
a big block always has hhdr->hb_n_marks equal to either 0 or 1; your
increment would make it 2, if I read this correctly.
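
For what it's worth, here is a tiny stand-alone illustration of the
window I mean (toy code, not GC code; all names are made up).  The
"marker" thread plays the role of the collector that runs after the
first locked section has dropped the lock:

    /* toy_race.c -- compile with: cc -pthread toy_race.c               */
    /* A made-up illustration of why a count protected by a lock can    */
    /* change between two separately locked sections.  Asserting it is  */
    /* still 0 in the second section can fail; blindly incrementing     */
    /* instead would leave it at 2.  Setting it to 1 stays correct.     */
    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static int n_marks = 0;

    static void *marker(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&lock);
        n_marks = 1;                        /* "mark" the block */
        pthread_mutex_unlock(&lock);
        return NULL;
    }

    int main(void)
    {
        pthread_t t;

        pthread_mutex_lock(&lock);          /* first locked section  */
        n_marks = 0;                        /* freshly "allocated"   */
        pthread_mutex_unlock(&lock);

        /* window: another thread gets to run here */
        pthread_create(&t, NULL, marker, NULL);
        pthread_join(t, NULL);

        pthread_mutex_lock(&lock);          /* second locked section */
        printf("n_marks = %d\n", n_marks);  /* prints 1, not 0       */
        n_marks = 1;                        /* still the right value */
        pthread_mutex_unlock(&lock);
        return 0;
    }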

My motivation for moving this into GC_generic_malloc() was twofold: clearly
a GC can come in between, and it seems a waste to acquire the lock twice
for a single operation with only a few unlocked instructions in between.
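
Roughly, the idea is the following (just a sketch of the intent, not the
literal patch I attached; the 'result' name and the condition for "large
uncollectable block" are only placeholders):

    /* Inside GC_generic_malloc(lb, k), large-object path, while the    */
    /* allocation lock is still held:                                   */
        if (k == UNCOLLECTABLE /* && object is a large block */) {
            hdr * hhdr = HDR(result);         /* 'result' = new block   */
            GC_ASSERT(hhdr -> hb_n_marks == 0); /* nothing marked it yet */
            hhdr -> hb_n_marks = 1;
        }
    /* ... UNLOCK(); ... and GC_malloc_uncollectable() no longer needs  */
    /* to re-acquire the lock just to set the mark count.               */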

With the proposed patch, my program has run its stress test with 8
threads overnight without problems.  I think it would be enough to
simply drop the GC_ASSERT(), if my assessment is correct that the
object cannot be collected by the GC doing its work between the two
locked regions.
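
In code, dropping the assertion would amount to something like this in
the large-block path of GC_malloc_uncollectable() quoted further below
(sketch only):

    LOCK();
    /* A GC may already have marked the block since GC_generic_malloc() */
    /* dropped the lock, so hb_n_marks can legitimately be 0 or 1 here; */
    /* just set it to 1 instead of asserting that it is still 0.        */
    hhdr -> hb_n_marks = 1;
    UNLOCK();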

Anyway, I don't care much as long as the crash is gone :-)

	Regards --- Jan


> #       else
>            GC_ASSERT(hhdr -> hb_n_marks == 0);
>            hhdr -> hb_n_marks = 1;
> #       endif
>
> Let me know if I'm wrong.
>
> Hans -
> What's your opinion?
>
> Regards.
>
> Mon, 14 May 2012 15:55:23 +0200 Jan Wielemaker <J.Wielemaker at vu.nl>:
>> Hi,
>>
>> The attached patch changes my program from crashing within a minute
>> to one that has already been running for an hour.
>>
>> Patch is relative to git fetched a few hours ago.  See below for details
>> on the rationale.
>>
>> Regards --- Jan
>>
>>
>> On 05/14/2012 02:34 PM, Jan Wielemaker wrote:
>>> Hi,
>>>
>>> The thing below is still bothering me, but I think I tracked it down to
>>> a wrong GC_ASSERT() in GC_malloc_uncollectable(). For large objects, we
>>> have this code:
>>>
>>>     op = (ptr_t)GC_generic_malloc((word)lb, UNCOLLECTABLE);
>>>     if (0 == op) return(0);
>>>
>>>     GC_ASSERT(((word)op & (HBLKSIZE - 1)) == 0); /* large block */
>>>     hhdr = HDR(op);
>>>     /* We don't need the lock here, since we have an undisguised */
>>>     /* pointer.  We do need to hold the lock while we adjust     */
>>>     /* mark bits.                                                */
>>>     LOCK();
>>>     GC_ASSERT(hhdr -> hb_n_marks == 0);
>>>
>>> Where the last GC_ASSERT triggers for me. I consistently find that
>>> hb_last_reclaimed is one less than GC_gc_no. I added the same assertion
>>> to GC_generic_malloc() before the GC lock was released. In that case, it
>>> is still the assert above that triggers. This can be explained:
>>> GC_generic_malloc() releases the lock, a GC marks the object and the
>>> assertion in GC_malloc_uncollectable() fails. Right?
>>>
>>> Now, I think that the assertion is simply wrong and there is no need
>>> to change anything else to the code, but I would be grateful if someone
>>> with in-depth knowledge can confirm this.
>>>
>>> Cheers --- Jan
>>>
>>> P.s. The crash happens consistently(?) while the block is allocated
>>> from the final for-loop in GC_allochblk() (i.e., allocating
>>> from a bigger free list and splitting).  This might suggest
>>> there is more to it than just removing the GC_ASSERT().
>>>
>>> On 05/02/2012 11:36 AM, Jan Wielemaker wrote:
>>>> Hi,
>>>>
>>>> I'm experiencing assertion failures with GC_malloc_uncollectable() on a
>>>> big block (usually a few hundred KB): GC_ASSERT(hhdr -> hb_n_marks == 0).
>>>>
>>>> Now, I have to admit this is on a modified version (as discussed on this
>>>> list). The version is based on a1467f2140f9db3408f6214137c7dc31f10d0bd9
>>>> (Jan 16 git version). Using the unmodified version would require many
>>>> changes to the program :-(
>>>>
>>>> This is merely a quick investigation to see whether this rings a bell
>>>> with someone. Platform is X64 Linux (Ubuntu). GC Config flags:
>>>> '--enable-threads=pthreads' '--enable-parallel-mark'
>>>> '--enable-large-config' '--enable-gc-assertions'
>>>>
>>>> hhdr is this:
>>>>
>>>> (gdb) p *hhdr
>>>> $1 = {hb_next = 0x0, hb_prev = 0x0, hb_block = 0x23ba000,
>>>>   hb_obj_kind = 2 '\002', hb_flags = 0 '\000', hb_last_reclaimed = 7,
>>>>   hb_sz = 196624, hb_descr = 196624, hb_large_block = 1 '\001',
>>>>   hb_map = 0x638f60, hb_n_marks = 1, hb_n_uncollectable = 0,
>>>>   _mark_byte_union = {_hb_marks = "\001", '\000' <repeats 255 times>,
>>>>     "\001", dummy = 1}}
>>>>
>>>> Only one thread is doing work at the moment of the crash; several others
>>>> block waiting on the allocation lock. As I read it, hhdr suggests that
>>>> the block is marked and in use, so it shouldn't be handed out by a new
>>>> allocation call, right?
>>>>
>>>> The program does lots of complicated stuff using the garbage collector,
>>>> but the crash is always in the big GC_malloc_uncollectable() called from
>>>> the same context.  These allocations manage a big buffer that is resized
>>>> (doubled) by doing a new GC_malloc_uncollectable(), copying the data and
>>>> calling GC_free() on the old data.  Several threads play the same game,
>>>> each on its own private data structures.  Of course, this means that an old buffer
>>>> released using GC_free() in one thread is typically a good candidate for
>>>> another thread. The crash is not very frequent: I need to make the
>>>> program go through the critical process several hundred times to get
>>>> a crash.
>>>>
>>>> Thanks for any hint
>>>>
>>>> --- Jan