Re[3]: [Gc] Race condition in garbage collector

Ivan Maidanski ivmai at mail.ru
Tue Aug 21 11:31:10 PDT 2012


Hi Juan,

Have you made any progress on this?

Regards,
Ivan

Sun, 05 Aug 2012 12:18:33 +0400 Ivan Maidanski <ivmai at mail.ru>:
>Hi Juan,
>
>It looks hard to avoid locking in GC_dyld_image_add/remove, so let's try another workaround for your exact case. (As you say, wrapping dlopen/dlclose does not work either, because the locks are non-recursive.)
>My idea is to release the lock briefly somewhere in GC_inner_start_routine to prevent the deadlock. But where? You say it happens on thread exit, i.e. in GC_thread_exit_proc (which acquires the lock), called from the pthread cleanup handler.
>
>As I understand it, DISABLE/RESTORE_CANCEL and GC_remove_specific are both no-ops on your target, right?
>Do you have GC_incremental=0? (In that case GC_wait_for_gc_completion does almost nothing.)
>GC_unregister_my_thread_inner seems to be inlined, as it is not shown in the backtrace.
>The thread in (1) has the DETACHED bit set, right?
>If everything is as I assume, then there are only 3 calls of interest:
>- pthread_self,
>- mach_port_deallocate (and mach_task_self),
>- GC_INTERNAL_FREE.
>
>Could you please temporarily comment out the mach_port_deallocate (and mach_task_self) and GC_INTERNAL_FREE calls and retry?
>
>Regards,
>Ivan
>
>Sun, 22 Jul 2012 21:20:48 +0200 Juan Jose Garcia-Ripoll <juanjose.garciaripoll at gmail.com>:
>
>>On Sun, Jul 22, 2012 at 4:43 PM, Juan Jose Garcia-Ripoll <juanjose.garciaripoll at gmail.com> wrote:
>> 
>>>1) This thread is a servicing one. It is trying to exit and in the process it acquires the GC lock, but for some reason the thread invokes the dyld library. I still haven't located where in GC this happens but from the symptoms it seems it is close to GC_unregister...[...]
>>>
>>>2) This thread is the main one. It is trying to close a bunch of libraries, none of which are related to the thread above. However, when dlclose() is called, some code associated with the garbage collector is run and we enter a race condition.
>>It is very difficult to prevent 1) from happening, because the call to dyld happens inside the garbage collector exit code, or somewhere in pthread's library, I do not know.
>>
>>
>>I have tried wrapping dlopen() and dlclose() with GC_call_with_alloc_lock(). The problem here is that the garbage collector uses default mutexes, and they are not recursive on OS X. The result is a deadlock.
>>
>>
>>I would appreciate some solution.
>
>
>
>(gdb) thread 2
>
>(gdb) bt
>#0  0x00007fff88009bf2 in __psynch_mutexwait ()
>#1  0x00007fff897d31a1 in pthread_mutex_lock ()
>#2  0x00007fff84eae623 in dyldGlobalLockAcquire ()
>#3  0x00007fff6172a745 in __dyld__ZN26ImageLoaderMachOCompressed20doBindFastLazySymbolEjRKN11ImageLoader11LinkContextEPFvvES5_ ()
>#4  0x00007fff61717922 in __dyld__ZN4dyld18fastBindLazySymbolEPP11ImageLoaderm ()
>#5  0x00007fff84eae716 in dyld_stub_binder_ ()
>#6  0x0000000101d01458 in C.88.15036 ()
>#7  0x0000000101c73100 in GC_inner_start_routine (sb=0x1041deeb0, arg=0x102117ea0) at pthread_start.c:67
>#8  0x0000000101c6eb1c in GC_call_with_stack_base (fn=0x101c73030 <GC_inner_start_routine>, arg=0x102117ea0) at misc.c:1510
>#9  0x0000000101c74565 in GC_start_routine (arg=0x102117ea0) at pthread_support.c:1504
>#10 0x00007fff897d48bf in _pthread_start ()
>#11 0x00007fff897d7b75 in thread_start ()
>
>(gdb) thread 1
>[Switching to thread 1 (process 37491), "com.apple.main-thread"]
>0x00007fff88009bf2 in __psynch_mutexwait ()
>(gdb) bt
>#0  0x00007fff88009bf2 in __psynch_mutexwait ()
>#1  0x00007fff897d31a1 in pthread_mutex_lock ()
>#2  0x0000000101c74833 in GC_lock () at pthread_support.c:1784
>#3  0x0000000101c6c53d in GC_remove_roots (b=0x104f03220, e=0x104f03238) at mark_rts.c:311
>#4  0x0000000101c61f20 in GC_dyld_image_remove (hdr=0x104eff000, slide=4377800704) at dyn_load.c:1319
>#5  0x00007fff61714bdd in __dyld__ZN4dyld11removeImageEP11ImageLoader ()
>#6  0x00007fff6171858d in __dyld__ZN4dyld20garbageCollectImagesEv ()
>#7  0x00007fff6171c432 in __dyld_dlclose ()
>#8  0x00007fff84eaebd5 in dlclose ()
>#9  0x0000000101c2ae8c in dlclose_wrapper [inlined] () at /Users/jjgarcia/devel/ecl/src/c/ffi/libraries.d:432
>#10 0x0000000101c2ae8c in ecl_library_close (block=0x103be4e00) at libraries.d:432
>#11 0x0000000101c2af79 in ecl_library_close_all () at libraries.d:448
>#12 0x0000000101b1a84d in cl_shutdown () at main.d:301
>#13 0x0000000101b1a964 in si_exit (narg=4377800704) at main.d:839
>#14 0x0000000101b13e47 in main ()
>
>>Juanjo