[Gc] weak maps and libgc

Petter Urkedal urkedal at nbi.dk
Tue Oct 25 10:18:16 PDT 2011


On 2011-10-21, Andy Wingo wrote:
> Hello,
> 
> I and a few others have written in the past about the difficulties of
> implementing weak maps with libgc.  Properly implementing weak maps
> requires GC integration.  Currently I do terrible things with the pre-GC
> hook marking a flag that causes a vacuum procedure to run after GC.  I
> had not thought about the problem of cycles from values to keys in
> key-weak maps though:
> 
>   http://www.jucs.org/jucs_14_21/eliminating_cycles_in_weak/jucs_14_21_3481_3497_barros.pdf
> 
> To implement this sort of thing, you need tighter GC integration.  But
> providing those hooks limits the GC.  To avoid publically exposing those
> hooks, it would be possible to incorporate a weak map or weak set
> implementation into libgc itself.  What do you think?

Ivan recently accepted a patch I've been using for some time to
implement hash-consing efficiently.  The API is found in "gc_disclaim.h".  I
intended it to be general enough to implement other weak data structures, as
well.

The idea is to register a call-back which will be invoked by the
collector before it reclaims an object, allowing it to remove the object
from any data structures where it occurs weakly.  The call-back may also tell
the collector not to reclaim the object.  This is convenient to avoid
deadlocks (see below), or to inform the collector that new strong
references may have been created since the last collection.

Here is a sketch of how to use the API:

Create a new object kind and GC_register_disclaim_proc with the
call-back.  You may want to pass 1 as the last argument (mark_from_all)
to protect any pointers contained in the weak key:

int weakcoll_init(void)
{
    weakcoll_kind = GC_new_kind(...);
    GC_register_disclaim_proc(weakcoll_kind, weakcoll_disclaim_proc, NULL, 1);
}

You will typically use the first word of the key to store a reference to
the container from which it shall be removed.  Due to a technical issue,
the disclaim call-back will currently also be called on freed objects,
in which case a free-link occurs in the first word.  The trick is to use
the lower bit of the first word to indicate that the object is live:

int weakcoll_diclaim_proc(void *key, *cd)
{
    if (!(*(uintptr_t *)key & 1))
	return 0;		/* Skip object in free-list. */

    weakcoll_t c = (weakcoll_t)(*(uintptr_t *)key & ~(uintptr_t)1);
    if (!weakcoll_try_lock(c))	/* Avoid dead-lock! */
	return 1;		/* Retry on next cycle. */

    weakcoll_locked_remove(c, key);
    weakcoll_unlock(c);
    return 0;			/* Successfully released. */
}

For a complete example, I have a hash-consing implementation in
https://github.com/paurkedal/culibs/blob/master/cuoo/halloc_disclaim.c
though it's not very pedagogical, as its cluttered with optimisations, and
somewhat specialized.


More information about the Gc mailing list