[Gc] Cheaper finalisation and weak pointers

Boehm, Hans hans.boehm at hp.com
Wed Feb 9 18:00:51 PST 2005

There seem to be a few applications like yours where performance
of a finalization-like mechanism matters.

A finalization plug-in interface generally sounds like a good idea,
since I
suspect there are many slightly different useful variants.

I'd like to understand a bit better where exactly the trade-offs
are here.

>     The solutions using the present GC functionality gives the
>     following overhead, not counting asymptotically negligible parts:
> 	disappearing links:		5 words extra, 3 fragments
> 	    hash-table links must be moved into separate memory
> 	    fragments (+ 1 word) and disappearing links cost 4 words.
Could you explain why hash-table links have to be moved to separate
(I may misunderstand what that means.  I count 4 words in the GC's
data structures, with one of those in the hash table array.)

It seems to me that you potentially win for a couple of different,
separable, reasons:

1) You avoid the GC's internal hash table, since you don't allow
to be unregistered.

2) Since the linked list is in the object itself, you don't need a
pointer to the object being tracked.

Does that sound right?

Have you thought about the concurrency issues here?  Normally you need
lock on a "disappearing" pointer dereference, to deal with the case of
the GC
clearing the reference while you only have the hidden pointer in a
What locking discipline do you need?

> 	finalisers:			7 words extra, 2 fragments
> 	    hash-table links can remain in expressions, but finalisers
> 	    cost 7 words.
>     The alternatives I suggest gives:
> 	weak pointers:			2 words extra, 2 fragments
> 	    hash-table links must be moved into separate fragments
> 	    and the weak pointer requires an additional link
> 	finalisation queues:		1 word extra, 1 fragment
> Usage.  Dynamically Typed Languages
>     I suspect that the finalisation queues described below could be
>     used by dynamic languages to reduce the overhead for finalisable
>     objects.  Since the objects typically contains a type or a link to
>     a virtual function table, they can process finalisation queues and
>     dispatch to the correct finalisers.  This may not be that
>     important, though, since finalisers are generally used in
>     scattered cases to release non-memory resources.
> Solution.  Weak Pointers
>     The interface may look like:
> 	typedef struct GC_weakptr_node GC_weakptr;
> 	typedef struct { /* private fields */
> 	    GC_word next; /* hidden */
> 	    GC_word hptr; /* hidden */
> 	};
> 	void GC_weakptr_init(GC_weakptr *, void *ptr);
> 	void GC_weakptr_set(GC_weakptr *, void *ptr);
> 	void *GC_weakptr_get(GC_weakptr *);
>     GC_weakptr_init clears ptr and links GC_weakptr_node into a global
>     or thread-local singly linked list.  It will only be unlinked when
>     the object containing it is ready to be reclaimed.  The other two
>     functions are trivial macros or inliners, but provided to make it
>     clear that the fields of GC_weakptr_node are private.
>     After a global mark, the collector follows all weak pointer nodes,
>     unregisters those which are placed in unmarked objects and for the
>     rest, clears hptr if it points to an unmarked object.
>     I think this solution is easy to use and has clean semantics, as
>     well as being easy to implement in the current collector code.  It
>     does not support containers with collectable nodes, so that means
>     an extra fragment in cases where the keys can be inlined into the
>     container's nodes.
> Solution.  Finalisation Queues
>     The interface (fiq = finalisation queueing):
> 	typedef struct GC_fiq_queue GC_fiq_queue;
> 	void GC_fiq_init(GC_fiq_queue *fq);
> 	void *GC_fiq_malloc(GC_fiq_queue *q, size_t size);
> 	void *GC_fiq_malloc_atomic(GC_fiq_queue *q, size_t size);
> 	void GC_fiq_is_dead(void *obj_fiq_base);
> 	int GC_fiq_has_dead(struct GC_fiq_queue *q);
> 	void *GC_fiq_pop_dead(GC_fiq_queue *q);
>     The data structures may look like:
> 	struct GC_fiq_queue {
> 	    GC_word next_in_gc;	/* GC's link of queues to process */
> 	    GC_word live;	/* stack of GC_fiq_node (hidden 
> links) */
> 	    GC_word dead;	/* queue of GC_fiq_node (not hidden) */
> 	    GC_word *dead_end;	/* for appending to queue */
> 	};
> 	struct GC_fiq_node {
> 	    GC_word next_in_fiq; /* next live (hidden) or dead */
> 	    /* real object pointer starts here */
> 	};
>     In addition to the above minimal interface
> 	void GC_fiq_free_dead(void *obj_fiq_base);
> 	void GC_fiq_resurrect_dead(GC_fiq_queue *fq, void *ptr);
>     GC_fiq_free_dead would often be safe to call after finalisation of
>     a node since the nodes are typically internal to the container
>     which owns the queue, and thus should have no further finalisers.
>     With GC_fiq_resurrect_dead, the container can reuse nodes when
>     feasible.
> The Patch
>     The attached patch is for gc6.4.  I have tried to minimise changes
>     to the existing GC code by adding a "plugin" facility,
> 	GC_register_plugin(GC_plugin_inspector_proc ip,
> 			   GC_plugin_finalization_proc fp, void *cd);
>     declared in gc_plugin.h.  The "inspector" is called in a
>     GC_finalize to check the marks, which means that it works in a
>     rather restricted environment.
>     The real additions are in
> 	include/gc_weakptr.h  weakptr.c		Weak pointers
> 	include/gc_fiq.h      fiq.c		Finalisation queues
>     I added two test cases in the contrib subdir.  It includes a weak
>     pointer hash map (GC_wpmap) which was useful for testing weak
>     pointers, though it may not be a good idea to ship too much
>     unrelated things with the GC, so I can write a more direct
>     test-case.
> Questions
>     Shall I keep the "plugin" facility?  If so, is it better to
>     register inspectors and finalisers separately?  I.e.:
> 	GC_register_inspection_plugin(GC_inspection_plugin_proc);
> 	GC_register_finalization_plugin(GC_finalization_plugin_proc);
>     As things are now, inspection plugins are only useful when defined
>     in the GC source tree, since it will need access to private
>     functions to do anything useful.  That ok for me if my other
>     additions are accepted.  However, exporting the necessary
>     low-level functions with appropriate warnings about their use
>     could make it easier for people to test extensions without
>     patching the GC.  Good/bad?
>     Ad Goal 2: The comment above GC_finalize says it runs with the
>     world stopped.  Is this still the case?  Is it possible to either
>     run client code in parallel using thread-local alloc, or to spawn
>     additional threads (= number of CPUs) for the inspection phase?
>     Any preference of interface changes or renaming?
>     Are there still platforms which the collector supports which
>     require K&R prototype?  (I have some fixing to do in that case.)
>     How shall the license blurb at the top look like; shall I keep
>     only the last (C) from the other files, or duplicate all?
> I can maintain the patch separately for some time until a 
> suitable point of inclusion.  At the moment, I just want some 
> feedback.
> Sincerely,
> Petter Urkedal

More information about the Gc mailing list