[Gc] heap walk function

Boehm, Hans hans.boehm at hp.com
Mon Jan 24 18:05:20 PST 2005


Paolo -

I think it's fine to officially add this mechanism to the collector.
Thanks for working out the patch.

I do have some concerns about the details:

1) Backgraph.c already has GC_apply_to_each_object().  It would be nice
if this used a similar naming convention
(e.g. GC_locked_apply_to_each_live_object), and they both
lived in the same place.  (Backgraph.c is almost certainly the wrong
place.)

2) Can we arrange to do all this with the world running, but the GC lock
still held?  That would allow a call to GC_gcollect_inner(), I think,
and avoid lots of code duplication.  It would also make the logic
much more transparent (it's just another GC), and more likely to be
correct.  With the GC lock held, objects won't disappear out from
under you.  The objects you're interested in presumably shouldn't
change either.  (Thread local allocation might continue, though.)
Under GC6.x, code running with the world stopped also has some very
weird restrictions under Solaris, since the system thread library
is temporarily broken in that state.

3) As it stands, I'm concerned about correctness, especially with
incremental GC.  You're recomputing the mark bits in the middle of a GC
cycle, while pages are still enqueued for sweeping, etc.  I'm not 100%
convinced it's wrong, but we're abusing the code in ways it wasn't
designed for.

4) There's also code in dbg_mlc.c that does something very similar
in GC_check_heap_proc.  Ideally we should avoid that duplication as
well.  (That's not critical; I can do it in the GC7 tree after the fact.)
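
To make points 1) and 4) concrete, here is a minimal sketch of a single
shared walker with two thin entry points.  It is a toy model, not
collector code: toy_block, toy_walk_block and the toy_apply_to_* names
are made up for illustration, and it assumes the existing function
visits every allocated object while the new one visits only marked ones.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (hypothetical, not the collector's real structures):
 * one block of fixed-size slots with per-slot flags. */
#define TOY_SLOTS 8

typedef struct {
    int           payload[TOY_SLOTS];    /* stand-in for object bodies */
    unsigned char allocated[TOY_SLOTS];  /* slot holds an object       */
    unsigned char marked[TOY_SLOTS];     /* slot was reached by a mark */
} toy_block;

typedef void (*toy_obj_fn)(void *obj, void *client_data);

/* The only place that knows how to step through a block. */
static void toy_walk_block(toy_block *b, int live_only,
                           toy_obj_fn fn, void *client_data)
{
    size_t i;
    assert(b != NULL && fn != NULL);
    for (i = 0; i < TOY_SLOTS; i++) {
        if (!b->allocated[i]) continue;
        if (live_only && !b->marked[i]) continue;
        fn(&b->payload[i], client_data);
    }
}

/* Two entry points with parallel names, both built on the walker. */
static void toy_apply_to_each_object(toy_block *b, toy_obj_fn fn, void *cd)
{
    toy_walk_block(b, 0, fn, cd);
}

static void toy_apply_to_each_live_object(toy_block *b, toy_obj_fn fn,
                                          void *cd)
{
    toy_walk_block(b, 1, fn, cd);
}

/* Example callback: counts the objects it is handed. */
static void toy_count(void *obj, void *client_data)
{
    (void)obj;
    ++*(int *)client_data;
}
```

A heap checker along the lines of the one in dbg_mlc.c could in
principle be a third caller of the same walker, which is the kind of
sharing meant above.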

Hans

> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com
> [mailto:gc-bounces at napali.hpl.hp.com]On Behalf Of Paolo Molaro
> Sent: Tuesday, January 18, 2005 7:56 AM
> To: Gc at napali.hpl.hp.com
> Subject: [Gc] heap walk function
> 
> 
> Hello.
> A small recap of why we need this feature.
> Mono supports the concept of 'application domains': types
> are loaded in an application domain, we create objects that belong
> to the domain, execute methods inside it etc. Think of it as
> what a process is to an operating system: a self-contained
> environment. Just as a process can be removed from memory when
> it finishes, mono needs to be able to terminate an application domain
> and free all the resources it uses: this includes, specifically,
> the object vtables and the GC descriptor that is stored in them.
> The vtable is the first word in the area allocated for an object.
> Now, we want to use the typed allocation interface as much as
> possible, to try and avoid pointer misdetection and the resultant
> blacklistings, heap fragmentation etc. The issue is that, with a
> conservative GC, even if we know that no object belonging to the
> domain can still be used and accessed, it may be still reachable 
> as far as the GC is concerned. So the GC will try to
> access the GC descriptor inside the vtable which has been free()d
> and a crash results.
> To avoid the crash we had to avoid typed allocations except
> in the root appdomain, which lasts until application exit, so no GC
> can happen after it's unloaded.
> 
> The idea is to have a function in the GC that will call a callback
> for each live object: this way we can check if the object belongs to
> the unloading domain and we can change the vtable pointer to point
> to an area of memory that lasts and has a correct GC descriptor
> (which basically tells the GC there are no object references in 
> the object anymore).
> I think we could use this interface also to force finalization
> of the objects that need it (and that the GC thinks are still alive):
> we currently have a different solution to this that requires
> we keep a list of objects that need finalization, in addition to
> the GC-maintained one.
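
The vtable swap described above might look roughly like the following
callback.  This is only a sketch under stated assumptions: toy_vtable,
toy_object, toy_dead_vtable and toy_for_each are hypothetical stand-ins
(they are not Mono or collector types), and the "heap walk" is just a
loop over an array.  Only the (void *, void *) callback shape is taken
from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical runtime structures, for illustration only. */
typedef struct {
    int      domain_id;  /* owning application domain                 */
    unsigned gc_descr;   /* stand-in GC descriptor; 0 = no references */
} toy_vtable;

typedef struct {
    toy_vtable *vtable;  /* first word of every object */
    void       *field;
} toy_object;

/* Lives for the whole program; its descriptor says "no references". */
static toy_vtable toy_dead_vtable = { -1, 0 };

typedef struct {
    int unloading_domain;  /* domain being torn down    */
    int patched;           /* objects retargeted so far */
} toy_unload_ctx;

/* Callback: retarget objects that belong to the unloading domain. */
static void toy_retarget_vtable(void *obj, void *data)
{
    toy_object     *o   = (toy_object *)obj;
    toy_unload_ctx *ctx = (toy_unload_ctx *)data;
    assert(o != NULL && ctx != NULL);
    if (o->vtable->domain_id == ctx->unloading_domain) {
        o->vtable = &toy_dead_vtable;
        ctx->patched++;
    }
}

/* Stand-in for the heap walk: visits each element of an array. */
static void toy_for_each(toy_object *objs, size_t n,
                         void (*fn)(void *, void *), void *data)
{
    size_t i;
    for (i = 0; i < n; i++) fn(&objs[i], data);
}
```

In this model the domain's own vtables can be released only after the
walk has finished, since the callback still reads domain_id through the
old pointer.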
> 
> The idea and the implementation were taken by looking around in
> the code and cut&pasting it, but it seems to work with some
> light testing.
> We basically perform the mark phase of the GC and then walk all
> the blocks and call the callback for marked objects.
> I think the functionality is general enough for other people as well
> or for other uses, like inspecting the heap from a debugger
> or in a profiler.
> Ideally, the callback should return a boolean to indicate to the GC
> that the object can and should be freed. I may try to add that feature
> as well if there are no objections.
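
The boolean-returning variant proposed above could be sketched like
this.  It is a toy under stated assumptions: toy_heap, toy_live_fn and
toy_apply_to_live_objects are hypothetical names, the mark flags are
assumed to have been set by a mark phase that is not shown, and
"freeing" is modelled as setting a flag rather than returning memory to
a free list.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_N 6

/* Hypothetical toy heap: fixed slots with mark and freed flags. */
typedef struct {
    int           value[TOY_N];
    unsigned char marked[TOY_N];  /* set by a mark phase (not shown) */
    unsigned char freed[TOY_N];   /* slot has been released          */
} toy_heap;

/* Callback returns nonzero if the object can and should be freed. */
typedef int (*toy_live_fn)(void *obj, void *data);

/* Visit every marked object; release those the callback rejects.
 * Returns the number of objects released. */
static size_t toy_apply_to_live_objects(toy_heap *h, toy_live_fn fn,
                                        void *data)
{
    size_t i, nfreed = 0;
    assert(h != NULL && fn != NULL);
    for (i = 0; i < TOY_N; i++) {
        if (!h->marked[i] || h->freed[i]) continue;
        if (fn(&h->value[i], data)) {
            h->marked[i] = 0;
            h->freed[i] = 1;
            nfreed++;
        }
    }
    return nfreed;
}

/* Example predicate: ask for negative values to be freed. */
static int toy_is_negative(void *obj, void *data)
{
    (void)data;
    return *(int *)obj < 0;
}
```

Unmarked slots are never passed to the callback here, so they are left
for the normal sweep rather than being released by the walk.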
> 
> Comments welcome.
> 
> Index: alloc.c
> ===================================================================
> --- alloc.c	(revision 39084)
> +++ alloc.c	(working copy)
> @@ -457,6 +457,70 @@
>      return(result);
>  }
>  
> +typedef struct {
> +    void (*apply_func) (void *, void *);
> +    void *data;
> +} ObjApply;
> +
> +static void 
> +foreach_live_object_in_block (struct hblk *h, word fn)
> +{
> +    hdr * hhdr = HDR(h);
> +    word sz = hhdr -> hb_sz;
> +    word descr = hhdr -> hb_descr;
> +    register word *p, *plim;
> +    register int word_no;
> +    int i = 0;
> +    ObjApply *data = (ObjApply*)fn;
> +
> +    p = (word *)(h->hb_body);
> +    word_no = 0;
> +
> +    if (sz > MAXOBJSZ) {
> +      plim = p;
> +    } else {
> +      plim = (word *)((((word)h) + HBLKSIZE) - WORDS_TO_BYTES(sz));
> +    }
> +    while( p <= plim ) {
> +      if( mark_bit_from_hdr(hhdr, word_no)) {
> +        data->apply_func (p, data->data);
> +      }
> +      word_no += sz;
> +      p += sz;
> +    }
> +}
> +
> +void GC_apply_to_live_objects(void (*apply_func)(void*, void *), void *data)
> +{
> +    int result;
> +    int dummy;
> +    ObjApply apply_data;
> +    DCL_LOCK_STATE;
> +    
> +    DISABLE_SIGNALS();
> +    LOCK();
> +    if (!GC_is_initialized) GC_init_inner();
> +    STOP_WORLD();
> +    IF_THREADS(GC_world_stopped = TRUE);
> +    /* Minimize junk left in my registers and on the stack */
> +    GC_clear_a_few_frames();
> +    GC_noop(0,0,0,0,0,0);
> +    GC_invalidate_mark_state();  /* Flush mark stack.	*/
> +    GC_clear_marks();
> +    GC_initiate_gc();
> +    while (1) {
> +        if (GC_mark_some((ptr_t)(&dummy))) break;
> +    }
> +    apply_data.apply_func = apply_func;
> +    apply_data.data = data;
> +    GC_apply_to_all_blocks (foreach_live_object_in_block, (word)&apply_data);
> +    IF_THREADS(GC_world_stopped = FALSE);
> +    START_WORLD();
> +    UNLOCK();
> +    ENABLE_SIGNALS();
> +    return;
> +}
> +
>  /*
>   * Assumes lock is held, signals are disabled.
>   * We stop the world.
> 
> lupus
> 
> -- 
> -----------------------------------------------------------------
> lupus at debian.org                                     debian/rules
> lupus at ximian.com                             Monkeys do it better
> _______________________________________________
> Gc mailing list
> Gc at linux.hpl.hp.com
> http://www.hpl.hp.com/hosted/linux/mail-archives/gc/
> 

