[Gc] Blacklist plausible heap extent heuristics

Boehm, Hans hans.boehm at hp.com
Mon Mar 8 14:32:29 PST 2004


David Jones pointed out that I did miss something:  The sense of the comparison
against 5*HBLKSIZE*MAXHINCR was the opposite of what I read it as, making that
a minimum.
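
To make the two readings concrete, here is a minimal sketch, not the actual alloc.c code. The helper names slop_capped and slop_floored are hypothetical, the constants take the non-LARGE_CONFIG values mentioned in this thread, and the byte count passed in stands for WORDS_TO_BYTES(min_words_allocd()).

```c
#include <assert.h>
#include <stdint.h>

#define HBLKSIZE 4096u  /* 4K on 32-bit machines, per the thread */
#define MAXHINCR 2048u  /* 2K without LARGE_CONFIG, per the thread */

/* 5*HBLKSIZE*MAXHINCR: 40 MB here, 80 MB when MAXHINCR is 4K. */
#define SLOP_BOUND ((uint64_t)5 * HBLKSIZE * MAXHINCR)

/* Reading 1: the bound is a maximum, so the slop can never exceed it. */
uint64_t slop_capped(uint64_t min_bytes_allocd)
{
    assert(min_bytes_allocd <= UINT64_MAX / 8);  /* no overflow in 8*x */
    uint64_t slop = 8 * min_bytes_allocd;
    return slop > SLOP_BOUND ? SLOP_BOUND : slop;
}

/* Reading 2: the bound is a minimum, so the slop is unbounded above. */
uint64_t slop_floored(uint64_t min_bytes_allocd)
{
    assert(min_bytes_allocd <= UINT64_MAX / 8);  /* no overflow in 8*x */
    uint64_t slop = 8 * min_bytes_allocd;
    return slop < SLOP_BOUND ? SLOP_BOUND : slop;
}
```

Under the second reading nothing stops the slop from growing with the heap, which would be consistent with the large values David reported.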

Thus I agree with David that there's a problem here: We consider too large an
address space region as likely to become part of the heap in the future.  This
results in overly aggressive blacklisting, especially if the collector is configured
without -DLARGE_CONFIG, since the blacklisting hash table quickly becomes too small.
I also suspect it can cause blacklisting to fail in interesting ways for heaps
larger than a GB or so due to address wrapping.  (I think this wouldn't outright
break the collector, but it may effectively turn off blacklisting.)
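
One way the wrapping could show up is sketched below. This is an illustration under assumptions, not the collector's code: the function names are hypothetical, addresses are modelled as 32-bit values, and the "future heap" range is assumed to be everything between the current heap end and heap end plus slop.

```c
#include <assert.h>
#include <stdint.h>

/* Naive upper bound: silently wraps when heap_end + slop >= 2^32. */
uint32_t plausible_limit_naive(uint32_t heap_end, uint32_t slop)
{
    return heap_end + slop;
}

/* Guarded variant: saturate at the top of the address space instead. */
uint32_t plausible_limit_guarded(uint32_t heap_end, uint32_t slop)
{
    uint32_t limit = heap_end + slop;
    if (limit < heap_end) {
        limit = UINT32_MAX;  /* the sum wrapped around */
    }
    assert(limit >= heap_end);  /* the range cannot be inverted */
    return limit;
}

/* A word is a blacklisting candidate if it lies in [heap_end, limit). */
int in_future_heap_range(uint32_t word, uint32_t heap_end, uint32_t limit)
{
    return word >= heap_end && word < limit;
}
```

With a heap ending at 0xC0000000 and a slop of 0x60000000, the naive limit wraps to 0x20000000, so the range is empty and no word is ever considered for blacklisting. The guarded limit keeps the range non-empty.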

I've attached a patch that should greatly reduce the address ranges considered for blacklisting,
at the expense of very occasionally forcing an extra GC to avoid expanding the heap into areas
that haven't been checked for false pointers.  This has only been minimally tested.  I'd
appreciate either reports as to whether this helps with large heap applications, or other
comments on the patch.  Unfortunately, getting this code wrong tends to impact only a
few applications, so it's relatively hard to test well.

(The patch is against my current tree.  But it should apply to older versions, since
the affected code is ancient.)

Hans

> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com
> [mailto:gc-bounces at napali.hpl.hp.com]On Behalf Of Boehm, Hans
> Sent: Friday, March 05, 2004 5:00 PM
> To: 'David Jones'
> Cc: gc at napali.hpl.hp.com
> Subject: RE: [Gc] Blacklist plausible heap extent heuristics
> 
> 
> Thanks for posting.
> 
> There is definitely an issue with larger heaps and a collector built
> without -DLARGE_CONFIG.  Probably the practical limit is at around
> 200 MB (see below), which is less than I thought it was.  I added a
> comment to the definition of LOG_PHT_ENTRIES which controls this limit.
> And it may be about time to increase the default again.
> 
> But in looking at this again, I don't quite understand the comment about
> very large heaps.  Looking at the alloc.c code, the next lines limit
> expansion_slop to 5*HBLKSIZE*MAXHINCR.  MAXHINCR is 4K for LARGE_CONFIG,
> and 2K otherwise.  HBLKSIZE is 4K on 32bit machines.  Hence this limit
> should be 80MB.  (And the code guards against wrapping, I think.)
> 
> Thus I'm not sure why you're seeing more than 80MB expansion_slop with a
> 350MB heap.
> Did I miss something?
> 
> Hans
> 
> 
> 
> > -----Original Message-----
> > From: gc-bounces at napali.hpl.hp.com
> > [mailto:gc-bounces at napali.hpl.hp.com]On Behalf Of David Jones
> > Sent: Thursday, March 04, 2004 4:59 PM
> > To: gc at napali.hpl.hp.com
> > Subject: [Gc] Blacklist plausible heap extent heuristics
> > 
> > 
> > Another issue that I have come across while investigating memory use
> > is the heuristic used to determine that a given 32-bit blob is a
> > plausible reference to a heap address.
> > 
> > The collector will treat a blob that "points" slightly above the
> > current end of the heap as plausible, assuming that the heap will soon
> > grow such that the reference becomes real.  However, the heuristic used
> > does not work well for me.
> > 
> > Around line 927 in alloc.c a slop factor is computed:
> >  
> >  expansion_slop = 8 * WORDS_TO_BYTES(min_words_allocd());
> >  
> > I am not sure about the reasoning behind this heuristic, but I found
> > that expansion_slop was just under 3x the total heap size for me.  For
> > a 350 MB heap, the plausible ending heap address actually surpassed
> > 0x50000000.  Once you get to this point, the probability of an
> > arbitrary 32-bit blob "referencing" a plausible heap address is 0.25 or
> > greater.  Uppercase letters in strings in the text/data segment (root
> > set) now "reference" locations in the 0x41000000-0x5a000000 range.
> > Once the heap size hits 1.5 GB (entirely possible on a 32-bit machine)
> > then effectively the entire address space is considered plausible.
> > 
> > There are two consequences to this:
> > 
> > 1. Unless -DLARGE_CONFIG is used, the blacklist cache will alias a
> >    page onto pages +/- multiple of 256MB from it.  Once the heap gets
> >    over 100 MB, it will start blacklisting itself to death as every
> >    page tends to alias to a blacklisted page.  -DLARGE_CONFIG
> >    effectively prevents aliasing on 32-bit systems which lets you live
> >    until...
> > 
> > 2. At 1.5 GB, again you blacklist yourself to death.  I have not
> >    actually tried this, as my machine is not endowed with that much
> >    physical RAM.  :-)
> > 
> > I am noticing plausible addresses over 1G for heaps in the 350-400MB
> > range.  Given that FreeBSD/i386 supports by default 3GB user space, I
> > expect blacklist-related problems once my heap hits 1G, even with
> > -DLARGE_CONFIG.
> > 
> > _______________________________________________
> > Gc mailing list
> > Gc at linux.hpl.hp.com
> > http://www.hpl.hp.com/hosted/linux/mail-archives/gc/
> > 

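The aliasing David describes in his first point can be reproduced with a small model. This is a sketch under assumptions, not the collector's hash function: the slot function is a hypothetical "page number modulo table size", and the 2^16-entry table is inferred from the 256MB aliasing distance and the 4K block size quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define LOG_PAGE_SIZE 12u  /* 4K heap blocks, per the thread */

/* Hypothetical slot function: page number modulo a power-of-two table. */
uint32_t blacklist_slot(uint32_t addr, unsigned log_entries)
{
    assert(log_entries >= 1 && log_entries <= 20);
    return (addr >> LOG_PAGE_SIZE) & ((UINT32_C(1) << log_entries) - 1u);
}

/* Two addresses alias if they share a slot but lie on different pages. */
int pages_alias(uint32_t a, uint32_t b, unsigned log_entries)
{
    return (a >> LOG_PAGE_SIZE) != (b >> LOG_PAGE_SIZE)
        && blacklist_slot(a, log_entries) == blacklist_slot(b, log_entries);
}
```

With 2^16 entries the table covers 256MB, so 0x10000000 and 0x20000000 land in the same slot.  With 2^20 entries it covers the full 4GB of a 32-bit address space and the two pages stay distinct, which matches the observation that -DLARGE_CONFIG effectively prevents aliasing on 32-bit systems.
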
-------------- next part --------------
A non-text attachment was scrubbed...
Name: alloc.c.diff
Type: application/octet-stream
Size: 2389 bytes
Desc: not available
Url : http://napali.hpl.hp.com/pipermail/gc/attachments/20040308/94fc07ba/alloc.c.obj