[Gc] Performance of bdwgc7.2 had degraded compared to6.8 - I think I found a solution/reason

Carsten Kehler Holst kehler at pdc.dk
Thu Mar 3 14:06:23 PST 2011

I finally came back to this.
As mentioned before, the degradation we experienced couldn't come from a
change in whether ALL_INTERIOR_POINTERS was on or not, as we had it on
both in gc 6.8 and in gc 7.2a2.

After some experimentation it looks as if at least part of the problem
is in GC_reclaim_clear in reclaim.c. Previously there were special cases
for small granularities, which have now been removed.
The code for clearing the contents of the data is thus no longer
specialized.
I unfolded the loop:

while (p < q) {
    *p++ = 0;
}

To:

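switch ((q-p) % 4) {
  case 0:      /* q-p is a multiple of 4; assumes p < q on entry */
      p[0] = 0;
      p[1] = 0;
      p[2] = 0;
      p[3] = 0;
      p += 4;
      break;
  case 3:
      p[0] = 0;
      p[1] = 0;
      p[2] = 0;
      p += 3;
      break;
  case 2:
      p[0] = 0;
      p[1] = 0;
      p += 2;
      break;
  case 1:
      p[0] = 0;
      p += 1;
      break;
}
/* The remaining length is now a multiple of 4. */
while (p < q) {
      p[0] = 0;
      p[1] = 0;
      p[2] = 0;
      p[3] = 0;
      p += 4;
}
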
In a test program where a lot of lists were allocated this gave a 25%
speed improvement.
I don't know if this is a generally applicable change, but if it is, it
should probably be added.
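
For anyone who wants to try the idea outside the collector, here is a
minimal stand-alone sketch of the unrolled clear next to the plain loop.
It is not the code from reclaim.c: the names clear_simple and
clear_unrolled and the word typedef are assumptions made for this
example, and this variant clears the 0-3 leftover words first so that it
is also safe when p == q.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long word;  /* stand-in for the collector's word type */

/* Reference version: clear [p, q) one word at a time. */
void clear_simple(word *p, word *q)
{
    assert(p <= q);
    while (p < q) {
        *p++ = 0;
    }
}

/* Unrolled version: clear the 0-3 leftover words first, then the
 * rest in groups of four.  Does nothing when p == q. */
void clear_unrolled(word *p, word *q)
{
    assert(p <= q);
    switch ((size_t)(q - p) % 4) {
      case 3: *p++ = 0; /* fall through */
      case 2: *p++ = 0; /* fall through */
      case 1: *p++ = 0; /* fall through */
      case 0: break;
    }
    /* The remaining length is now a multiple of 4. */
    while (p < q) {
        p[0] = 0;
        p[1] = 0;
        p[2] = 0;
        p[3] = 0;
        p += 4;
    }
}
```

Both functions should leave the range [p, q) zeroed and everything
outside it untouched, so they can be swapped freely when timing them.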

I've also tried to improve the outer loop in a way that looks similar to
how the code looked in 6.8, and it gives an 8-10% improvement in my,
admittedly, contrived example.

I was wondering whether other users of bdwgc had implemented special
treatment for small allocations.
We have a lot of list cells and similar small objects, which seem to be
affected by this.
If anyone has done something similar we would like to hear about their
experience.
In the production program we are trying to improve, this seems to give
about 3-4%, which unfortunately still doesn't recover the approx. 8% we
lost when going from 6.8 to 7.2a2.
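
To make the question concrete, this is roughly what I mean by special
treatment for small objects. It is only a sketch under assumptions: the
function names, the word typedef and the block layout (a flat array of
equally sized objects whose first word is kept, as a free-list link
would be) are invented for the example and are not taken from either
release.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long word;  /* stand-in for the collector's word type */

/* Generic path: zero every word of each object except word 0. */
void clear_objects_generic(word *block, size_t nwords, size_t obj_words)
{
    assert(obj_words >= 1);
    for (size_t i = 0; i + obj_words <= nwords; i += obj_words) {
        for (size_t j = 1; j < obj_words; j++) {
            block[i + j] = 0;
        }
    }
}

/* Special case for 2-word objects: no inner loop at all. */
void clear_objects_2(word *block, size_t nwords)
{
    for (size_t i = 0; i + 2 <= nwords; i += 2) {
        block[i + 1] = 0;
    }
}

/* Special case for 4-word objects: inner loop fully unrolled. */
void clear_objects_4(word *block, size_t nwords)
{
    for (size_t i = 0; i + 4 <= nwords; i += 4) {
        block[i + 1] = 0;
        block[i + 2] = 0;
        block[i + 3] = 0;
    }
}

/* Dispatch once per block on the object size. */
void clear_objects(word *block, size_t nwords, size_t obj_words)
{
    switch (obj_words) {
      case 2:  clear_objects_2(block, nwords); break;
      case 4:  clear_objects_4(block, nwords); break;
      default: clear_objects_generic(block, nwords, obj_words); break;
    }
}
```

The point of the dispatch is that the size test happens once per block
rather than once per object, so the common small sizes run a loop with a
fixed, known body.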

Visual Prolog Team

-----Original Message-----
From: Ivan Maidanski [mailto:ivmai at mail.ru] 
Sent: 17 December 2010 21:04
To: Carsten Kehler Holst
Cc: Manuel.Serrano at inria.fr; Ludovic Courtes; gc at linux.hpl.hp.com
Subject: Re[4]: Fwd: [Gc] Performance of bdwgc7.2 had degraded compared
to6.8 - the patch to test

Hi Carsten,

1. What about GC v7.1? Please tell us between which official releases
you see the speed degradation.

2. Please tell us the flags passed when building GC (in both cases).


Fri, 17 Dec 2010 19:49:32 +0100 "Carsten Kehler Holst" <kehler at pdc.dk>:

> Did you try 6.8 with ALL_INTERIOR_POINTERS turned on?
> Our problem is that we have had it on both in 6.8 and in 7.2a4.
> It is quite possible that the problem has to do with 
> ALL_INTERIOR_POINTERS but something must have changed between 6.8 and
> Regards
> Carsten
> Visual Prolog Team
> On 17/12/2010, at 18.03, "Manuel.Serrano at inria.fr"
> <Manuel.Serrano at inria.fr> wrote:
> >> I think the same (ALL_INTERIOR_POINTERS slows down the performance
> >> of the benchmark).
> >> 
> >> I think -D ALL_INTERIOR_POINTERS should be present by default when
> >> building the collector, but an application that does not need
> >> pointers to object interiors to be recognized should call
> >> GC_set_all_interior_pointers(0) at runtime before GC_INIT().
> >> 
> >> PS. The presence of NO_EXECUTE_PERMISSION should typically
> >> positively influence the speed, I think.
> > I confirm that too. ALL_INTERIOR_POINTERS is the one that slows down
> > the performance. The other two have no impact.
> > 
> > Cheers,
> > 
> > --
> > Manuel
