[Gc] Re[2]: Performance of bdwgc7.2 had degraded compared to6.8 - I think I found a solution/reason

Ivan Maidanski ivmai at mail.ru
Mon Mar 7 02:02:36 PST 2011


Hi Carsten,

Do you have a suggestion for how to remove the regression?
If not, could you try comparing 6.8 with an intermediate version, e.g. 7.0a2? There have been a lot of changes between 6.8 and 7.2.
Thanks.

Mon, 7 Mar 2011 10:48:22 +0100 "Carsten Kehler Holst" <kehler at pdc.dk>:

> My test program first allocates a list of large atomic blocks to lower
> the frequency of the gc.
> It then allocates a lot of lists, keeping pointers to a few of them
> (to get blocks which are not completely empty).
> This program shows a marked difference in runtime between 6.8 and
> 7.2a2 (approx. 30%).
> The performance can be regained by the improved clearing.
> 
> The program looks as follows (Visual Prolog):
> 
> class facts
> l : pointer* := [].
> list : (testType).
> bin : (binary).
> count : positive := 0.
> 
> clauses
> runAllTest():-
> profileTime::init(),
> % preallocate some memory 100 blocks of 1 Mb
> l := [ memory::allocAtomicHeap(1000000) || X = std::fromTo(1, 100)],
> foreach N = std::fromTo(1,3) do
> profileTime::start_pr("loop"),
> allocTest(80000000),
> profileTime::stop_pr("loop")
> end foreach,
> profileTime::printAndReset(stdio::getOutputStream()).
> 
> class predicates
> allocTest : (positive Count).
> clauses
> allocTest(0) :- !.
> allocTest(N) :-
> L = getstruct(4), %
> if count < 10000, N mod 100 = 0 then
> % save 1 list out of 100 and only the first 10000 
> assert(list(L)),
> count := count + 1
> end if,
> allocTest(N-1).
> 
> class predicates
> getstruct : (byteCount N) -> positive*.
> clauses
> getStruct(0) = [] :- !.
> getstruct(N) = [4|getStruct(N-1)].
> 
> -----Original Message-----
> From: Manuel.Serrano at inria.fr [mailto:Manuel.Serrano at inria.fr] 
> Sent: 5. marts 2011 19:38
> To: Carsten Kehler Holst
> Cc: Ivan Maidanski; Ludovic Courtes; gc at linux.hpl.hp.com
> Subject: Re: Performance of bdwgc7.2 had degraded compared to6.8 - I
> think I found a solution/reason
> 
> Hi Carsten,
> 
> I have tried the following in reclaim.c but I have not noticed any
> performance difference between BGL_MEMCPY1, BGL_MEMCPY2, and default.
> What have you tried exactly?
> 
> -----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----
> --- gc/bdwgc-7_2alpha5-20110107/reclaim.c       2010-03-05 15:26:16.000000000 +0100
> +++ /tmp/reclaim.c      2011-03-05 19:32:05.000000000 +0100
> @@ -142,9 +142,25 @@
> }
> #                   else
> p++; /* Skip link field */
> +#define BGL_MEMCPY1
> +#if defined( BGL_MEMCPY1 )
> +                     switch ( (q-p) % 4 ) {
> +                        while (p < q ) {
> +                           case 0: *p++ = 0;
> +                           case 1: *p++ = 0;
> +                           case 2: *p++ = 0;
> +                           case 3: *p++ = 0;
> +                        }
> +                     }
> +#else
> +#  if defined( BGL_MEMCPY2 )
> +                     memcpy( p, 0, (q-p)*sizeof(p) );
> +#  else                      
> while (p < q) {
> *p++ = 0;
> }
> +#  endif                     
> +#endif               
> #                   endif
> }
> bit_no += MARK_BIT_OFFSET(sz);
> -----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----
> 
> 
> --
> Manuel
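
For reference, the three clearing strategies that the quoted patch compares (plain loop, four-way unrolled loop, library call) can be sketched as standalone C. This is only an illustration, not code from reclaim.c: the `word` typedef and the names `clear_simple`, `clear_unrolled`, `clear_memset` and `clears_exactly` are made up for this sketch. Three details of the patch as quoted are worth keeping in mind when reading the timings: `BGL_MEMCPY1` is defined right before the `#if`, so only the first branch is ever compiled; the `case` labels in that branch are in ascending order, so a remainder of 1 writes three words before the loop test instead of one; and the `BGL_MEMCPY2` branch passes 0 as the source of `memcpy` and scales by `sizeof(p)`, where a `memset` scaled by `sizeof(*p)` is presumably what was intended. The sketch below uses descending labels and `memset`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned long word;   /* assumption: stand-in for the collector's word type */

/* Baseline: the plain word-by-word loop of the default code path. */
static void clear_simple(word *p, word *q)
{
    assert(p <= q);
    while (p < q)
        *p++ = 0;
}

/* Four-way unrolled clearing in Duff's-device style.
 * The labels run 0, 3, 2, 1 so that a remainder r writes exactly r words
 * on the first pass (and a remainder of 0 writes a full group of four). */
static void clear_unrolled(word *p, word *q)
{
    size_t n, passes;

    assert(p <= q);
    n = (size_t)(q - p);
    if (n == 0)
        return;
    passes = (n + 3) / 4;
    switch (n % 4) {
    case 0: do { *p++ = 0;
    case 3:      *p++ = 0;
    case 2:      *p++ = 0;
    case 1:      *p++ = 0;
            } while (--passes > 0);
    }
}

/* Library clearing: memset over the byte length of the range [p, q). */
static void clear_memset(word *p, word *q)
{
    assert(p <= q);
    memset(p, 0, (size_t)(q - p) * sizeof *p);
}

/* Check helper: returns 1 if `clear` zeroes exactly n words (n <= 14)
 * starting at index 1 of a buffer and leaves every other word untouched. */
static int clears_exactly(void (*clear)(word *, word *), size_t n)
{
    word buf[16];
    size_t i;

    for (i = 0; i < 16; i++)
        buf[i] = ~(word)0;
    clear(buf + 1, buf + 1 + n);
    for (i = 0; i < 16; i++) {
        word expected = (i >= 1 && i < 1 + n) ? 0 : ~(word)0;
        if (buf[i] != expected)
            return 0;
    }
    return 1;
}
```

Calling `clears_exactly` for each function over a range of lengths is a quick way to confirm that a variant neither stops short nor runs past `q` before it is timed.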


