[Gc] Re: Performance of bdwgc7.2 had degraded compared to 6.8 - the
patch to test
Ludovic Courtès
ludo at gnu.org
Thu Dec 2 12:53:09 PST 2010
Hi Ivan,
Ivan Maidanski <ivmai at mail.ru> writes:
> It seems the observed degradation can be discovered by 2 tests:
> 1) by benchmarking v71 vs v72a2+test2_patch;
> 2) by benchmarking v71 vs v72a2+test3_patch.
>
> test2 patch reverts the relevant changes of:
> 2008-08-21 Hans Boehm <Hans.Boehm at hp.com>
>
> test3 patch reverts the relevant changes of:
> 2009-05-22 Hans Boehm <Hans.Boehm at hp.com> (Largely from Ludovic Cortes)
Damn, I feel guilty now. ;-)
Did you measure the effect of each patch individually? It would be
interesting to know.
For the record, the discussion that led to the second patch started here:
http://thread.gmane.org/gmane.comp.programming.garbage-collection.boehmgc/2570
The next-to-final patch was posted here:
http://thread.gmane.org/gmane.comp.programming.garbage-collection.boehmgc/2634
The intent was to /exclude/ ELF sections containing relocated read-only
data from the GC roots on GNU systems, thereby reducing the amount of
memory that needs to be scanned.
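To make the idea concrete, here is a minimal sketch of how one can see which loaded objects carry a PT_GNU_RELRO entry at all. It assumes a glibc-based GNU/Linux system, where dl_iterate_phdr() is declared in <link.h> under _GNU_SOURCE. The names seg_counts, count_cb and count_segments are made up for this illustration and are not part of BDW-GC.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <elf.h>
#include <link.h>
#include <stddef.h>

/* Hypothetical tally structure, for illustration only. */
struct seg_counts {
    int objects;  /* loaded objects visited */
    int load;     /* PT_LOAD program headers seen */
    int relro;    /* PT_GNU_RELRO program headers seen */
};

/* Callback invoked by dl_iterate_phdr() once per loaded object. */
static int count_cb(struct dl_phdr_info *info, size_t size, void *data)
{
    struct seg_counts *c = data;
    int i;

    (void)size;  /* unused in this sketch */
    c->objects++;
    for (i = 0; i < (int)info->dlpi_phnum; i++) {
        const ElfW(Phdr) *p = &info->dlpi_phdr[i];

        if (p->p_type == PT_LOAD)
            c->load++;
        else if (p->p_type == PT_GNU_RELRO)
            c->relro++;
    }
    return 0;  /* zero means: keep iterating */
}

/* Fill *out with counts over all objects currently loaded. */
void count_segments(struct seg_counts *out)
{
    assert(out != NULL);
    out->objects = 0;
    out->load = 0;
    out->relro = 0;
    dl_iterate_phdr(count_cb, out);
}
```

A RELRO count of zero would suggest that nothing was linked with `-z relro', in which case there is nothing for the exclusion logic to carve out.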
The effect on libguile was discussed here:
http://thread.gmane.org/gmane.lisp.guile.devel/9247
In this message, I wrote:
  However, when not linking with `-z relro', static allocation leads to
  slightly degraded performance and increased heap usage (perhaps due to
  misidentified pointers in the `.data.rel.ro' section?).  This is
  probably worth some investigation on the BDW-GC side.
So, ahem, I feel twice as guilty now...
Could the problem be caused by the search for LOAD segments when a
PT_GNU_RELRO is encountered, in GC_register_dynlib_callback? The code
does:
--8<---------------cut here---------------start------------->8---
  for( i = 0; i < (int)(info->dlpi_phnum); ((i++),(p++)) ) {
    switch( p->p_type ) {
      case PT_GNU_RELRO:
      /* This entry is known to be constant and will eventually be remapped
         read-only.  However, the address range covered by this entry is
         typically a subset of a previously encountered `LOAD' segment, so
         we need to exclude it.  */
        {
          for (j = n_load_segs; --j >= 0; ) {
--8<---------------cut here---------------end--------------->8---
It does look quadratic to me.
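To show where the cost would come from, here is a small self-contained model of that pattern. It is only a sketch, not the BDW-GC source: the sim_* names, the simplified header struct and the fixed-size table are all assumptions made for illustration. Each RELRO entry walks backwards over the LOAD ranges recorded so far, and the function returns how many inner-loop steps that took.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for program header types (not the <elf.h> values). */
enum sim_type { SIM_LOAD, SIM_RELRO };

struct sim_phdr {
    enum sim_type type;
    unsigned long start;
    unsigned long end;  /* exclusive */
};

/* A recorded root range.  After a RELRO hole is carved out, the part
   following the hole is kept in [start2, end2).  */
struct sim_load_seg {
    unsigned long start, end;
    unsigned long start2, end2;
};

#define SIM_MAX_LOAD_SEGS 1024

/* Record LOAD ranges in order and carve RELRO ranges out of them.
   Returns the number of inner-loop iterations spent searching.  */
long sim_register(const struct sim_phdr *ph, size_t n,
                  struct sim_load_seg *segs, int *n_segs)
{
    long steps = 0;
    size_t i;
    int j;

    *n_segs = 0;
    for (i = 0; i < n; i++) {
        switch (ph[i].type) {
        case SIM_LOAD:
            assert(*n_segs < SIM_MAX_LOAD_SEGS);
            segs[*n_segs].start = ph[i].start;
            segs[*n_segs].end = ph[i].end;
            segs[*n_segs].start2 = 0;
            segs[*n_segs].end2 = 0;
            (*n_segs)++;
            break;
        case SIM_RELRO:
            /* Backward search over everything recorded so far.  */
            for (j = *n_segs; --j >= 0; ) {
                steps++;
                if (ph[i].start >= segs[j].start
                    && ph[i].start < segs[j].end) {
                    segs[j].end2 = segs[j].end;   /* tail after the hole */
                    segs[j].end = ph[i].start;    /* head before the hole */
                    segs[j].start2 = ph[i].end;
                    break;
                }
            }
            break;
        }
    }
    return steps;
}
```

In this model the search stops after one step when the RELRO range belongs to the most recently recorded LOAD range, and only degrades to a full scan when the match is far back in the table or missing altogether.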
However, it would only impact initialization time, which is probably
negligible on long-running programs (such as the Bigloo benchmarks, I
suppose).
Hmm, OTOH, GC_is_visible calls GC_register_dynamic_libraries, which runs
the above code, so this could be a problem.
What do you think?
Thanks,
Ludo’.