[Gc] Re: Problems with GC_size_map
hans.boehm at hp.com
Fri Feb 5 17:16:20 PST 2010
It looks to me like the representations of p1 and q1 get to be about 100KB? But there are only about 4 such objects live at a time? I would have guessed that that should be OK, so long as the false pointer density isn't too high for other reasons. It might be good to look at GC_dump() output to see what the size of the root set is, and possibly GC_PRINT_STATS output to see if there are any obvious issues. It does look like the collector may somehow be seeing more false pointers than we would like.
This is a very recent collector? There were some recent fixes to reduce the number of statically allocated regions that need to be scanned on Linux.
From: gc-bounces at napali.hpl.hp.com [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Juan Jose Garcia-Ripoll
Sent: Tuesday, February 02, 2010 10:09 AM
To: Ludovic Courtès
Cc: gc at napali.hpl.hp.com
Subject: Re: [Gc] Re: Problems with GC_size_map
On Tue, Feb 2, 2010 at 6:39 PM, Ludovic Courtès <ludo at gnu.org> wrote:
If bignums are stored in "atomic" memory regions, how could they lead to
My concern is not that the marking of the bignums is imprecise, but rather that the marking of the environment (C stack, interpreter stack, etc.) causes the set of blacklisted regions to grow and thus makes it more difficult to reclaim those bignums: the effort spent in garbage collection is larger. I use bignums precisely because they are atomic, so what is measured is just the pressure on the garbage collector due to the environment, not the newly created objects. On a 64-bit system the address space is large and I presume this helps garbage collection: most pointers found on the stack are recognized as lying outside the region of memory that the garbage collector handles.
The code is pretty simple and is shown below (Common Lisp). The three timings below follow more or less the expected proportionality on a 64-bit operating system, but are a factor of 4 larger on the same processor in 32-bit mode. It is not just the instruction set: in the 32-bit case statistical profiling reveals that 80% of the time is spent in the mark phase of the garbage collector, while in the 64-bit case that share drops significantly.
I have now managed to implement four marking strategies:
- Everything is allocated either with GC_MALLOC or GC_MALLOC_ATOMIC depending on the existence of pointers.
- Use of bitmaps and GC_malloc_explicitly_typed()
- Use of custom mark procedure by scanning field by field
- Use of custom mark procedure with globally stored bitmap
and they all give more or less the same timings, to within 10%. Strategy four (the custom mark procedure with a globally stored bitmap) seems to be optimal.
"computes numerator and denominator of exp(1) via continued
(let* ((p0 3)
(p1 (+ (* 6 p0) 1))
(q1 (+ (* 6 q0) 1))
(if (>= (* 2 (integer-length p0)) bits)
(return-from expinv (list p1 q1)))
(psetf p1 (+ (* i p1) p0) p0 p1)
(psetf q1 (+ (* i q1) q0) q0 q1)
(setf i (+ i 4)))))
(time (progn (expinv 100000) nil))
(time (progn (expinv 500000) nil))
(time (progn (expinv 1000000) nil))
Instituto de Física Fundamental, CSIC
c/ Serrano, 113b, Madrid 28006 (Spain)