[Gc] Newbie: how to deal with a giant linked list?

Jan Wielemaker J.Wielemaker at vu.nl
Sun Dec 18 03:41:23 PST 2011


On 12/18/2011 11:21 AM, Jan Wielemaker wrote:
> Hi,
>
> I've got the following problem. I must store a multi-way linked list
> (i.e., each cell has a (fixed) array of `next' pointers). This list is
> huge. My current tests are with about 10 million objects, but we want to
> be able to use several 100 millions. The app is multi-threaded and I
> want to
> be able to have writers that throw cell out and readers that
> concurrently walk over the data without locking.
>
> That is why I started considering Boehm-GC: I can make this work by
> locking the writers and by being careful about the ordering of the
> memory operations readers can do their work unlocked. Using Boehm-GC, I
> can leave the actual destruction of cells to the collector.
>
> Unfortunately, this doesn't scale. GC time on 10M cells is 1 sec (on 8
> cores; 7.5 sec CPU time). Is there a way to solve this?
>
> One line I was thinking along is this: alloc the cells using
> GC_MALLOC_ATOMIC_UNCOLLECTABLE(). That works just fine (GC time is about
> 0.01 sec.) Now, before removing a cell, I would like to be *modify* the
> state of the cell to what GC_MALLOC() would have created. I glanced
> through the source. This looks hard. (Im-)possible?
>
> Opinions?

Sorry for the self-reply. It just struck me that the stubborn interface
might solve this and I generated some experimental data using four
scenarios. I guess the scenario names are clear (malloc is using system
malloc).

System: Ubuntu 11.04, 64-bits on Intel i7 K2600, 16Gb main memory. Mem
is RES as reported by top. Load is the time to load the 10M cells from
file. GC is the time for a GC_collect(), GC-2 calling it the a second
time. First number is wall-time, 2nd is process CPU time (note that this
is not 8 cores, but 4 hyperthreaded. They perform like 8 here). All
times are in seconds.

================================================================
	GC_MALLOC	Stubborn Atomic-Uncollectable  malloc
================================================================
Mem	 1.6Gb		1.6Gb		1.4Gb	       1.5Gb
Load	25.9	       25.3	       20.1	      21.7
GC	 1.0/7.7	3.0/9.1		0.18/1.2       0.15/1.0
GC-2	 1.0/7.0	1.0/7.7		0.18/1.3       0.14/0.9
================================================================

Note that uncollectable times are not as low as reported above, but this
is because the quick fix to switch scenarios still causes some of the
allocations to go through GC_MALLOC.  Most of that data has similar
characteristics.

The 3.0 seconds wall-time for the first GC after loading with Stubborn
is a bit weird, but I measured it twice. Does stubborn affect the
concurrency of the collector?

So	- Why does stuborn not help here?  Note that it does help
	  in places where we have large non-mutable objects.
	- My guess is that this is because the objects are known
	  to be collectable, which causes all of them to be visited.
           This is backed up by seeing similar figures for
           GC_MALLOC_ATOMIC().
	- Seems that being able to migrate objects from unmanaged
	  to managed (without copying their data!) would really help.
           Is the vital step from uncollectable to collectable?  Is
	  that feasible?

Sorry for the noise. I'm in the phase where I wish to decide to adopt
Boehm-GC. I most likely will do so if there is a solution to the above.
Either something I missed, or the possibility to enhance the collector
and have the changes in the main repository (I don't want to create a
fork for this project).

	Thanks --- Jan



More information about the Gc mailing list