[Gc] Newbie: how to deal with a giant linked list?

Jan Wielemaker J.Wielemaker at vu.nl
Sun Dec 18 11:00:07 PST 2011


Hi Bruce,

Thanks for the reply.

On 12/18/2011 03:55 PM, Bruce Hoult wrote:
> On Sun, Dec 18, 2011 at 11:21 PM, Jan Wielemaker<J.Wielemaker at vu.nl>  wrote:
>> Hi,
>>
>> I've got the following problem. I must store a multi-way linked list
>> (i.e., each cell has a (fixed) array of `next' pointers). This list is
>> huge. My current tests are with about 10 million objects, but we want to
>> be able to use several 100 millions. The app is multi-threaded and I want to
>> be able to have writers that throw cell out and readers that
>> concurrently walk over the data without locking.
>
> Parallel marking is unlikely to help unless you can start different
> threads in very different parts of the list, or they diverge quickly.

Well, that appears to work rather well, considering that CPU time/wall-time
is about 7 on a hyper-threaded quad-core.

> Can you actually scan through the list faster than the GC time? You
> could well be running into cache and/or TLB thrashing if the Next
> links don't usually point to items on the same memory page and each
> page is visited many times.

I don't know the answer to that question. Bottom line is that I do not
care. As long as cells are part of the structure, I don't need to
have GC dealing with them. I want GC starting to deal with them from the
moment I disconnect them from the global structure because they might be
involved in some operation by some other thread (i.e., I'm mostly
interested in references from registers or the stacks of other
threads).

> You can't change an object from uncollectable to collectable. They are
> stored in completely different pools. There are never collectable
> objects on the same memory page as uncollectable ones.  The same is
> true of atomic and non-atomic.

That is (one of) the part(s) of the discussion I'm interested in. Yes,
they seem to be on a different page. So far, it seems that a page is
allocated for objects of size N and type K. Then, these objects are
combined into a linked list. Moving an object to a different linked list
should be feasible. But, I'm a newbie where it concerns the Boehm GC
code and I'm interested to know whether I'm overlooking some fundamental
limitation of the implementation.

In other words, I'm trying to find out whether

    (a) There is a much better plan to my problem, being ...
    (b) Changing object types is feasible, deemed useful and a patch
        that allows for this will be accepted into the main source
        (provided ...).  Note that I might need some guidance, but I'm
        perfectly willing to do the dirty work myself.

	Cheers --- Jan


More information about the Gc mailing list