[Gc] Questions about design choices wrt mmap

Boehm, Hans hans.boehm at hp.com
Sun Apr 17 17:36:50 PDT 2011

> -----Original Message-----
> From: gc-bounces at linux.hpl.hp.com [mailto:gc-bounces at linux.hpl.hp.com]
> On Behalf Of Erik Groeneveld
> Sent: Friday, April 15, 2011 2:22 AM
> To: gc
> Subject: [Gc] Questions about design choices wrt mmap
> L.S.,
> I am studying the GC because I'd like to fight memory fragmentation.
> It seems to me that mmap could really solve the problem of
> fragmentation best. When I look at the code however, I have a few
> questions that I couldn't find answers to on the site or in the code.
> 1.
> Unused hblks are only unmapped after they have not been used for some
> time. What is the reason to not unmap them as soon as they become
> free? Costs?
Right.  Unmapping and remapping forces the kernel to clear the block.  Having said that, I have little confidence that the logic here is anything close to optimal.
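To make the "unused for some time" rule concrete, here is a minimal sketch of such a delay policy. It is an illustration only, not the collector's actual code: the struct, its field names, and the threshold parameter are all assumptions made up for the example.

```c
#include <stddef.h>

/* Hypothetical bookkeeping for one free heap block.  The field names
   are invented for this sketch and are not taken from the collector. */
struct free_block {
    unsigned long freed_at_gc;  /* collection number at which it became free */
    int mapped;                 /* nonzero while the pages are still mapped */
};

/* Unmap only if the block has stayed free for at least `threshold`
   collections.  A threshold of 0 means "unmap as soon as it is free". */
int should_unmap(const struct free_block *b,
                 unsigned long current_gc,
                 unsigned long threshold)
{
    return b->mapped && current_gc - b->freed_at_gc >= threshold;
}
```

The trade-off sits entirely in the threshold: a larger value avoids paying the kernel's clearing cost for blocks that are reused quickly, at the price of holding on to physical memory longer.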
> 2.
> Unmapped hblks are not really unmapped, but remapped as inaccessible.
> (PROT_NONE).  What is the reason for this?
Remapping as PROT_NONE should also release the underlying memory.  Unlike a real unmap, it keeps the address range reserved, which prevents those blocks from being reused by something else, such as an intervening load of an unrelated dynamic library.
> 3.
> Later, unmapped blocks might be merged with adjacent free blocks by
> GC_merge_unmapped().  Why not unmap them all and map a new block when
> needed?
Same problem: potential cost in the kernel.  Same disclaimer.
> 4.
> If it were possible to just unmap blocks and use mmap for new blocks,
> then the fragmentation would completely vanish, is that true?
Basically yes.  And that may not really be a bad option.  Some experiments may be in order.  If we did that, we should probably track kernel-cleared pages, so that we don't clear them again.

We may need to be a bit careful here.  I need to spend more time looking back at your earlier messages, but it seems likely to me that you're exercising a particularly bad case here, and that the 7.x collectors should at least compensate by increasing GC frequency.  It may well be that always immediately unmapping large blocks can be made to win fairly consistently, but we should try some currently better-behaved tests as well.

> 5.
> Doing 4. would probably lead to many heap segments, as it would of
> course be necessary to let mmap decide about the addresses, is that a
> hard problem, to avoid at all costs?
We currently assume that we can remap at the same address; see (2) above.  I still believe that approach is sound, and I don't see a reason to change it.
> I do have some thoughts about the answers to these questions of
> course, but I might be completely wrong.  To me, it seems that the
> collector tries to maintain a more or less consecutive heap, reusing
> blocks and filling up the gaps as efficiently as possible.  So here is
> my meta-question:
> 6. What if one would not try to be efficient but just let mmap do its
> work, at least for large blocks, how would that change the answers to
> the questions above?  Would it enable a solution for the fragmentation
> problem?
I think it would solve the problem of not being able to reuse physical memory due to fragmentation.  You might still lose address space to fragmentation, which may be an issue on 32-bit machines.  I don't know how aggressive the kernel mmap implementation is in trying to avoid this problem, so it could get worse.

> Erik
> _______________________________________________
> Gc mailing list
> Gc at linux.hpl.hp.com
> https://www.hpl.hp.com/hosted/linux/mail-archives/gc/
