[Gc] Problems with GC performance using gcj (Repost text only)
bkeppler at tridentms.com
Thu Jan 13 11:20:26 PST 2011
To provide a little more information on our setup: GC_PRINT_STATS reports that we are running with a heap of roughly 71 MB. Our hardware is as follows:
CPU Model Name: Intel Celeron M
CPU Model No.: 13
CPU Model Family: 9
CPU Speed: 1 GHz
CPU Cache: 64 KB
Physical Memory: 1 GB
Updated hardware is not an available alternative to solve the problem.
We have had trouble finding information on how to tune the garbage collector, including how to set the heap size, preallocate the heap, or adjust collection strategies. We have been using gcj to compile our Java source to native code and have so far accepted the default settings. It was only through substantial searching that we discovered how to get the Boehm GC to print collection statistics (e.g. via the GC_PRINT_STATS environment variable). Any pointers to where the available configuration knobs are documented would be appreciated.
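For illustration only, here is a rough sketch of one way heap sizing might be controlled without rebuilding anything: a small launcher exports collector environment variables before starting the application. GC_INITIAL_HEAP_SIZE and GC_MAXIMUM_HEAP_SIZE are described in the Boehm GC's README.environment, but whether the collector bundled with a given gcj release honors them is an assumption that needs checking against that release. The helper name configure_gc_env and the sizes in the usage note are hypothetical.

```c
#define _POSIX_C_SOURCE 200112L  /* exposes setenv() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: export Boehm GC tuning variables into this process's
 * environment so that a program exec'd afterwards inherits them.
 * Returns 0 on success, -1 on bad arguments or if setenv() fails. */
int configure_gc_env(unsigned long initial_heap_bytes,
                     unsigned long max_heap_bytes)
{
    char buf[32];
    int n;

    /* An initial heap larger than the cap makes no sense. */
    if (initial_heap_bytes > max_heap_bytes)
        return -1;

    /* Keep the per-collection statistics already used in this thread. */
    if (setenv("GC_PRINT_STATS", "1", 1) != 0)
        return -1;

    /* Both sizes are passed to the collector as decimal byte counts. */
    n = snprintf(buf, sizeof buf, "%lu", initial_heap_bytes);
    assert(n > 0 && (size_t)n < sizeof buf);
    if (setenv("GC_INITIAL_HEAP_SIZE", buf, 1) != 0)
        return -1;

    n = snprintf(buf, sizeof buf, "%lu", max_heap_bytes);
    assert(n > 0 && (size_t)n < sizeof buf);
    if (setenv("GC_MAXIMUM_HEAP_SIZE", buf, 1) != 0)
        return -1;

    return 0;
}
```

A launcher would call something like configure_gc_env(256UL << 20, 512UL << 20) and then exec the gcj-built binary. The same variables can equally be set from the shell; the C form only keeps the settings next to the application.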
Ben Keppler, Software Engineer
Trident Micro Systems
E-mail: bkeppler at tridentms.com * Voice: 828.684.7474 * Fax: 8282.684.7874
From: bruce.hoult at gmail.com [mailto:bruce.hoult at gmail.com] On Behalf Of Bruce Hoult
Sent: Tuesday, January 11, 2011 7:31 PM
To: Ben Keppler
Cc: gc at linux.hpl.hp.com
Subject: Re: [Gc] Problems with GC performance using gcj (Repost text only)
On Wed, Jan 12, 2011 at 4:58 AM, Ben Keppler <bkeppler at tridentms.com> wrote:
> NOTE: Reposting as using text only.
> NOTE: This question was also posted on the gcj mailing list.
> We are using gcj for a time sensitive application. One of the requirements of the application is that messages be transmitted within a 60ms timeframe. Unfortunately, the Boehm GC used in gcj is not generational and thus every collection is of the "stop the world" variety. We are observing (using "GC_PRINT_STATS") regular collections that stop the world for periods in the 400ms range. This is a problem for us.
How big is your heap? And what is the CPU?
I'm seeing pause times of ~40 ms in a 75 MB heap on the d2c Dylan
compiler on my Core i7 @3.44 GHz. That with a pure stop-the-world
p.s. parallel-mark makes it *slower*, no matter whether I use anything
between 2 and 8 threads, so that is off
p.s.2. I actually get about 20% more throughput by preallocating the
heap to around 256 - 512 MB. There are still a few GCs but a lot fewer
and they still take ~40 ms even in the larger heap (the amount of live
data is the same). A typical compile for my test program allocates
around 1.4 GB. Disabling GCs/using GC_malloc_uncollectable, or using
malloc(), or even a simple bump-the-ptr malloc is slower than using
GC_malloc -- at some point it takes more time to create VM space than
to GC what you already have.
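For readers unfamiliar with the term, a "bump-the-ptr" allocator hands out memory by advancing an offset into one preallocated block and never frees individual objects. The following is a minimal sketch of that idea, not the allocator used in the measurements above; the names bump_arena, bump_init, bump_alloc and bump_release are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BUMP_ALIGN 16u  /* every allocation is rounded up to a multiple of this */

typedef struct {
    unsigned char *base;  /* start of the preallocated block */
    size_t size;          /* total bytes in the block */
    size_t used;          /* bytes handed out so far */
} bump_arena;

/* Preallocate the whole arena up front. Returns 0 on success, -1 on failure. */
int bump_init(bump_arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = (a->base != NULL) ? size : 0;
    a->used = 0;
    return (a->base != NULL) ? 0 : -1;
}

/* Allocate n bytes by bumping the offset. Returns NULL when the arena is full. */
void *bump_alloc(bump_arena *a, size_t n)
{
    size_t rounded;
    void *p;

    assert(a != NULL && a->base != NULL);

    /* Guard against overflow in the rounding step below. */
    if (n > (size_t)-1 - (BUMP_ALIGN - 1))
        return NULL;
    rounded = (n + (BUMP_ALIGN - 1)) & ~(size_t)(BUMP_ALIGN - 1);

    if (rounded > a->size - a->used)
        return NULL;

    p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Individual objects are never freed; the whole arena is released at once. */
void bump_release(bump_arena *a)
{
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

Each allocation costs a few instructions, but because nothing is ever reclaimed the arena has to cover everything the program allocates, which fits the observation above that obtaining fresh VM space can end up costing more than collecting.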