Re[2]: [Gc] Back to "GC Stack problem on Win32"

Ivan Maidanski ivmai at mail.ru
Thu Nov 13 15:07:49 PST 2008


Hi!

"Boehm, Hans" <hans.boehm at hp.com> wrote:
> Ivan -
> 
> Thanks.  However, I don't think I understand the problem here correctly. The old code should in nearly all cases only invoke GC_may_be_in_stack(thread -> last_stack_min).  This should be cheap, since it only walks a page or so of the stack, right?

Right, but only if the stack hasn't grown too much between collections.
By default, Win32 apps have StackCommit==4K. So, if the stack grew by 4MB, then VirtualQuery() would be called about 1000 times during the next collection. But for GC_push_stack_for() only one VirtualQuery() call is really required - just to check the sp value.
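
As a rough illustration of that count, here is a portable sketch. It is not the collector code: the function names are made up, and it simply assumes one query per committed chunk, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model, not the collector's code: number of region queries
 * needed to walk a newly grown stack area one committed chunk at a time. */
size_t queries_for_walk(size_t growth_bytes, size_t chunk_bytes)
{
    assert(chunk_bytes > 0);
    /* Round up: a partially used chunk still costs one query. */
    return (growth_bytes + chunk_bytes - 1) / chunk_bytes;
}

/* Checking only the region that contains sp costs a single query,
 * however far the stack has grown. */
size_t queries_for_sp_check(size_t growth_bytes)
{
    (void)growth_bytes; /* deliberately unused: the cost does not depend on it */
    return 1;
}
```

With growth_bytes = 4MB and chunk_bytes = 4K, the walk needs 1024 queries, while the sp check needs one.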

> 
> It seems like it would usually result in exactly one extra VirtualQuery call, as it walks off the end of the stack.  Since GC_may_be_in_stack() makes one call anyway, it seems to me that it should at most double the time spent there.

Try this test app:

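#include "gc.h"

/* GC_printf() is not declared in gc.h, so declare it here (no prototype). */
void GC_printf();

/* Recurse n levels deep to grow the stack; returns n. */
int f(int n) {
 return n > 0 ? f(n - 1) + 1 : 0;
}

/* Grow the stack n frames deep, then (after f has returned) force a collection. */
void test(char c, int n) {
 n = f(n);
 GC_printf("\n Test%c: N= %d\n\n", c, n);
 GC_gcollect();
}

/* Global, so the single small object stays reachable. */
void *obj;

int main(void) {
 int n;
 int max = 9 * 1000 * 1000;
 obj = GC_MALLOC(16);
 /* Depth grows by a factor of 1.5 each round; each depth is tested twice. */
 for (n = 100 * 1000; n <= max; n += n >> 1) {
  test('A',n); /* first collection after the stack growth */
  test('B',n); /* second collection at the same depth */
 }
 GC_printf("Done");
 return 0;
} // end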

Compile it with -fno-optimize-sibling-calls -Xlinker --stack -Xlinker 0x10000000
Set GC_PRINT_STATS=1 to see the world-stopped delay timings.

If you use your code, or just comment out the one line "if (sp < stack_min || sp >= thread->stack_base)" in GC_push_stack_for(), then you will see how much time is required to collect just a few bytes.

> 
> I don't like the reference to last_info in the patch, since that relies on side effects of GC_may_be_in_stack that I would like to keep as a private implementation detail of  GC_may_be_in_stack and GC_get_stack_min.  Is there a reason not to use thread -> last_stack_min instead of last_info.BaseAddress?

I don't like it either. But to tell the truth, the "caching" here really serves only to pass BaseAddress from GC_may_be_in_stack() to GC_get_stack_min(), even in the case of a single thread. Another solution (without "caching") is to make GC_may_be_in_stack() return BaseAddress or NULL (instead of a bool). This is simple to change (though the function should perhaps be renamed).
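
A minimal sketch of that second idea, written as portable C against a mock region table rather than VirtualQuery(). The struct, its fields and the function name are all hypothetical and only illustrate the "return the base address or NULL" shape.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for what a region query would report. */
struct region {
    char *base;    /* plays the role of BaseAddress */
    size_t size;   /* plays the role of RegionSize */
    int is_stack;  /* nonzero if the region looks like usable stack memory */
};

/* Hypothetical replacement for a bool-returning check: returns the base of
 * the region containing addr if that region is stack-like, else NULL.
 * The caller gets the base directly, so no cached side state is needed.
 * All regions are assumed to lie within one underlying buffer. */
char *stack_region_base(const struct region *regions, size_t n, const char *addr)
{
    size_t i;
    assert(regions != NULL || n == 0);
    for (i = 0; i < n; i++) {
        const struct region *r = &regions[i];
        if (addr >= r->base && addr < r->base + r->size)
            return r->is_stack ? r->base : NULL;
    }
    return NULL; /* addr is not covered by any known region */
}
```

The caller can still test the result like a bool (non-NULL means "in stack") and use the same value as the starting point for the walk down to the stack minimum.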

> 
> Hans
> 
> > ...
> > I've already pointed out some bugs.
> > Now I've just run the same test (having Your patch applied)
> > as I had run for my patch... And it turns out that Yours
> > doesn't solve the problem as it was originally stated (to say
> > more precisely, it doesn't reduce time for the first
> > collection after stack growth).
> >
> > Look into the things You had advised me before:
> > > 1) Initially call VirtualQuery on the sp.  If the stack
> > base is in the same region, we know we're OK, and don't need
> > GC_get_stack_min.  Hopefully this will be true about 100% of the time.
> >
> > So I did it for Your code now. It works.
> >
> > The patch is attached.
> >
> > Bye.

Bye.


