[Gc] Has anyone tried the GC on Nokia 770, N800 or arm qemu ?

Boehm, Hans hans.boehm at hp.com
Wed Jan 31 11:35:43 PST 2007


 
(I attached a copy of Stephane's message, with the log reformatted.)

This looks very strange indeed.  The signal delivery that would cause
10363 to enter the suspension handler seems conspicuously missing,
leaving it a mystery as to how it got to the handler.  Since 10363
printed the initial but not the final message from GC_stop_world(), it
must have somehow been forced into the signal handler.

I would suspect either an emulator problem or, as a remote possibility,
a stack overflow.  (Is this running in full system emulation mode?  If
so, which ARM kernel is it using?  Might that be old?  If not, does the
user-level emulation handle threads correctly?  Might there be a
mismatch between the ARM thread library and the underlying P4 kernel?)
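
One way to check the thread-library question on a glibc-based system
is to ask the C library which threading implementation it was built
with.  The sketch below assumes glibc's _CS_GNU_LIBPTHREAD_VERSION
confstr() name is available and falls back to "unknown" otherwise; the
helper name thread_lib_version is hypothetical.  The kill() call
addressed to a per-thread pid in the attached log suggests, but does
not prove, a LinuxThreads-style library.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Copies the threading library's self-description into buf.
   Returns 1 if the C library reported one, 0 otherwise.
   Assumption: glibc-style confstr() name; other libcs may lack it. */
int thread_lib_version(char *buf, size_t len)
{
    if (len == 0)
        return 0;
    buf[0] = '\0';
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    if (confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len) > 0) {
        buf[len - 1] = '\0';   /* defensive; confstr already terminates */
        return 1;
    }
#endif
    snprintf(buf, len, "unknown");
    return 0;
}
```

Calling it with a 64-byte buffer and printing the result inside the
emulated environment would show whether the ARM binaries use NPTL or
an older LinuxThreads build.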

Zoltan's message also points in those directions, I think.

Hans
-------------- next part --------------
From: Stephane Epardaud [Stephane.Epardaud at sophia.inria.fr]
Sent: Wednesday, January 31, 2007 2:41 AM
To: Boehm, Hans
Cc: gc at napali.hpl.hp.com
Subject: Re: [Gc] Has anyone tried the GC on Nokia 770, N800 or arm qemu ?

Boehm, Hans wrote:
> Is this a recent and reasonably standard Linux kernel?  This looks 
> superficially like pthread_kill initiated signals sometimes get 
> misdirected to the wrong thread?

I'd say so myself, but I have a hard time believing it. This is a 2.6.17 Linux kernel running on a P4, with ARM emulation by qemu; as far as I can tell, only the applications run on emulated ARM, not the kernel itself. I'm starting to believe it's this devkit that's fubar.

> Does strace work sufficiently that you can verify or refute that theory?

Here's the end of the session:

[pid 10363] write(1, "Stopping the world from 0x8003\n", 31
***Stopping the world from 0x8003) = 31
[pid 10363] getpid()                    = 10363
[pid 10363] write(1, "Sending suspend signal to 0x4000"..., 33
***Sending suspend signal to 0x4000
	) = 33
[pid 10363] kill(10357, SIGPWR <unfinished ...>
[pid 10357] <... rt_sigsuspend resumed> ) = ? ERESTARTNOHAND (To be restarted)
[pid 10363] <... kill resumed> )        = 0
[pid 10357] --- SIGPWR (Power failure) @ 0 (0) ---
[pid 10363] write(1, "result: 0\n", 10
***result: 0
	<unfinished ...>
[pid 10357] rt_sigreturn(0xbfb3f220 <unfinished ...>
[pid 10363] <... write resumed> )       = 10
[pid 10357] <... rt_sigreturn resumed> ) = -1 EINTR (Interrupted system call)
	## I'm wondering what this is (perhaps sem_wait(&GC_suspend_ack_sem)?):
[pid 10363] rt_sigprocmask(SIG_BLOCK, ~[INT QUIT ABRT TERM],  <unfinished ...>
[pid 10357] rt_sigsuspend([] <unfinished ...>
[pid 10363] <... rt_sigprocmask resumed> [RTMIN], 8) = 0
[pid 10363] rt_sigaction(SIGSEGV, {0x80516c0, [], SA_RESTORER, 0x8084ee8}, {0x8051390, ~[KILL STOP], SA_RESTORER|SA_SIGINFO, 0x8084ee0}, 8) = 0
[pid 10363] rt_sigaction(SIGSEGV, {0x8051390, ~[KILL STOP], SA_RESTORER|SA_SIGINFO, 0x8084ee0}, {0x80516c0, [], SA_RESTORER, 0x8084ee8}, 8) = 0 ## And how do we get there ?
[pid 10363] write(1, "Suspending 0x8003\n", 18) = 18
***Suspending 0x8003
[pid 10363] write(5, "\1\0\0\0\4\0\0\0\240\322\nB\0\0\0\0\0\0\0\0@\0\0\0\0\0"..., 148 <unfinished ...>
[pid 10361] <... poll resumed> [{fd=4, events=POLLIN, revents=POLLIN}], 1, 2000) = 1
[pid 10363] <... write resumed> )       = 148
[pid 10361] getppid( <unfinished ...>
[pid 10363] write(1, "suspending 0x8003\n", 18 <unfinished ...>
***suspending 0x8003
[pid 10361] <... getppid resumed> )     = 10357
[pid 10363] <... write resumed> )       = 18
[pid 10361] read(4,  <unfinished ...>
[pid 10363] rt_sigsuspend(~[INT QUIT ABRT TERM XCPU] <unfinished ...>
[pid 10361] <... read resumed> "\1\0\0\0\4\0\0\0\240\322\nB\0\0\0\0\0\0\0\0@\0\0\0\0\0"..., 148) = 148
[pid 10361] poll( <unfinished ...>
[pid 10362] sched_yield()               = 0
[pid 10362] sched_yield()               = 0
...
DEADLOCK

It seems the signal is sent to the correct thread...

