
Summary
The O(1) scheduler can get itself into a scenario where two tasks
(a) share a single CPU and (b) always are runnable, yet one task
completely starves out execution in the other task.
Description
The problem is most easily illustrated on a single-CPU machine with
the test program below. The
program creates two tasks, with the child task constantly sending
SIGUSR1 to the parent and the parent simply counting until it
receives 10,000,000 signals. With the original Linux scheduler, the
two tasks share the CPU relatively equitably (about 60% vs 40%). With
the O(1) scheduler, the two start out working just fine but after a
few seconds, the child takes over and consumes all available CPU time
(over 99%).
Status
Fixed: This problem has been fixed
in recent versions of the 2.5/2.6 kernel. Finer-grained accounting
seems to have been needed for this issue to go away. The problem
does not occur, e.g., in kernel v2.6.0-test7.
- Analysis: Likely a bug in the dynamic priority
adjustment. The starvation starts to happen when the dynamic priority
of the two tasks is equal. The dynamic priority of both tasks then
stays stuck at that value. What ought to happen is that the parent
should eventually get a higher priority because it relinquishes the
CPU via sleep() after handling a signal.
- Scope:
believed to affect all O(1)-based kernels. Problem has been confirmed to
exist on:
- single-CPU 600MHz Pentium 3, kernel v2.4.20-8, Red Hat 9
- single-CPU 900MHz Itanium 2, kernel v2.5.69, Debian 3.0
- Workaround:
Renicing the child by just one priority level makes the problem go
away.
Credits
Thanks to Stéphane Eranian for reporting this problem and for
providing a test-case.
click here to download the test program
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
volatile unsigned long loop = 10000000;
void
handler (int n)
{
if (loop > 0)
--loop;
}
static int
child (void)
{
pid_t ppid = getppid ();
sleep (1);
while (1)
kill (ppid, SIGUSR1);
return 0;
}
int
main (int argc, char **argv)
{
pid_t child_pid;
int r;
loop = argc > 1 ? strtoul (argv[1], NULL, 10) : 10000000;
printf ("expecting to receive %lu signals\n", loop);
if ((child_pid = fork ()) == 0)
exit (child ());
signal (SIGUSR1, handler);
while (loop)
sleep (1);
r = kill (child_pid, SIGTERM);
waitpid (child_pid, NULL, 0);
return 0;
}
|