[httperf] Bad file descriptor in select
Rick Jones
rick.jones2 at hp.com
Thu Feb 15 09:48:28 PST 2007
> if you are not sure if it is running out of file descriptors, one way to
> check is the following; re-run your experiment with a high request rate
> until the error occurs. as soon as the error occurs, check how many
> connections are in the TIME_WAIT state.
> depending on which OS you are using, you should be able to run the
> following command:
>
> netstat -an | grep TIME_WAIT | wc -l
>
> if that returns a number about as large as the number of file descriptors
> available to httperf, then that is likely the cause.
A TCP connection in TIME_WAIT should not still be associated with a
socket/file descriptor. Well, I _guess_ it could be if the remote had
sent a FIN AND httperf had done shutdown() AND not yet called close(),
but should that be possible?
Now, what would be interesting to know in the area of TIME_WAIT is
whether the number of connections in TIME_WAIT is near the size of the
local port number space available to httperf, if it relies on an
implicit bind() when making its connections, or near the size of the
port space httperf itself uses, if it makes explicit bind() calls
before calling connect().
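A rough sketch of that comparison, not anything httperf itself does. It
assumes Linux, where the implicit-bind port range can be read from
/proc/sys/net/ipv4/ip_local_port_range, and it assumes all connections
go to a single server address and port. The function names and the 0.9
threshold are made up for illustration.

```python
def count_time_wait(netstat_output):
    """Count lines of `netstat -an` output that contain a TIME_WAIT field."""
    return sum(1 for line in netstat_output.splitlines()
               if "TIME_WAIT" in line.split())


def ephemeral_port_count(range_text):
    """Size of the local port range, given text such as '32768 61000'."""
    low, high = (int(field) for field in range_text.split())
    return high - low + 1


def ports_nearly_exhausted(time_wait, port_count, threshold=0.9):
    """True if the TIME_WAIT count is within `threshold` of the port space."""
    return time_wait >= threshold * port_count
```

Feed count_time_wait() the output of "netstat -an" captured right after
the error, and ephemeral_port_count() the contents of the proc file
above. If httperf binds explicitly, substitute the size of whatever
port range it uses.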
If there aren't enough FDs, or if the available local ports have all
been consumed, then one would expect a socket(), connect() or bind()
call to fail. Presumably httperf would be paying attention to that,
but then bugs happen. One might be able to see the failures happen by
taking a system call trace of httperf while it is running: strace on
Linux, tusc on HP-UX, truss on Solaris, and so on.
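When reading such a trace, the errno on the failing call is the useful
part. The sketch below maps a few common errno values to the two
suspects discussed here. The helper name and the hint strings are my
own, and the exact errno a given OS returns for port exhaustion may
differ, so treat the table as an assumption to check against the trace.

```python
import errno

# Hypothetical hint table: errno value -> likely cause.
_HINTS = {
    errno.EMFILE: "per-process file descriptor limit reached",
    errno.ENFILE: "system-wide file table is full",
    errno.EADDRNOTAVAIL: "no local port available, port range may be exhausted",
    errno.EADDRINUSE: "requested local address/port is already in use",
    errno.EBADF: "descriptor is not open, e.g. select() on a closed FD",
}


def explain_socket_error(err):
    """Return 'NAME: hint' for an errno seen on socket()/bind()/connect()."""
    name = errno.errorcode.get(err, "UNKNOWN")
    hint = _HINTS.get(err, "not an obvious FD or port exhaustion error")
    return f"{name}: {hint}"
```

For example, a socket() call failing with EMFILE points at the FD
limit, while a connect() failing with EADDRNOTAVAIL points at the local
port space.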
rick jones