[httperf] httperf: connection failed with unexpected error 99
rick.jones2 at hp.com
Tue Feb 3 10:51:25 PST 2009
Weppner, Harald wrote:
> Hi there,
> We’ve been trying to use httperf 0.9.0 in an experiment and from a given
> connection rate onwards (around 500 req/s) it issues the error
> httperf: connection failed with unexpected error 99
> In errno.h EADDRNOTAVAIL is defined as 99 – is this message simply
> stating that I ran out of sockets on the client?
Probably not exactly. Chances are you still have plenty of sockets/file
descriptors. More likely you ran out of local port numbers - if you were to
look, I suspect you would find a great many TCP endpoints in TIME_WAIT.
Assuming httperf is using anonymous/ephemeral port numbers (i.e. leaving local
port number selection to the stack), you will probably see a number of
endpoints in TIME_WAIT equal to the size of the ephemeral port space.
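
One way to check this on a Linux client is sketched below. It is only an
illustration, not something httperf provides: the helper names are made up, and
it assumes the Linux procfs layout, where the fourth column of /proc/net/tcp
holds the connection state in hex and 06 means TIME_WAIT.

```python
# Sketch: count TCP endpoints in TIME_WAIT and size the ephemeral port range.
# Assumption: Linux procfs text formats; both helper names are hypothetical.

TIME_WAIT_STATE = "06"  # hex state code used for TIME_WAIT in /proc/net/tcp


def count_time_wait(proc_net_tcp_text):
    """Return the number of rows whose state column is TIME_WAIT."""
    count = 0
    lines = proc_net_tcp_text.splitlines()
    for line in lines[1:]:  # the first line is the column header
        fields = line.split()
        # fields: sl, local_address, rem_address, st, ...
        if len(fields) > 3 and fields[3] == TIME_WAIT_STATE:
            count += 1
    return count


def read_ephemeral_range(text):
    """Parse the two integers in /proc/sys/net/ipv4/ip_local_port_range."""
    low, high = (int(x) for x in text.split())
    return low, high
```

To use it, pass in the contents of /proc/net/tcp (IPv4 only) and of
/proc/sys/net/ipv4/ip_local_port_range, then compare the TIME_WAIT count with
high - low + 1.
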
You can either increase the ephemeral port space, or get httperf to pick port
numbers itself from a range of say 5000-65535 (port numbers being 16 bit unsigned
integers).
A TCP connection is named with the four-tuple of local/remote IP and local/remote
port. A TCP endpoint will remain in TIME_WAIT for a certain length of time (part
of TCP's protection against accepting old, delayed segments on a new connection -
aka data corruption). If the test is from a single IP address on the client to a
single well-known IP address and port on the server, then if the connection rate
is high enough to use up the client's port space within one TIME_WAIT interval,
there will be an attempt to reuse the four-tuple of an endpoint which is still in
TIME_WAIT. That will fail.
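
As a rough back-of-the-envelope check of that limit, the sketch below divides
the number of usable local ports by the TIME_WAIT duration. The numbers in the
usage note are assumptions, not values from the original report: an ephemeral
range of 32768-61000 and a 60 second TIME_WAIT, which I believe were common
Linux defaults.

```python
# Sketch: upper bound on the sustainable new-connection rate from one client IP
# to one server IP and port, if every closed connection sits in TIME_WAIT.
# The function name and its parameters are illustrative only.

def max_connection_rate(port_low, port_high, time_wait_seconds):
    """Connections per second before a four-tuple still in TIME_WAIT is reused."""
    ports = port_high - port_low + 1  # inclusive range of local ports
    return ports / time_wait_seconds
```

With the assumed defaults, max_connection_rate(32768, 61000, 60) comes to
roughly 470 connections per second, which would be in the same neighbourhood as
the rate where the error started. Widening the range to 5000-65535 gives a bit
over 1000 per second under the same TIME_WAIT assumption.
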
Things to do? (In more or less the order I happen to prefer)
*) reduce the connection churn rate - do more in each connection.
*) use the maximum client port space possible, either by tuning the ephemeral
port limits, or by explicit port number selection via explicit bind() calls in
the application
*) use more than one IP address on the client with explicit bind() calls in the
application - each IP address gives another entire ephemeral port space
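
For the two bind() options, the general shape is sketched below in Python
rather than in httperf's own C. This is not how httperf does it; the function
name is made up, and the caller is assumed to supply the local IP address and
the port range to try.

```python
import errno
import socket


def connect_from_port_range(local_ip, ports, remote_addr):
    """Try each local port in turn; return a connected socket or raise OSError."""
    last_error = None
    for port in ports:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((local_ip, port))  # explicit local IP and local port
            sock.connect(remote_addr)
            return sock
        except OSError as exc:
            sock.close()
            # Port busy or four-tuple unavailable: move on to the next port.
            if exc.errno in (errno.EADDRINUSE, errno.EADDRNOTAVAIL):
                last_error = exc
                continue
            raise
    raise OSError(errno.EADDRNOTAVAIL, "no usable local port in range") from last_error
```

Calling it with a different local_ip for each client address is what gives you
another full port space per address. A real load generator would also want to
remember where in the range it left off rather than rescanning from the start
on every connection.
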
I would entertain reducing TIME_WAIT only as a very last resort. TCP has
TIME_WAIT for a purpose. Yes, it might be "OK" to reduce it to low levels, maybe
even bypass it entirely - in the lab - but then you will have a configuration
that should *not* be used outside the lab and so will not match reality.