[httperf] httperf: connection failed with unexpected error 99

Weppner, Harald harald.weppner at sap.com
Wed Feb 4 00:00:54 PST 2009


Hi Rick,

Thanks for the pointers - I increased the local port range and enabled the
fast time_wait recycling. Looking good so far ;-)

Cheerio, Harry.

-----Original Message-----
From: Rick Jones [mailto:rick.jones2 at hp.com] 
Sent: Tuesday, Feb 03, 2009 10:51 AM
To: Weppner, Harald
Cc: httperf at napali.hpl.hp.com
Subject: Re: [httperf] httperf: connection failed with unexpected error 99

Weppner, Harald wrote:
> Hi there,
> 
> We've been trying to use httperf 0.9.0 in an experiment and from a given 
> connection rate onwards (around 500 req/s) it issues the error
> 
> httperf: connection failed with unexpected error 99
> 
> In errno.h EADDRNOTAVAIL is defined as 99 - is this message simply 
> stating that I ran out of sockets on the client?

Probably not exactly. Chances are you still have plenty of sockets/file
descriptors. Likely as not it means you ran out of local port numbers - if
you were to look I suspect you would find a great many TCP endpoints in
TIME_WAIT.

Assuming httperf is using anonymous/ephemeral port numbers (i.e. leaving
local port number selection to the stack) you will probably see a number of
endpoints in TIME_WAIT equal to the size of the ephemeral port space.
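
One way to check this on a Linux client is to count the TIME_WAIT entries
in /proc/net/tcp. The following is a minimal Python sketch, not part of
httperf. It assumes the Linux /proc/net/tcp layout, in which the fourth
column holds the connection state in hex and 06 means TIME_WAIT; the
function name count_time_wait is made up for illustration.

```python
TIME_WAIT_STATE = "06"  # hex state code for TIME_WAIT in Linux /proc/net/tcp


def count_time_wait(proc_net_tcp_text):
    """Count TIME_WAIT endpoints in the text of a /proc/net/tcp listing."""
    count = 0
    lines = proc_net_tcp_text.splitlines()
    for line in lines[1:]:  # the first line is the column header
        fields = line.split()
        # fields: slot, local address:port, remote address:port, state, ...
        if len(fields) > 3 and fields[3] == TIME_WAIT_STATE:
            count += 1
    return count
```

On the client this could be called as
count_time_wait(open("/proc/net/tcp").read()) while the test is running.
It only covers IPv4 endpoints; /proc/net/tcp6 would need the same
treatment.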

You can either increase the ephemeral port space, or get httperf to pick
port numbers itself from a range of say 5000-65535 (port numbers being
16-bit unsigned quantities).

A TCP connection is named with the four-tuple of local/remote IP and
local/remote port.  A TCP endpoint will remain in TIME_WAIT for a certain
length of time (part of TCP's protection against accepting old, delayed
segments on a new connection - aka data corruption).  If the test is from a
single IP address on the client to a single well-known IP address and port
on the server, then if the connection rate exceeds:

sizeof(clientportspace)/lengthof(TIME_WAIT)

there will be an attempt to reuse the four-tuple of an endpoint which is
still in TIME_WAIT.  That will fail.
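
To put rough numbers on that limit, here is a small Python sketch. The
figures in it are assumptions, not values from this thread: an ephemeral
port range of 32768-61000 and a TIME_WAIT of 60 seconds, which were common
Linux defaults at the time. Check the actual values on your own client.

```python
def max_connection_rate(port_low, port_high, time_wait_seconds):
    """Upper bound on new connections per second from one client IP
    to one server IP and port: size of port space / length of TIME_WAIT."""
    port_count = port_high - port_low + 1  # the range is inclusive
    return port_count / time_wait_seconds


# Assumed defaults: ports 32768-61000, TIME_WAIT of 60 seconds.
default_rate = max_connection_rate(32768, 61000, 60)

# Widened range of 5000-65535, same assumed TIME_WAIT.
widened_rate = max_connection_rate(5000, 65535, 60)

print("default range: %.0f connections/s" % default_rate)
print("widened range: %.0f connections/s" % widened_rate)
```

Under those assumptions the default range tops out at roughly 470
connections per second, which is in the same neighborhood as the rate at
which the error was reported, and the widened range roughly doubles it.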

Things to do?  (In more or less the order I happen to prefer)

*) reduce the connection churn rate - do more in each connection.

*) use the maximum client port space possible, either by tuning the
ephemeral port limits, or via explicit port number selection via explicit
bind() calls in the client

*) use more than one IP address on the client with explicit bind() calls in
the application - each IP address gives another entire ephemeral port space
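
The explicit bind() approach in the last two items can be sketched in
Python as follows. This is an illustration only and not how httperf itself
does it; the function names connect_from and connect_from_range are
hypothetical. It assumes that setting SO_REUSEADDR lets bind() succeed on a
port that still has an old endpoint in TIME_WAIT, which is the usual
behavior on Linux.

```python
import errno
import socket


def connect_from(local_ip, local_port, remote_addr, timeout=5.0):
    """Open a TCP connection to remote_addr from a chosen local IP and port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Assumption: lets bind() succeed despite older TIME_WAIT endpoints.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.settimeout(timeout)
        sock.bind((local_ip, local_port))  # explicit local address and port
        sock.connect(remote_addr)
        return sock
    except OSError:
        sock.close()
        raise


def connect_from_range(local_ip, ports, remote_addr):
    """Try each local port in turn, skipping ones that are not usable."""
    for port in ports:
        try:
            return connect_from(local_ip, port, remote_addr)
        except OSError as exc:
            # Port busy, or the four-tuple is still in TIME_WAIT: try the next.
            if exc.errno not in (errno.EADDRINUSE, errno.EADDRNOTAVAIL):
                raise
    raise OSError(errno.EADDRNOTAVAIL, "no usable local port in range")
```

A client with several addresses could call connect_from_range once per
local IP, for example with range(5000, 65536) as the port list, so that
each address contributes its own port space.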

I would entertain reducing TIME_WAIT only as a very last resort.  TCP has
TIME_WAIT for a purpose.  Yes, it might be "OK" to reduce it to low levels,
maybe even bypass it entirely - in the lab - but then you will have a
configuration that should *not* be used outside the lab and so will not
match reality.

rick jones