[httperf] httperf: connection failed with unexpected error 0 & low performance

Rick Jones rick.jones2 at hp.com
Fri Oct 19 11:47:40 PDT 2012

On 10/19/2012 11:19 AM, Vikash Kumar wrote:
> Hi Rick,
>     System parameters are set to their optimal limits, e.g. the fd
> limits on the server and client are raised to 400000.
>     The local port range is also much higher, tcp fin-timeout has been
> reduced to 30 secs, etc. After setting all these things we are getting a
> higher number of connections with Intel back to back. The TCP recycle
> and reuse options are also set to 1.

The highest the local port range can go is gated by the port number 
being a 16-bit unsigned value, so 65535 at most.
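
A minimal sketch of that ceiling in Python.  The function name 
usable_ports and the example range are hypothetical, chosen only for 
illustration; the one fixed fact is that a TCP port number is 16 bits 
wide and so tops out at 65535.

```python
MAX_PORT = 0xFFFF  # 65535, the largest value a 16-bit port field can hold


def usable_ports(low, high):
    """Count the local ports in the inclusive range [low, high]."""
    if not (0 < low <= high <= MAX_PORT):
        raise ValueError("port range must satisfy 0 < low <= high <= 65535")
    return high - low + 1


# Example: a widened local port range of 1024..65535 (hypothetical setting).
print(usable_ports(1024, 65535))  # 64512
```

Each connection from one client address to one server address and port 
needs its own local port, so this count bounds the connections that pair 
can have open at once, however high the fd limits are raised.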

>     The point is that after setting up the 3-node topology and
> requesting connections via haproxy, performance drops drastically.

Assuming all your network statistics are "clean", I would next check 
whether any one of the CPUs on the system running haproxy is 
saturated.  In broad, handwaving terms, your haproxy is doing twice as 
much work as either the client or the server.
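
One way to look for a single saturated CPU on Linux is to diff two 
snapshots of /proc/stat.  What follows is a minimal Python sketch, 
assuming the usual per-CPU field order (user, nice, system, idle, 
iowait, irq, softirq, steal).  The function name cpu_busy_fractions and 
the sample numbers are made up for illustration.

```python
def cpu_busy_fractions(before, after):
    """Return {cpu_name: busy_fraction} between two /proc/stat snapshots."""

    def parse(text):
        stats = {}
        for line in text.splitlines():
            fields = line.split()
            # Per-CPU lines look like "cpu0 ..."; skip the aggregate "cpu" line.
            if fields and fields[0].startswith("cpu") and fields[0] != "cpu":
                # Use only user..steal; guest time is already counted in user.
                ticks = [int(v) for v in fields[1:9]]
                idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
                stats[fields[0]] = (sum(ticks), idle)
        return stats

    first, second = parse(before), parse(after)
    result = {}
    for name, (total_after, idle_after) in second.items():
        if name in first:
            total = total_after - first[name][0]
            idle = idle_after - first[name][1]
            if total > 0:
                result[name] = 1.0 - idle / total
    return result


# Made-up snapshots: cpu1 spends the whole interval in user + softirq time.
snap1 = "cpu0 100 0 50 850 0 0 0 0\ncpu1 100 0 50 850 0 0 0 0\n"
snap2 = "cpu0 110 0 55 935 0 0 0 0\ncpu1 150 0 50 850 0 0 50 0\n"
busy = cpu_busy_fractions(snap1, snap2)
print({name: round(frac, 2) for name, frac in busy.items()})
# {'cpu0': 0.15, 'cpu1': 1.0}
```

On a live system the two snapshots would come from reading /proc/stat a 
few seconds apart.  The point of looking per CPU is that the average 
across all CPUs can look low while one CPU is pegged.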

The client and server will each see every packet once, basically.  The haproxy 
(unless it is caching) will see everything twice.  It will have to 
accept the connection from the client, then establish a connection to 
the server.  It will see the request from the client and have to send it 
to the server.  It will see the response from the server and have to 
send it to the client.
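
The same accounting as a small Python sketch.  The segment sequence is a 
simplified assumption for one small request and response; real stacks 
may piggyback or delay ACKs, so only the ratio matters here, not the 
absolute counts.

```python
# Segments exchanged on ONE TCP connection for one small request/response.
# Simplified, assumed sequence; real stacks may merge or add ACKs.
SEGMENTS_PER_CONNECTION = [
    "SYN", "SYN-ACK", "ACK",       # connection setup
    "request", "ACK",              # request and its acknowledgement
    "response", "ACK",             # response and its acknowledgement
    "FIN", "ACK", "FIN", "ACK",    # teardown in both directions
]


def packets_seen(connections_terminated):
    """Packets a node handles per transaction, given how many TCP
    connections it terminates for that transaction."""
    return connections_terminated * len(SEGMENTS_PER_CONNECTION)


client = packets_seen(1)  # one connection, to the proxy
server = packets_seen(1)  # one connection, from the proxy
proxy = packets_seen(2)   # one connection on each side
print(client, server, proxy)  # 11 11 22
```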

>     All three NICs are 10G Intel cards and we are using SFP
> cables.

Don't let the NICs being 10GbE distract you.  In broad handwaving terms, 
it takes just as many cycles to send or receive a frame over 10GbE as it 
does over 1 GbE as it did over 100BT as it did over 10BT.   Nothing in 
the IEEE specifications has changed that.
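
To put rough numbers on that, here is a minimal Python sketch.  The 
clock rate and the cycles-per-frame figure are assumptions picked purely 
for illustration, not measurements.  The wire-side arithmetic is the 
standard one: each Ethernet frame also occupies 20 bytes of preamble, 
start-of-frame delimiter and inter-frame gap.

```python
WIRE_OVERHEAD_BYTES = 20  # preamble + SFD (8 bytes) and inter-frame gap (12)


def line_rate_fps(link_bps, frame_bytes):
    """Frames per second the link itself can carry at a given frame size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


def cpu_limited_fps(cpu_hz, cycles_per_frame):
    """Frames per second one core can process at a fixed cost per frame."""
    return cpu_hz / cycles_per_frame


CPU_HZ = 3e9             # assumed 3 GHz core
CYCLES_PER_FRAME = 5000  # assumed per-frame cost, illustrative only

for name, bps in [("1 GbE", 1e9), ("10GbE", 10e9)]:
    wire = line_rate_fps(bps, 64)  # minimum-size 64-byte frames
    cpu = cpu_limited_fps(CPU_HZ, CYCLES_PER_FRAME)
    print(f"{name}: wire {wire:,.0f} fps, one core {cpu:,.0f} fps, "
          f"achievable {min(wire, cpu):,.0f} fps")
```

With these assumed figures the core, not the link, is the limit in both 
cases, so the faster link changes nothing for small frames.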

The only things that have brought down overhead have been implementation 
specifics.  Interrupt avoidance just started to appear with 
implementations of 100BT (typically in the form of not taking a transmit 
completion interrupt after every transmit) and continued with 1 GbE.  
Cards implementing 1 GbE are where we start to see coalescing of receive 
interrupts, Checksum Offload (CKO) and TCP Segmentation Offload (TSO).  
Implementations of 10GbE NICs started giving us multi-queue and the 
occasional Large Receive Offload (LRO).  And CKO, TSO and LRO/GRO only 
help when one is sending more than an MTU/MSS-worth of data at a time.
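
A minimal Python sketch of why segmentation offload does nothing for 
small sends.  It is a deliberately simplified model: it assumes the 
whole payload fits in one offloaded send and that without TSO the host 
stack is traversed once per segment.  The function names and the MSS of 
1448 bytes (a 1500-byte MTU with TCP timestamps) are assumptions for 
illustration.

```python
import math


def wire_segments(payload_bytes, mss):
    """Number of TCP segments needed to carry the payload on the wire."""
    return max(1, math.ceil(payload_bytes / mss))


def stack_traversals(payload_bytes, mss, tso):
    """Times the host stack is traversed to send the payload.

    Simplified model: with TSO the NIC does the splitting, so the host
    hands down one large buffer; without it, one traversal per segment.
    """
    return 1 if tso else wire_segments(payload_bytes, mss)


MSS = 1448  # assumed

for payload in (200, 64000):
    without = stack_traversals(payload, MSS, tso=False)
    with_tso = stack_traversals(payload, MSS, tso=True)
    print(f"{payload} bytes: {without} traversal(s) without TSO, "
          f"{with_tso} with TSO")
```

In this model a 200-byte send costs one traversal either way, while the 
64000-byte send drops from 45 traversals to one.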

However, through all of that, apart from gradually improving card 
programming models, it still takes just as many cycles to send/receive 
a small packet as before.

rick jones
