[httperf] httperf: connection failed with unexpected error 0 & low performance
rick.jones2 at hp.com
Fri Oct 19 11:47:40 PDT 2012
On 10/19/2012 11:19 AM, Vikash Kumar wrote:
> Hi Rick,
> System parameters are set up to optimal limit like fd's in server
> and client are raised to 400000
> Local port range is also much higher. tcp fin-timeout has been
> reduced to 30 secs etc. After setting all these things we are getting
> higher no. of connection with intel back to back. TCP recycle and reuse
> options are also 1.
The highest the local port range can go is gated by the port number being a 16-bit value, so it tops out at 65535.
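To put rough numbers on the 16-bit limit, here is a minimal back-of-envelope sketch. The 1024-65535 range and the 30 second hold time are assumptions for illustration only, not your actual settings, and with the reuse/recycle options enabled a port may become available sooner than the hold time assumed here.

```python
# Back-of-envelope: how many local ports a range offers, and the connection
# rate that range can sustain towards a single destination IP:port.
MAX_PORT = 65535  # largest value a 16-bit port number can hold


def ports_in_range(low, high):
    """Number of local ports in an inclusive range."""
    if not (0 < low <= high <= MAX_PORT):
        raise ValueError("range must satisfy 0 < low <= high <= 65535")
    return high - low + 1


def sustained_conn_rate(low, high, hold_seconds):
    """Upper bound on new connections/sec to one destination, assuming each
    local port stays unavailable for hold_seconds after it is used."""
    return ports_in_range(low, high) / hold_seconds


if __name__ == "__main__":
    # Hypothetical settings: range 1024-65535, each port held for 30 s.
    print(ports_in_range(1024, 65535))
    print(sustained_conn_rate(1024, 65535, 30))
```

The bound applies per source IP and destination IP:port pair, so adding source addresses on the client raises it.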
> Point is after the setup of 3 node topology and requesting
> connection via haproxy, performance is dropping drastically.
Assuming all your network statistics are "clean" then next I would look
to see if any one of the CPUs on the system on which the haproxy is
running is saturated. In broad, handwaving terms, your haproxy is
doing twice as much work as either the client or the server.
The client and server will see every packet once basically. The haproxy
(unless it is caching) will see everything twice. It will have to
accept the connection from the client, then establish a connection to
the server. It will see the request from the client and have to send it
to the server. It will see the response from the server and have to
send it to the client.
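One way to check for a single saturated CPU on the haproxy system is to take two samples of /proc/stat and compare the per-CPU counters. The following is only a sketch, assuming a Linux system; the function names and the 95% threshold are my own choices for illustration, and tools such as mpstat or top will report the same thing.

```python
import os
import time


def parse_proc_stat(text):
    """Return {cpu_name: (busy_ticks, total_ticks)} from /proc/stat text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep per-CPU lines ("cpu0", "cpu1", ...); skip the aggregate "cpu".
        if not fields or fields[0] == "cpu" or not fields[0].startswith("cpu"):
            continue
        # Columns: user nice system idle iowait irq softirq steal
        values = [int(v) for v in fields[1:9]]
        total = sum(values)
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        stats[fields[0]] = (total - idle, total)
    return stats


def utilization(before, after):
    """Busy fraction (0.0 to 1.0) of each CPU between two samples."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        if cpu not in before:
            continue
        busy0, total0 = before[cpu]
        elapsed = total1 - total0
        result[cpu] = (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return result


def saturated_cpus(before, after, threshold=0.95):
    """Names of CPUs whose busy fraction is at or above the threshold."""
    usage = utilization(before, after)
    return sorted(cpu for cpu, used in usage.items() if used >= threshold)


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    with open("/proc/stat") as f:
        first = parse_proc_stat(f.read())
    time.sleep(1)  # sampling interval; 1 s is an arbitrary choice
    with open("/proc/stat") as f:
        second = parse_proc_stat(f.read())
    print(saturated_cpus(first, second))
```

Busy time here includes irq and softirq, which matters because much of the network stack's work is accounted there. Run it on the haproxy system while the test is in progress; an average across all CPUs can look fine while one CPU is pegged.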
> All the three NIC cards are 10G Intel NIC card and we are using SFP
Don't let the NICs being 10GbE distract you. In broad handwaving terms,
it takes just as many cycles to send or receive a frame over 10GbE as it
does over 1 GbE as it did over 100BT as it did over 10BT. Nothing in
the IEEE specifications has changed that.
The only things that have brought down overhead have been implementation
specifics - such as interrupt avoidance just starting to happen with
implementations of 100BT (typically in the form of not taking a transmit
completion interrupt after every transmit) and then with 1 GbE. Cards
implementing 1 GbE are where we started to see coalescing of receive
interrupts, Checksum Offload (CKO) and TCP Segmentation Offload (TSO).
Implementations of 10GbE NICs started giving us multi-queue and the
occasional Large Receive Offload (LRO). And CKO, TSO and LRO/GRO only help
when one is sending more than an MTU/MSS-worth of data at a time.
However, through all of that, apart from gradually improving card
programming models, it still takes just as many cycles to send/receive
a small packet as before.
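To make the small-packet point concrete, here is a rough sketch of the arithmetic. The MSS of 1460 bytes assumes a 1500-byte MTU with IPv4 and no TCP options, and the clock rate and cycles-per-packet figures are purely hypothetical placeholders, not measurements of any particular system.

```python
import math


def segments_per_send(send_bytes, mss=1460):
    """TCP segments needed for one send; mss=1460 is an assumed value."""
    return max(1, math.ceil(send_bytes / mss))


def packets_per_second_per_core(cpu_hz, cycles_per_packet):
    """Packet rate one core can sustain at a given per-packet cycle cost.
    Link speed does not appear anywhere in this calculation."""
    return cpu_hz / cycles_per_packet


if __name__ == "__main__":
    # A small HTTP request fits in one segment, so segmentation offload
    # has nothing to split; a 64 KiB send is where it starts to pay off.
    print(segments_per_send(200))
    print(segments_per_send(65536))
    # Hypothetical: a 3 GHz core spending 10000 cycles per packet.
    print(packets_per_second_per_core(3e9, 10000))
```

With small requests and responses, each send is already a single segment, so the per-packet cost sets the ceiling whether the link is 1 GbE or 10GbE.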