[httperf] A problem with fd-unavail errors while running a test with httperf

Martin Arlitt arlitt@granite.hpl.hp.com
Fri, 12 Dec 2003 07:27:07 -0800 (PST)


Wengui-Yang

there are a number of possible reasons why you do not have enough file
descriptors.  what I'd recommend you do first is use the '-v' option to
httperf.  this will print more information about the number of file
descriptors httperf has available to it.

for example, if I set the number of file descriptors to 100, then run
httperf with '-v' I get the following:

$ ulimit -n 100
$ ulimit -n
100
$ httperf -v
httperf --verbose --client=0/1 --server=localhost --port=80 --uri=/
--send-buffer=4096 --recv-buffer=16384 --num-conns=1 --num-calls=1
httperf: maximum number of open descriptors = 100
Maximum connect burst length: 0

so just using '-v' will tell you exactly the number of file descriptors
available to httperf (it may not be the number you expect).
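
as a quick cross-check before running httperf, the shell's own limits can be printed directly. this is only a sketch: show_fd_limits is a made-up helper name, and it assumes a shell whose ulimit builtin accepts the -S (soft) and -H (hard) options, as bash does.

```shell
# show_fd_limits: hypothetical helper, not part of httperf.
# Prints the soft and hard open-file limits of the current shell;
# any program started from this shell inherits both values.
show_fd_limits() {
    printf 'soft nofile limit: %s\n' "$(ulimit -S -n)"
    printf 'hard nofile limit: %s\n' "$(ulimit -H -n)"
}

show_fd_limits
```

if the number httperf prints with '-v' is lower than both of these, something other than the shell limits is capping it.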

it is also possible that the security mechanisms on your client do not
allow you to set the number of file descriptors to an arbitrarily large
value. for example, on my redhat 8 system, I had to modify
/etc/security/limits.conf to allow me to use 32768 file descriptors (this
is in addition to increasing the system wide values, as you are doing in
your script).  invoking ulimit -n with a value greater than my per-user
limit doesn't work, even when that value is below the system wide setting.

$ grep arlitt /etc/security/limits.conf
arlitt           soft    nofile          32768
arlitt           hard    nofile          32768

(this is a new shell from the earlier example with 100 fds)
$ ulimit -n
32768

if I try to set it to a value above my limit, it doesn't work
$ ulimit -n 65000
bash: ulimit: open files: cannot modify limit: Operation not permitted

the system wide value on my machine is:
$ cat /proc/sys/fs/file-max
65536
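
the two ceilings above can be compared against a requested value before calling ulimit. the following is a minimal sketch; check_fd_request is a hypothetical helper, and it only compares the three numbers it is given rather than changing any limit.

```shell
# check_fd_request REQUESTED HARD_LIMIT FILE_MAX
# Hypothetical helper: reports which ceiling a requested per-process
# descriptor limit would run into. HARD_LIMIT may be "unlimited".
check_fd_request() {
    requested=$1
    hard=$2
    file_max=$3
    if [ "$hard" != "unlimited" ] && [ "$requested" -gt "$hard" ]; then
        # an unprivileged 'ulimit -n' above the hard limit is refused
        echo "exceeds hard limit"
    elif [ "$requested" -gt "$file_max" ]; then
        # more than the whole system may hold open at once
        echo "exceeds system-wide file-max"
    else
        echo "ok"
    fi
}

# the numbers from the transcript above
check_fd_request 65000 32768 65536
```

on a live system the last two arguments could be taken from "$(ulimit -H -n)" and "$(cat /proc/sys/fs/file-max)".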

another problem could be that httperf has been compiled with a limit that
is too small.  in particular, you need to be aware of the value of
FD_SETSIZE:

$ ulimit -n
32768
$ ./httperf -v
httperf --verbose --client=0/1 --server=localhost --port=80 --uri=/
--send-buffer=4096 --recv-buffer=16384 --num-conns=1 --num-calls=1
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open
files to FD_SETSIZE
httperf: maximum number of open descriptors = 1024
Maximum connect burst length: 0

in this case, the limit is the value of FD_SETSIZE
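
a rough way to see whether such a cap matters for a given test is to compare it with the number of connections that could be open at once. the sketch below rests on a pessimistic assumption that is not from httperf itself: every connection stays open until the timeout expires. both function names are hypothetical.

```shell
# effective_fd_limit RLIMIT FD_SETSIZE
# The smaller of the two values, mirroring the warning shown above.
effective_fd_limit() {
    if [ "$1" -gt "$2" ]; then echo "$2"; else echo "$1"; fi
}

# worst_case_fds RATE TIMEOUT
# Connections open at once if each one lasts the full timeout
# (an assumed worst case, not a measurement).
worst_case_fds() {
    echo $(( $1 * $2 ))
}

effective_fd_limit 32768 1024   # limit after the FD_SETSIZE cap
worst_case_fds 1200 5           # --rate=1200 --timeout=5
```

with --rate=1200 and --timeout=5 the worst case is 6000 descriptors, well above 1024. the quoted results also report "<=1022 concurrent connections", which would be consistent with a 1024 descriptor cap, although the output alone does not prove it.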

in /usr/include/bits/typesizes.h (the actual file will vary by OS/version)
/* Number of descriptors that can fit in an `fd_set'.  */
#define __FD_SETSIZE            1024

change this to something like:

/* Number of descriptors that can fit in an `fd_set'.  */
#define __FD_SETSIZE            32768 /* 1024 */

then rebuild httperf:
$ make clean
$ make
$ ulimit -n
32768
$ ./httperf -v
httperf --verbose --client=0/1 --server=localhost --port=80 --uri=/
--send-buffer=4096 --recv-buffer=16384 --num-conns=1 --num-calls=1
httperf: maximum number of open descriptors = 32768
Maximum connect burst length: 0

one or both of these should solve your problem.

if you had to change FD_SETSIZE, you should change it back so that you do
not affect any other program that gets compiled on that machine.
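
one way to make the change easy to undo is to keep a copy of the header and restore it after the build. this is a sketch only: set_fd_setsize and restore_fd_setsize are made-up names, the header path is whatever applies on the system in question, and writing to a system header normally needs root.

```shell
# set_fd_setsize HEADER VALUE
# Hypothetical helper: saves HEADER as HEADER.orig (only if no saved
# copy exists yet) and rewrites the number on the __FD_SETSIZE define.
set_fd_setsize() {
    header=$1
    value=$2
    [ -f "$header.orig" ] || cp -p "$header" "$header.orig" || return 1
    sed "s/^\(#define[[:space:]]\{1,\}__FD_SETSIZE[[:space:]]\{1,\}\)[0-9]\{1,\}/\1$value/" \
        "$header.orig" > "$header"
}

# restore_fd_setsize HEADER
# Puts the saved copy back in place and removes the .orig file.
restore_fd_setsize() {
    mv "$1.orig" "$1"
}
```

a typical sequence would be set_fd_setsize on the header, then make clean and make in the httperf tree, then restore_fd_setsize on the same header.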

also, I noticed in your script that you called 'ulimit -n' before setting
the system wide values.  you should swap the order, so that the system wide
values are raised first and 'ulimit -n' is called afterwards.
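
the reordered steps might look like the sketch below. it is a dry run by default: RUN and raise_limits are hypothetical names, and with RUN=echo the commands are only printed. setting RUN to an empty string would execute them, which needs root for the sysctl call.

```shell
# Dry run unless RUN is explicitly set to an empty string.
RUN=${RUN-echo}

# raise_limits TARGET
# Raises the system-wide ceiling first, then the per-shell limit.
raise_limits() {
    target=$1
    $RUN sysctl -w "fs.file-max=$target"
    $RUN ulimit -n "$target"
}

raise_limits 32768
```

the value 32768 is only an example; whatever is used still has to fit under the per-user hard limit discussed earlier.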

another thing you will need to be aware of is the number of connections in
a TIME_WAIT state.  if the connections are going into a TIME_WAIT state on
the client, you could still run out of ports, even though you have
increased the number that are available.  to avoid this you can enable TCP
time wait recycling (/proc/sys/net/ipv4/tcp_tw_recycle)
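
to judge whether TIME_WAIT is likely to be a problem, the sockets in that state can be counted and the port budget estimated. this is a sketch under stated assumptions: count_time_wait and max_sustained_rate are hypothetical helpers, the input is in the Linux /proc/net/tcp layout (fourth column is the state in hex, 06 meaning TIME_WAIT), and a socket is assumed to stay in TIME_WAIT for 60 seconds.

```shell
# count_time_wait
# Reads /proc/net/tcp-style text on stdin, skips the header line and
# counts the rows whose state column is 06 (TIME_WAIT).
count_time_wait() {
    awk 'NR > 1 && $4 == "06" { n++ } END { print n + 0 }'
}

# max_sustained_rate LOW_PORT HIGH_PORT TIME_WAIT_SECONDS
# Connections per second to a single server address and port that the
# local port range can sustain if every port then sits in TIME_WAIT.
max_sustained_rate() {
    echo $(( ($2 - $1 + 1) / $3 ))
}

max_sustained_rate 1024 65535 60
```

on a live client the count would come from "count_time_wait < /proc/net/tcp". with ports 1024 to 65535 and a 60 second wait the estimate is 1075 connections per second. note also that the tcp_tw_recycle file may be missing on newer kernels (it was removed in Linux 4.12), so its presence is worth checking before relying on it.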

Martin


On Fri, 12 Dec 2003, Wengui-Yang wrote:

> httperf
>
> 	I ran a web-server test with the script below, but the results
> show many errors due to fd-unavail, even at 1200 connections per
> second in the first test. In the script I set some kernel parameters,
> but some of the settings may be wrong.
> Could you give me some advice on it? Thank you in advance!!! :-)
>
> The test is only to learn how to use httperf, not a formal test,
> and the parameter settings are to learn which values are appropriate.
> Additionally, the client and web server were connected through a 100M hub.
>
> This is test script:
>
> #!/bin/sh
> #
> # The script was run on multiple machine
> # to test web system serve capability
> #
> if [ $# -ne 3 ]
> then
> echo "usage: $SCRIPT <web server> <timeout> <client - 0/n>"
> exit 1
> fi
>
> WWW=$1
> #RATE=$2
> TIMEOUT=$2
> CLIENT=$3
>
> #bump up resource limits on client machine
> echo 1048576 > /proc/sys/net/core/rmem_max
> echo 1048576 > /proc/sys/net/core/rmem_default
> echo 1048576 > /proc/sys/net/core/wmem_max
> echo 1048576 > /proc/sys/net/core/wmem_default
>
> ulimit -n 800000
>
> sysctl -w 'fs.file-max=800000'
> #sysctl -w 'fs.inode-max=32768'
> sysctl -w 'net.ipv4.ip_local_port_range=1024 65536'
>
> HTTPERF=/bin/httperf
> URI=/index.html
> HTTP="1.0"
>
> CMD="--server $WWW --port 80 --uri $URI --http-version $HTTP \
> 	--num-call 1 --client $CLIENT"
>
> echo "# $(hostname)############################################"
>
> touch  ./resultfile
>
> # initial connection rate per second
>
> TIMES=0
> RATE=1000
> while [ $TIMES -le 45 ]
> do
> ((TIMES=$TIMES+1))
> ((RATE=$TIMES*200+1000))
> CONN=$(($RATE*240))
> echo "-------------------------------------------">>./resultfile
> #echo $(TIMES)>>./resultfile
> $HTTPERF $CMD --num-conns $CONN --rate=$RATE --timeout=$TIMEOUT >> ./resultfile
> sleep 240
> done
> echo ""
>
>
> This is part of test result:
> -------------------------------------------
>
> httperf --timeout=5 --client=0/1 --server=172.16.50.117 --port=80 --uri=/index.html --http-version=1.0 --rate=1200 --send-buffer=4096 --recv-buffer=16384 --num-conns=288000 --num-calls=1
> Maximum connect burst length: 3
>
> Total: connections 283217 requests 259644 replies 257340 test-duration 243.010 s
>
> Connection rate: 1165.5 conn/s (0.9 ms/conn, <=1022 concurrent connections)
> Connection time [ms]: min 1.0 avg 391.2 max 7741.5 median 118.5 stddev 905.8
> Connection time [ms]: connect 198.7
> Connection length [replies/conn]: 1.000
>
> Request rate: 1068.4 req/s (0.9 ms/req)
> Request size [B]: 74.0
>
> Reply rate [replies/s]: min 974.3 avg 1069.0 max 1108.4 stddev 36.7 (48 samples)
> Reply time [ms]: response 200.6 transfer 0.0
> Reply size [B]: header 327.0 content 2890.0 footer 0.0 (total 3217.0)
> Reply status: 1xx=0 2xx=257340 3xx=0 4xx=0 5xx=0
>
> CPU time [s]: user 9.02 system 234.00 (user 3.7% system 96.3% total 100.0%)
> Net I/O: 3404.1 KB/s (27.9*10^6 bps)
>
> Errors: total 30660 client-timo 25877 socket-timo 0 connrefused 0 connreset 0
> Errors: fd-unavail 4783 addrunavail 0 ftab-full 0 other 0
> -------------------------------------------
>
> httperf --timeout=5 --client=0/1 --server=172.16.50.117 --port=80 --uri=/index.html --http-version=1.0 --rate=1400 --send-buffer=4096 --recv-buffer=16384 --num-conns=336000 --num-calls=1
> Maximum connect burst length: 3
>
> Total: connections 286333 requests 259709 replies 256860 test-duration 242.978 s
>
> Connection rate: 1178.4 conn/s (0.8 ms/conn, <=1022 concurrent connections)
> Connection time [ms]: min 1.0 avg 369.1 max 7751.7 median 118.5 stddev 860.7
> Connection time [ms]: connect 164.8
> Connection length [replies/conn]: 1.000
>
> Request rate: 1068.9 req/s (0.9 ms/req)
> Request size [B]: 74.0
>
> Reply rate [replies/s]: min 972.0 avg 1067.1 max 1101.1 stddev 33.5 (48 samples)
> Reply time [ms]: response 209.2 transfer 0.0
> Reply size [B]: header 327.0 content 2890.0 footer 0.0 (total 3217.0)
> Reply status: 1xx=0 2xx=256860 3xx=0 4xx=0 5xx=0
>
> CPU time [s]: user 8.90 system 234.09 (user 3.7% system 96.3% total 100.0%)
> Net I/O: 3398.3 KB/s (27.8*10^6 bps)
>
> Errors: total 79140 client-timo 29473 socket-timo 0 connrefused 0 connreset 0
> Errors: fd-unavail 49667 addrunavail 0 ftab-full 0 other 0
> -------------------------------------------
>
> httperf --timeout=5 --client=0/1 --server=172.16.50.117 --port=80 --uri=/index.html --http-version=1.0 --rate=1600 --send-buffer=4096 --recv-buffer=16384 --num-conns=384000 --num-calls=1
> Maximum connect burst length: 4
>
> Total: connections 287373 requests 260468 replies 257428 test-duration 243.000 s
>
> Connection rate: 1182.6 conn/s (0.8 ms/conn, <=1022 concurrent connections)
> Connection time [ms]: min 1.0 avg 362.7 max 7973.8 median 118.5 stddev 848.0
> Connection time [ms]: connect 161.1
> Connection length [replies/conn]: 1.000
>
> Request rate: 1071.9 req/s (0.9 ms/req)
> Request size [B]: 74.0
>
> Reply rate [replies/s]: min 967.7 avg 1069.3 max 1104.4 stddev 32.4 (48 samples)
> Reply time [ms]: response 206.1 transfer 0.0
> Reply size [B]: header 327.0 content 2890.0 footer 0.0 (total 3217.0)
> Reply status: 1xx=0 2xx=257428 3xx=0 4xx=0 5xx=0
>
> CPU time [s]: user 9.44 system 233.56 (user 3.9% system 96.1% total 100.0%)
> Net I/O: 3405.6 KB/s (27.9*10^6 bps)
>
> Errors: total 126572 client-timo 29945 socket-timo 0 connrefused 0 connreset 0
> Errors: fd-unavail 96627 addrunavail 0 ftab-full 0 other 0
>
>         Wengui-Yang
>         wgyang@mailst.xjtu.edu.cn
>           2003-12-12