[httperf] Benchmarking advice and thoughts

Yusuf Goolamabbas yusufg at outblaze.com
Sun May 27 20:07:49 PDT 2007


curl-loader is a load-testing tool that uses libevent and libcurl

http://curl-loader.sourceforge.net/



> Here is a cross-post from the OpenBSD mailing list.
> 
> Interesting advice.
> 
> After further investigating that really spiffy libevent library, I am
> working on porting httperf to libevent.
> http://en.wikipedia.org/wiki/Libevent
> http://www.monkey.org/~provos/libevent/
> 
> This moves most of the portability issues out of httperf and into the
> libevent library, where they more appropriately belong.
> 
> libevent has been ported to Linux, *BSD, Solaris, and Mac OS X, and
> unofficially to HP-UX.
> 
> Along with the improvements to portability, the use of this library
> provides a very clean mechanism to work around the FD limit.
> 
> Also, his comment on multi-homing is intriguing.  SCTP is the only major
> protocol that I know of that supports native multi-homing. If only for
> the sake of keeping up with the times, I think that supporting SCTP
> in httperf is useful and important.
> 
> http://tinyurl.com/226lmd (SCTP paper from www2006 - XHTML)
> or
> http://www2006.org/programme/files/pdf/2035.pdf (Paper - pdf)
> http://www2006.org/programme/files/pdf/2035-slides.pdf (Slides - pdf)
> http://en.wikipedia.org/wiki/Stream_Control_Transmission_Protocol
> 
> And a podcast on the topic
> http://bsdtalk.blogspot.com/2007/03/bsdtalk102-cisco-distinguished-engineer.html
> 
> Thoughts?
> 
> -Ted
> 
> Artur Grabowski wrote:
> > Ted Bullock <tbullock at canada.com> writes:
> > 
> >> Theo de Raadt wrote:
> >>> One very important part of the hackathon sub-project will be to
> >>> improve 10Gb support.  Some of us believe that measuring the
> >>> performance of 10Gb networking later will help us spot some
> >>> performance problems that can improve 1Gb ethernet speed.
> >> As a side note, we recently released a new version of httperf that now
> >> builds cleanly on OpenBSD.  It can be used to measure web system
> >> performance (including throughput).
> >>
> >> My documentation on the tool is available here
> >> http://www.comlore.com/httperf
> >>
> >> The official website is available here
> >> http://www.hpl.hp.com/research/linux/httperf/
> >>
> >> Hope it can be of some use in identifying performance bottlenecks.
> > 
> > I'm sorry to say this, but the last time we used httperf to optimize the
> > image servers on our site (third most page views in Sweden, see [1]),
> > we found that it measures the performance of the clients running the
> > benchmark more than the performance of the servers.
> > 
> > We wrote our own tool that does what we think is the right thing. The
> > idea was to make things maximally bad for the server to see how it
> > scales under load, rather than being nice, sending one request at a
> > time, and seeing how it performs under optimal conditions, which is
> > what httperf and most other tools do. When I'm back at the office
> > (I'm working in another country right now), I'll try to get permission
> > from my boss to release it under a free license. Until then, I have a
> > few tips for you. I don't know how much httperf has evolved since then,
> > so keep that in mind; some of these things may be fixed by now, and
> > some of the issues were with other benchmark tools we tested, so not
> > everything necessarily applies to httperf.
> > 
> > 1. Don't do any memory allocations or "rendering" of the requests
> >    while you're issuing them. Just allocate a huge blob of memory
> >    and make sure that all requests are ready to be pushed onto the
> >    wire before you start connecting to the servers. Otherwise
> >    you'll measure the performance of malloc(), the VM system, the
> >    page-zeroing algorithm in the kernel, and printf().
> > 
> > 2. Use asynchronous connect(). You do that by setting O_NONBLOCK on
> >    the socket before connect(); connect() will then fail with errno
> >    set to EINPROGRESS, and after you receive a write event from
> >    select() on the socket you can finish opening it. Otherwise you'll just do one connection
> >    at a time and that's not really realistic. Starting 1000 connects at
> >    the same time instead of waiting one round-trip for each connection
> >    does a _lot_ to the performance you get, both positively and
> >    negatively (depending on how hard you hit the server and how well
> >    it scales). Mixing connects with writes and reads does even more evil
> >    to the servers.
> > 
> > 3. Don't use select(). Select is slow and scales really badly. I
> >    suggest you simply use libevent since it provides a good
> >    abstraction for the various other good interfaces that do the same
> >    thing. On *BSD that would be kqueue(2); on Linux that would be
> >    epoll(7).
> > 
> > 4. Don't use gettimeofday(). clock_gettime(CLOCK_MONOTONIC, ...) is
> >    more accurate, has higher resolution, and doesn't suffer from the
> >    problems of the wall clock being changed.
> > 
> > 5. Try to emulate slow links (this is something we didn't do in the
> >    tool, but rather in our test setup, since it was more realistic to
> >    have the ACKs delayed for real and real packet loss). The most
> >    horrible load we see on the servers is not during the normal peak
> >    hours on record-traffic days, but rather on slow days during the
> >    summer when everyone has gone on vacation, sits in their summer
> >    house (vacations in Sweden are usually quite long and a lot of
> >    people go to the country) connected with some crappy modem they
> >    found in the attic, and tries to surf like usual, but on a link
> >    that's 100 times slower than what they have at home or at the
> >    office. This means lots of packet loss because of long phone lines
> >    going out to the summer house, lots of connections where the phone
> >    line drops, lots of pictures that they haven't bothered to finish
> >    downloading because it was slow, and generally very long-lived
> >    connections. The first time we studied this effect, when we hit our
> >    capacity ceiling, we actually thought it was a DDoS.
> > 
> > 6. Add support for multi-homed clients. Some version of Linux had a
> >    really horrible hashing algorithm for the firewall (of course we
> >    had the firewall in the test setup) and all connections from one
> >    client ended up in the same bucket. Even worse, some firewalls
> >    do load balancing (of course we had load balancing in the test
> >    setup) based on the source address without looking at the port,
> >    so all connections go to the same server.
> > 
> > 7. Do a DNS lookup for every connection (of course before you start
> >    making the connections) so that you get whatever DNS load balancing
> >    there is.
> > 
> > All HTTP benchmarks out there get some or all of this wrong. If
> > someone had gotten this right, we wouldn't have had to write our own tool.
> > 
> > //art
> > 
> > [1] http://preview.tinyurl.com/2eqenj
> > 
> 
> -- 
> Theodore Bullock, <tbullock at canada.com, tedbullock at gmail.com>
> B.Sc Software Engineering
> Bike Across Canada Adventure http://www.comlore.com/bike
> _______________________________________________
> httperf mailing list
> httperf at linux.hpl.hp.com
> http://www.hpl.hp.com/hosted/linux/mail-archives/httperf/
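
To put a bit of code behind the points above (Ted's FD-limit note and
Artur's numbered tips): the sketches below are untested C written from
memory, so please read them as illustrations rather than httperf patches.

On the FD limit: the usual pattern is to raise RLIMIT_NOFILE at startup
and then use a poll mechanism that isn't capped by FD_SETSIZE, which is
exactly what libevent's kqueue/epoll backends give you.

    #include <sys/resource.h>
    #include <stdio.h>

    /* Raise the per-process descriptor limit as far as the hard limit
     * allows.  select() is still capped by FD_SETSIZE; kqueue/epoll
     * (via libevent) are not. */
    static int raise_fd_limit(void)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_NOFILE, &rl) == -1)
            return -1;
        rl.rlim_cur = rl.rlim_max;      /* soft limit up to the hard limit */
        if (setrlimit(RLIMIT_NOFILE, &rl) == -1)
            return -1;
        printf("descriptor limit now %lu\n", (unsigned long)rl.rlim_cur);
        return 0;
    }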
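
For Artur's point 1, the idea is to render every request into one big
preallocated buffer before the timed run, so the measurement loop only
ever calls write().  A sketch; paths[], host and nreqs are stand-ins
for whatever the tool's configuration provides:

    #include <stdio.h>
    #include <stdlib.h>

    #define REQ_SLOT 512    /* fixed slot size per pre-rendered request */

    /* Render all requests up front; nothing in the timed loop should
     * touch malloc() or printf(). */
    char *prerender(const char *host, char **paths, size_t nreqs)
    {
        char *arena = malloc(nreqs * REQ_SLOT);
        size_t i;

        if (arena == NULL)
            return NULL;
        for (i = 0; i < nreqs; i++)
            snprintf(arena + i * REQ_SLOT, REQ_SLOT,
                "GET %s HTTP/1.1\r\nHost: %s\r\n"
                "Connection: close\r\n\r\n", paths[i], host);
        return arena;       /* request i lives at arena + i * REQ_SLOT */
    }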
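
Point 2 is the standard non-blocking connect() dance: set O_NONBLOCK,
expect connect() to fail with EINPROGRESS, and check SO_ERROR once the
socket becomes writable.  Roughly:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <unistd.h>

    /* Kick off a connect without blocking; returns the socket or -1. */
    int start_connect(const struct sockaddr *sa, socklen_t salen)
    {
        int fd = socket(sa->sa_family, SOCK_STREAM, 0);

        if (fd == -1)
            return -1;
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);
        if (connect(fd, sa, salen) == -1 && errno != EINPROGRESS) {
            close(fd);      /* immediate failure */
            return -1;
        }
        return fd;          /* in progress (or already connected) */
    }

    /* Call this once the event loop reports the socket writable. */
    int finish_connect(int fd)
    {
        int err = 0;
        socklen_t len = sizeof(err);

        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
            return -1;
        return err;         /* 0 means the connection is up */
    }

This is what lets the tool start hundreds of connects in the same event
loop iteration instead of paying one round-trip per connection.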
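
Point 3 is exactly what the libevent port buys us.  With the libevent
1.x API, watching a socket looks roughly like this; libevent picks the
best backend (kqueue on *BSD, epoll on Linux) at runtime:

    #include <event.h>

    static void sock_writable(int fd, short what, void *arg)
    {
        /* finish_connect(fd) here, then write() the pre-rendered request */
    }

    /* One event per in-flight connection. */
    void watch_socket(struct event *ev, int fd)
    {
        event_set(ev, fd, EV_WRITE, sock_writable, NULL);
        event_add(ev, NULL);        /* NULL = no timeout */
    }

    /* main() calls event_init() once, sets everything up, then calls
     * event_dispatch() to run the loop. */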
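
Point 4 in code; note that older glibc needs -lrt for clock_gettime():

    #include <time.h>

    /* Monotonic timestamps are immune to ntpd or date(1) stepping the
     * wall clock in the middle of a run. */
    double elapsed_seconds(const struct timespec *t0, const struct timespec *t1)
    {
        return (t1->tv_sec - t0->tv_sec) +
               (t1->tv_nsec - t0->tv_nsec) / 1e9;
    }

    /* usage:
     *     struct timespec t0, t1;
     *     clock_gettime(CLOCK_MONOTONIC, &t0);
     *     ... issue requests ...
     *     clock_gettime(CLOCK_MONOTONIC, &t1);
     *     printf("%.6f s\n", elapsed_seconds(&t0, &t1));
     */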
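
Point 6 only needs a bind() to one of several configured local
addresses before the connect(); src_ips[] here is a made-up
configuration array:

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <string.h>

    /* Rotate outgoing connections across the client box's local
     * addresses so firewall / load-balancer hashing on source IP
     * actually spreads the load. */
    int bind_source(int fd, const char **src_ips, size_t nsrc, size_t conn_no)
    {
        struct sockaddr_in local;

        memset(&local, 0, sizeof(local));
        local.sin_family = AF_INET;
        local.sin_port = 0;     /* let the kernel pick the port */
        inet_pton(AF_INET, src_ips[conn_no % nsrc], &local.sin_addr);
        return bind(fd, (struct sockaddr *)&local, sizeof(local));
    }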
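
And point 7: resolve with getaddrinfo() once per planned connection
(before the timed run, per point 1) and keep whatever address comes
back, so DNS round-robin actually gets exercised:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netdb.h>
    #include <string.h>

    /* One lookup per planned connection; keeps the first address the
     * resolver returns so its rotation/load balancing is honoured. */
    int resolve_target(const char *host, const char *port,
        struct sockaddr_storage *out, socklen_t *outlen)
    {
        struct addrinfo hints, *res;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;
        hints.ai_socktype = SOCK_STREAM;
        if (getaddrinfo(host, port, &hints, &res) != 0)
            return -1;
        memcpy(out, res->ai_addr, res->ai_addrlen);
        *outlen = res->ai_addrlen;
        freeaddrinfo(res);
        return 0;
    }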

-- 
Yusuf Goolamabbas
yusufg at outblaze.com

