[httperf] Benchmarking advice and thoughts
yusufg at outblaze.com
Sun May 27 20:07:49 PDT 2007
curl-loader is a load-testing tool that uses libevent/curl
> Here is a cross post from the openbsd mailing list.
> Interesting advice.
> After further investigating that really spiffy libevent library I am
> working on porting httperf to libevent.
> This moves most of the portability issues away from httperf, onto the
> libevent library where they more appropriately belong.
> Platforms that libevent has been ported to are: Linux, *BSD, Solaris,
> Mac OS X and unofficially HP-UX
> Along with the improvements to portability, the use of this library
> provides a very clean mechanism to work around the FD limit.
> Also, his comment on multi-homing is intriguing. SCTP is the only major
> protocol that I know of that supports native multi-homing. If only
> for the sake of keeping up with the times, I think that supporting SCTP
> in httperf is useful/important.... etc.
> http://tinyurl.com/226lmd (SCTP paper from www2006 - XHTML)
> http://www2006.org/programme/files/pdf/2035.pdf (Paper - pdf)
> http://www2006.org/programme/files/pdf/2035-slides.pdf (Slides - pdf)
> And a podcast on the topic
> Artur Grabowski wrote:
> > Ted Bullock <tbullock at canada.com> writes:
> >> Theo de Raadt wrote:
> >>> One very important part of the hackathon sub-project will be to
> >>> improve 10Gb support. Some of us believe that measuring the
> >>> performance of 10Gb networking later will help us spot some
> >>> performance problems that can improve 1Gb ethernet speed.
> >> As a side note, we recently released a new version of httperf that now
> >> builds cleanly on OpenBSD. It can be used to measure web system
> >> performance (including throughput).
> >> My documentation on the tool is available here
> >> http://www.comlore.com/httperf
> >> The official website is available here
> >> http://www.hpl.hp.com/research/linux/httperf/
> >> Hope it can be of some sort of use in identifying performance bottlenecks.
> > I'm sorry to say this, but last time we used httperf to optimize the
> > image servers on our site (third most page views in Sweden, see ),
> > we found that httperf measures the performance of the clients
> > running the benchmark more than the performance of the servers.
> > We wrote our own tool that does what we think is the right thing. The
> > idea was to make things maximally bad for the server to see how it
> > scales under load, rather than being nice, sending one request
> > at a time, and seeing how it performs under optimal conditions, which
> > is what httperf and most others do. When I'm back at the office (I'm
> > working in another country right now), I'll try to get permission from
> > my boss to release it under a free license. Until then, here are a few
> > tips. I don't know how much httperf has evolved since then, so keep
> > that in mind; some of these things may have been fixed, and some of the
> > issues were with other benchmark tools we tested, so not everything
> > necessarily applies to httperf.
> > 1. Don't do any memory allocations or "rendering" of the requests
> > while you're doing the requests. Just allocate a huge blob of
> > memory and make sure that all requests are ready to be pushed on
> > the wire before you start connecting to the servers. Otherwise
> > you'll measure the performance of malloc(), the VM system, the
> > page zeroing algorithm in the kernel and printf.
> > 2. Use asynchronous connect(). You do that by setting O_NONBLOCK on
> > the socket before connect(), then connect will return EINPROGRESS,
> > then after you receive a write event from select() on the socket
> > you can finish opening it. Otherwise you'll just do one connection
> > at a time and that's not really realistic. Starting 1000 connects at
> > the same time instead of waiting one round-trip for each connection
> > does a _lot_ to the performance you get, both positively and
> > negatively (depending on how hard you hit the server and how well
> > it scales). Mixing connects with writes and reads does even more evil
> > to the servers.
> > 3. Don't use select(). select() is slow and scales really badly. I
> > suggest you simply use libevent, since it provides a good
> > abstraction over the various other good interfaces that do the same
> > thing. On *BSD that would be kqueue(2); on Linux that would be
> > epoll(2).
> > 4. Don't use gettimeofday(). clock_gettime(CLOCK_MONOTONIC, ...) is
> > more accurate, has higher resolution, and doesn't suffer from
> > problems with the wall clock being changed under you.
> > 5. Try to emulate slow links (this is something we didn't do in the
> > tool, but rather in our test setup, since it was more realistic to
> > have the ACKs delayed for real, with real packet loss). The most
> > horrible load on our servers comes not during the normal peak
> > hours on record-traffic days, but rather on slow days during the
> > summer when everyone has gone on vacation, sits in their summer
> > house (vacations in Sweden are usually quite long and a lot of
> > people go to the country) connected with some crappy modem they
> > found in the attic, and tries to surf like usual, but on a link
> > that's 100 times slower than what they have at home or at the
> > office. This means lots of packet loss because of the long phone
> > lines out to the summer house, lots of connections where the phone
> > line drops, lots of pictures they haven't bothered to finish
> > downloading because it went slowly, and generally very long-lived
> > connections. The first time we saw this effect, when we hit our
> > capacity ceiling, we actually thought it was a DDoS.
> > 6. Add support for multi-homed clients. Some versions of Linux had a
> > really horrible hashing algorithm in the firewall (of course we
> > had a firewall in the test setup) and all connections from one
> > client ended up in the same bucket. Even worse, some firewalls
> > do load balancing (of course we had load balancing in the test
> > setup) based on the source address without looking at the port,
> > so all connections go to the same server.
> > 7. Do a DNS lookup for every connection (of course before you start
> > making the connections) so that you get whatever DNS load balancing
> > there is.
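A sketch of that per-connection lookup with getaddrinfo(3) (resolve_once is an invented helper; per the advice above, a real run resolves the target before each batch of connections rather than caching one answer for the whole benchmark):

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Resolve host:port freshly, so the benchmark sees whatever rotation
   the DNS load balancing hands out.  Fills *out with the first address
   and returns 0, or returns a getaddrinfo() error code. */
int resolve_once(const char *host, const char *port,
                 struct sockaddr_storage *out, socklen_t *outlen)
{
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;           /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    int rc = getaddrinfo(host, port, &hints, &res);
    if (rc != 0)
        return rc;

    memcpy(out, res->ai_addr, res->ai_addrlen);
    *outlen = res->ai_addrlen;
    freeaddrinfo(res);
    return 0;
}

int main(void)
{
    struct sockaddr_storage ss;
    socklen_t len = 0;
    /* "localhost" resolves without network access; a real run would
       name the server under test here. */
    int rc = resolve_once("localhost", "80", &ss, &len);
    printf("resolve_once: %s\n", rc == 0 ? "ok" : gai_strerror(rc));
    return 0;
}
```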
> > All http benchmarks out there get some or all of this wrong. If
> > someone had gotten this right, we wouldn't have had to write our own tool.
> > //art
> >  http://preview.tinyurl.com/2eqenj
> Theodore Bullock, <tbullock at canada.com, tedbullock at gmail.com>
> B.Sc Software Engineering
> Bike Across Canada Adventure http://www.comlore.com/bike
> httperf mailing list
> httperf at linux.hpl.hp.com