Re: Pipelining and compression effect on HTTP/1.1 proxies
Fred Douglis (douglis@research.att.com)
Tue, 22 Apr 1997 09:58:39 -0400
> I have done a quick test on the content of our proxy cache: for each
> directory, I have compared the output of
>
> cat * | wc
> and
> cat * | gzip | wc
>
> which is not a very rigorous test (since files in the cache contain the
> HTTP header as well, and merging files before compression changes
> the results a little bit) but gives the idea.
>
> The total byte count is as follows:
>
> Uncompressed: 316.407.346
> Compressed: 274.892.797
>
> with a saving, due to compression, of approximately 13%. I suspect the
> actual use of compression would result in lower performance since
> most files are short and headers compress a lot, thus biasing my result
> toward better performance. These results can be explained by the fact
> that large material is generally already in compressed form at the
> source, hence the additional compression is ineffective.
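For reference, the quoted saving can be rechecked from the two byte counts. This is only a sketch: the function name saving_pct is made up for illustration, and awk does the floating-point arithmetic.

```shell
# saving_pct UNCOMPRESSED COMPRESSED
# Prints the relative saving as a percentage with one decimal place.
# (Hypothetical helper name; not from the original message.)
saving_pct() {
    awk -v u="$1" -v c="$2" 'BEGIN { printf "%.1f%%\n", (u - c) * 100 / u }'
}

# Totals reported above, written without thousands separators.
saving_pct 316407346 274892797
```

With the totals above this comes out at just over 13%, consistent with the quoted figure.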
Another way to look at this is that not only is "large" textual data, such as
PostScript, often compressed, but images are inherently compressed. Can you
tell us what fraction of files in your cache are content-type image/* (and the
like) as opposed to text?
In any case, I agree with your conclusion, in the sense that no matter what
the cause of the poor compression is, the end result is that compression will
only do so much.
An aside: does anyone know what the difference in compression will be between
cat * | gzip
and
for i in *; do gzip -c "$i"; done ?
My guess is that by glomming everything together you are getting better
compression than you would in practice, when each file is compressed
distinctly, due to the adaptive algorithms -- here you may use data from file
X to do a better job compressing Y.
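One way to measure the difference is sketched below. It assumes the directory holds only regular files and that gzip and wc are on the PATH; the function name compare_gzip is invented for illustration. Note that gzip's DEFLATE window is 32 KB, so data from file X can only help with Y when it lies within the previous 32 KB of the concatenated stream.

```shell
# compare_gzip DIR
# Prints the gzip'd size of DIR's files compressed as one stream and
# the total size when each file is compressed on its own.
# (Hypothetical helper; assumes DIR contains only regular files.)
compare_gzip() {
    dir=$1

    # One stream: later files can reuse matches from earlier ones.
    together=$(cat "$dir"/* | gzip -c | wc -c)

    # Separate streams: each file starts from scratch and also pays
    # for its own gzip header and trailer.
    separate=0
    for f in "$dir"/*; do
        n=$(gzip -c < "$f" | wc -c)
        separate=$((separate + n))
    done

    echo "together: $((together)) bytes"
    echo "separate: $((separate)) bytes"
}
```

Running compare_gzip on each cache directory and summing the two columns would show how much of the saving comes from merging the files rather than from compression itself.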
--
Fred Douglis MIME accepted douglis@research.att.com
AT&T Labs - Research 908 582-3633 (office)
600 Mountain Ave., Rm. 2B-105 908 582-3063 (fax)
Murray Hill, NJ 07974 http://www.research.att.com/~douglis/
As of 6/1/97:
AT&T Labs - Research
180 Park Ave, Room A181
Florham Park, NJ 07932-0971
973-360-8775 (office)
973-360-8871 (fax)