Microsoft Advertising for Linux?
"Disraeli was pretty close: actually, there are Lies, Damn lies, Statistics, Benchmarks, and Delivery dates."
(from fortune)
The recent tests done at ZDLabs turned up some interesting results.
They were of course presented under the banner "Windows NT is faster than
Linux", which was, strictly speaking, true. For now, it doesn't really matter
what the testing software was, or what the testing hardware was. I
don't really care, for the moment at least, how honest the test was.
I expect that it was at least somewhat honest, since some Red Hat people were
on the scene. I'm interested in how we interpret the results.
Now, there is the face value of the results:
that Windows NT is faster than Linux, thus better,
and hence that in any given situation NT is better to use than Linux.
There's also the option, of course, to actually look at what the tests found.
What are some of the actual facts that the tests came up with?
Here are some important ones that I found (pretty color graphs aside):
- With 4 CPUs and 1 Gig of RAM, NT & IIS achieved 4,166 http requests per second.
- With 4 CPUs and 1 Gig of RAM, Linux & Apache achieved 1,842 http requests per second.
- With 1 CPU and 256 MB RAM, NT & IIS achieved 1,863 http requests per second.
- With 1 CPU and 256 MB RAM, Linux & Apache achieved 1,314 http requests per second.
- Note: when I refer to a single-CPU Linux box or a 4-CPU Linux box, I will be talking
about the box that ZD used (i.e. with whatever processor speed etc. it had).
Linux looks pretty slow, doesn't it?
Who would use it for any real application?
Well, let's examine this situation a bit more than just comparatively.
First off, let's just look at an approximation of the situation that this represents:
- 1,842 hits/sec * 3600 sec/hour * 24 hours/day = 159,148,800 hits/day.
- 1,314 hits/sec * 3600 sec/hour * 24 hours/day = 113,529,600 hits/day.
So Linux/Apache should be able to
handle your site on a 4 CPU 1 Gig RAM box if you get 159
million hits per day or less. If you get only a measly 113 million
hits/day, then a single CPU box with 256 meg of RAM should
be able to host your site. Of course, this only works if
your traffic is spread perfectly evenly over the day, which is extremely unrealistic.
Let's assume that your busy times get ten times more hits
per second than your average hits/second. That means that a
single CPU Linux box with 256 meg of RAM should work for you
if you get about 11 million hits every day. Heck, let's be
more conservative. Let's say that your busy times get 100 times
more hits/second than your average hits/second. That means that
if you get 1.1 million hits per day or less, that same box will
serve your site just fine.
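The peak-versus-average reasoning above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the `peak_to_average_ratio` parameter are made up for this example, and the only measured inputs are ZD's hits/sec figures.

```python
SECONDS_PER_DAY = 3600 * 24

def sustainable_daily_hits(peak_hits_per_sec, peak_to_average_ratio=1):
    """Hits per day a box can absorb if its busiest second is
    peak_to_average_ratio times busier than its average second.

    peak_hits_per_sec is the benchmark figure (what the box can do flat out);
    the ratio is a guess about traffic shape, not something ZD measured.
    """
    return peak_hits_per_sec * SECONDS_PER_DAY / peak_to_average_ratio

# Single CPU, 256 MB Linux/Apache box from the ZD test: 1,314 hits/sec.
for ratio in (1, 10, 100):
    hits = sustainable_daily_hits(1314, ratio)
    print(f"peak = {ratio:>3}x average: {hits:,.0f} hits/day")
```

Changing the ratio is the whole point: the benchmark number stays fixed, and the guess about how bursty your traffic is does all the work.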
OK, there's that way of looking at it,
but it's not really a good way. It's a very coarse approximation
of access patterns and what a site needs.
Let's try another way of looking at this.
Let's do some simple calculations
to see what sort of bandwidth these numbers mean. Bandwidth is a
better and more stable way of determining who these numbers apply
to than guessed-at hit ratios.
The ZDNet page said that the
files served were of "varying sizes", so we'll have to make some
assumptions about the average size of the files being
served. Since over 1000 files were served per second
in all of the tests, it's pretty safe to work by averages. Some numbers:
- 1,842 hits/sec * 1 kilobyte/hit * 8192 bits/kilobyte = 15089664 bits/sec = 15 MBits/sec.
- 1,842 hits/sec * 2 kilobytes/hit * 8192 bits/kilobyte = 30179328 bits/sec = 30 MBits/sec.
- 1,842 hits/sec * 5 kilobytes/hit * 8192 bits/kilobyte = 75448320 bits/sec = 75 MBits/sec.
- 1,842 hits/sec * 10 kilobytes/hit * 8192 bits/kilobyte = 150896640 bits/sec = 150 MBits/sec.
- 1,842 hits/sec * 25 kilobytes/hit * 8192 bits/kilobyte = 377241600 bits/sec = 377 MBits/sec.
- 1,314 hits/sec * 1 kilobyte/hit * 8192 bits/kilobyte = 10764288 bits/sec = 10 MBits/sec.
- 1,314 hits/sec * 2 kilobytes/hit * 8192 bits/kilobyte = 21528576 bits/sec = 21 MBits/sec.
- 1,314 hits/sec * 5 kilobytes/hit * 8192 bits/kilobyte = 53821440 bits/sec = 53 MBits/sec.
- 1,314 hits/sec * 10 kilobytes/hit * 8192 bits/kilobyte = 107642880 bits/sec = 107 MBits/sec.
- 1,314 hits/sec * 25 kilobytes/hit * 8192 bits/kilobyte = 269107200 bits/sec = 269 MBits/sec.
For reference, a T1 line carries approximately 1.5 MBits/sec, and this
document is approximately 12k. The numbers above don't include
TCP/IP & HTTP overhead.
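The bandwidth figures above all come from one multiplication, which can be sketched as follows. The function name is invented for this example; the 8192 bits per kilobyte (1024 bytes * 8 bits) and the file sizes are the same assumptions used in the list, and protocol overhead is ignored here too.

```python
BITS_PER_KILOBYTE = 8192  # 1024 bytes * 8 bits, as in the list above

def required_bits_per_sec(hits_per_sec, avg_kilobytes_per_hit):
    """Raw payload bandwidth needed to sustain a hit rate (no TCP/IP or HTTP overhead)."""
    return hits_per_sec * avg_kilobytes_per_hit * BITS_PER_KILOBYTE

# 1,842 = 4 CPU Linux box, 1,314 = 1 CPU Linux box (ZD's figures).
for hits in (1842, 1314):
    for kilobytes in (1, 2, 5, 10, 25):
        bits = required_bits_per_sec(hits, kilobytes)
        # Truncate to whole MBits/sec, the way the list above does.
        print(f"{hits} hits/sec at {kilobytes} KB/hit: "
              f"{bits} bits/sec = {bits // 1_000_000} MBits/sec")
```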
Now, what does this tell us?
Well, that if you are serving up 1,314 pages per second where
the average page is only 1 kilobyte, you'll
need about 7 T1 lines or the equivalent (10 MBits/sec at 1.5 MBits/sec
per line) before the computer becomes
the limiting factor. What site on earth is going to be getting
a sustained >1000 hits per second for 1 kilobyte files? Certainly
not one with any graphics in it. Let's assume that you're running
a site with graphics in it and that your average file is 5 kilobytes -
not too conservative or too liberal. This means that if you're
serving up 1,314 of them a second, you'll need 53 MBits/sec of bandwidth.
And there are no peak issues here: you can't peak above your bandwidth.
Let's go at it another way, this time starting with our available bandwidth:
- 1 T1 Line * 1.5 MBits/T1 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/kilobyte = 183 hits/sec.
- 1 T1 Line * 1.5 MBits/T1 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/2 kilobytes = 92 hits/sec.
- 1 T1 Line * 1.5 MBits/T1 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/5 kilobytes = 37 hits/sec.
- 1 T1 Line * 1.5 MBits/T1 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/10 kilobytes = 18 hits/sec.
- 1 T1 Line * 1.5 MBits/T1 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/25 kilobytes = 7 hits/sec.
- 5 T1 Lines * 1.5 MBits/T1 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/kilobyte = 916 hits/sec.
- 5 T1 Lines * 1.5 MBits/T1 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/2 kilobytes = 458 hits/sec.
- 5 T1 Lines * 1.5 MBits/T1 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/5 kilobytes = 183 hits/sec.
- 5 T1 Lines * 1.5 MBits/T1 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/10 kilobytes = 92 hits/sec.
- 5 T1 Lines * 1.5 MBits/T1 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/25 kilobytes = 37 hits/sec.
- 1 T3 Line * 45 MBits/T3 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/kilobyte = 5,493 hits/sec.
- 1 T3 Line * 45 MBits/T3 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/2 kilobytes = 2,747 hits/sec.
- 1 T3 Line * 45 MBits/T3 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/5 kilobytes = 1,099 hits/sec.
- 1 T3 Line * 45 MBits/T3 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/10 kilobytes = 549 hits/sec.
- 1 T3 Line * 45 MBits/T3 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/25 kilobytes = 220 hits/sec.
- 1 OC3 Line * 155 MBits/OC3 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/kilobyte = 18,921 hits/sec.
- 1 OC3 Line * 155 MBits/OC3 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/2 kilobytes = 9,460 hits/sec.
- 1 OC3 Line * 155 MBits/OC3 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/5 kilobytes = 3,784 hits/sec.
- 1 OC3 Line * 155 MBits/OC3 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/10 kilobytes = 1,892 hits/sec.
- 1 OC3 Line * 155 MBits/OC3 * 1,000,000 bits/MBit * 1 kilobyte/8192 bits * 1 hit/25 kilobytes = 757 hits/sec.
note: these numbers don't include TCP/IP or HTTP overhead.
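The same list can be regenerated from the line speeds, which makes it easy to try other file sizes. This is a sketch under the list's own assumptions (1 MBit = 1,000,000 bits, 1 kilobyte = 8192 bits, no protocol overhead); the dictionary and function names are invented for this example, and results are rounded to the nearest whole hit.

```python
BITS_PER_KILOBYTE = 8192  # 1024 bytes * 8 bits

# Nominal line speeds in MBits/sec, as used in the list above.
LINE_MBITS = {"T1": 1.5, "T3": 45, "OC3": 155}

def max_hits_per_sec(line, line_count, avg_kilobytes_per_hit):
    """Hits per second that fit through line_count lines of the given type."""
    bits_per_sec = LINE_MBITS[line] * line_count * 1_000_000
    return bits_per_sec / (avg_kilobytes_per_hit * BITS_PER_KILOBYTE)

for line, count in (("T1", 1), ("T1", 5), ("T3", 1), ("OC3", 1)):
    for kilobytes in (1, 2, 5, 10, 25):
        hits = max_hits_per_sec(line, count, kilobytes)
        print(f"{count} x {line}, {kilobytes} KB/hit: {round(hits):,} hits/sec")
```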
I am assuming that the tests that
ZD made were meant to mean something, so I won't entertain the idea
that they used an average file size of less than 1K. Given that,
it is clear that the numbers that ZD's tests produced are only
significant when you have the equivalent bandwidth of more than 7 T1 lines.
Let's be clear about this: if you have only 5 T1 lines
or less, a single CPU Linux box with 256 MB RAM will wait on
your internet connection and not be able to serve up to
its full potential. Let me reemphasize this: ZD's tests prove that
a single CPU Linux box with 256 MB RAM running Apache will run
faster than your internet connection! Put another way,
if your site runs on 5 T1 lines or less, a single CPU Linux box with
256 MB RAM will more than fulfill your needs with CPU cycles left over.
That was just if the ZD numbers were
valid for files of only 1K in size. Let's assume that you either
(a) have pages with more than about a screen of text or
(b) have black and white pictures, so that your average file size is 5K, and
that ZD's tests accurately reflect this condition. Given this,
ZD's tests would indicate that a single CPU Linux box with only
256 MB RAM running Apache would be constantly waiting on
your T3 line. In other words, a single CPU Linux box with
256 MB RAM will serve your needs with room to grow
if your site is served by a T3 line or less.
One might also conclude that if you serve
things like color pictures (other than small buttons and doodads) and thus
your average file size is 25K, a single CPU Linux box with 256 MB RAM will
serve your site just fine even if you are served by an OC3 line that you
have all to yourself. I personally wouldn't bet that ZD's tests used
such large average file sizes, though. It was a benchmark, after all.
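The three scenarios above (5 T1 lines at 1K, a T3 at 5K, an OC3 at 25K) all ask the same question: does the line fill up before the server does? A minimal sketch of that comparison follows. The function names are hypothetical, and the assumptions are the ones used throughout: 8192 bits per kilobyte, 1,000,000 bits per MBit, no protocol overhead.

```python
BITS_PER_KILOBYTE = 8192  # 1024 bytes * 8 bits

def link_hits_per_sec(link_mbits_per_sec, avg_kilobytes_per_hit):
    """Hits per second that fit through the link, ignoring protocol overhead."""
    link_bits_per_sec = link_mbits_per_sec * 1_000_000
    return link_bits_per_sec / (avg_kilobytes_per_hit * BITS_PER_KILOBYTE)

def limiting_factor(server_hits_per_sec, link_mbits_per_sec, avg_kilobytes_per_hit):
    """Return "link" if the connection saturates before the server does, else "server"."""
    if link_hits_per_sec(link_mbits_per_sec, avg_kilobytes_per_hit) < server_hits_per_sec:
        return "link"
    return "server"

SINGLE_CPU_LINUX = 1314  # hits/sec ZD measured for the 1 CPU / 256 MB box

print(limiting_factor(SINGLE_CPU_LINUX, 5 * 1.5, 1))  # 5 T1 lines, 1K files
print(limiting_factor(SINGLE_CPU_LINUX, 45, 5))       # one T3, 5K files
print(limiting_factor(SINGLE_CPU_LINUX, 155, 25))     # one OC3, 25K files
```

Whenever the answer is "link", a faster web server buys you nothing until you buy more bandwidth.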
So far, I have been addressing only internet-based
web serving. I'm not really going to address intranets, but let me
ask you this: is it a good sign if your intranet is getting a sustained
1,300 hits per second? Assuming that each page view accounts for 10 hits
(the associated pictures, etc.) and that no employee views more than 1 page every
five seconds, that means that over any given five second interval, 650 of your
employees have viewed one of the pages on your intranet webserver. If this is really
the case and is some sort of sustained behavior, why are 650 of your employees looking
at your intranet webserver at any given 5 second point in time? Aren't they supposed
to be doing something more productive than surfing your intranet?
Now, I'm not saying that this is never going to happen,
but if you have enough employees to generate that sort of hit count, are you
really going to have them all on the same LAN, all going to the
same intranet webserver?
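The 650-employee figure follows directly from the two assumptions stated above (10 hits per page view, at most one page per employee every five seconds). A small sketch, with a made-up function name, lets you plug in your own guesses:

```python
def employees_browsing(hits_per_sec, hits_per_page_view, seconds_per_page):
    """Minimum number of people needed to generate a sustained hit rate,
    if each person loads at most one page every seconds_per_page seconds."""
    page_views_per_sec = hits_per_sec / hits_per_page_view
    return page_views_per_sec * seconds_per_page

# 1,300 hits/sec, 10 hits per page view, one page per employee every 5 seconds.
print(employees_browsing(1300, 10, 5))
```

Slower readers only make it worse: at one page every 30 seconds, the same hit rate needs six times as many people.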
So far I haven't mentioned Windows NT at all.
I mean this paper to be a piece of Linux advocacy and also a bit of realism.
Basic math skills can often be helpful in cutting through hype, which I think
has been shown so far. But I can't resist one last piece of information
that puts these tests in a light other than one damning to Linux and Apache.
If the ZD numbers are to be
believed about NT's performance, and I see no reason to disbelieve them, the NT
server that ZD tested should be able to serve 359.9 million hits per day. According to
http://www.microsoft.com/backstage/bkst_cs_supportonline.htm
Microsoft Support Online gets approximately 2.3 million page views a day.
Even supposing that each of those page views generates 100 hits, that would
mean that Microsoft Support Online gets only 230 million hits per day, far under what
the tested NT server can do. Theoretically, just one NT server like the one that ZD tested
should be able to handle this load.
Microsoft uses 6.
What's that I hear someone scream? "But Microsoft Support Online
involves dynamic content!" Well, the ZD test was only about static content. I'm so glad to know that it was
relevant to the real world. Aren't you?
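For anyone who wants to check the Microsoft Support Online comparison, here is the arithmetic as a short sketch. The inputs are the figures quoted above; the 100 hits per page view is the deliberately generous assumption from the text, and the variable names are mine.

```python
SECONDS_PER_DAY = 3600 * 24

NT_HITS_PER_SEC = 4166          # ZD's figure for the 4 CPU NT/IIS box
PAGE_VIEWS_PER_DAY = 2_300_000  # Microsoft Support Online, per Microsoft's page
HITS_PER_PAGE_VIEW = 100        # generous assumption, not a measured value

nt_daily_capacity = NT_HITS_PER_SEC * SECONDS_PER_DAY
support_daily_hits = PAGE_VIEWS_PER_DAY * HITS_PER_PAGE_VIEW
utilization = support_daily_hits / nt_daily_capacity

print(f"NT box capacity:     {nt_daily_capacity:,} hits/day")
print(f"Support Online load: {support_daily_hits:,} hits/day")
print(f"Share of one tested NT box: {utilization:.0%}")
```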
Appendix 1
Since I wrote this article, I received the following email message:
Here is a useful number that may help your now-famous web page out:
The mean file size in ZDNet's WebBench (you can download the benchmark
from their site, if you wish) 10342.3 bytes, or 10kb.
In other words, the "lesser" Linux box can handle a bandwidth of 107 Mbps,
and the quad box can handle a bandwidth of 150 Mbps.
- Sam (kiwi-6hu34ln@koala.samiam.org)
added @ 3:38 EST jun 28, 1999.
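Sam's figures round the mean file size down to 10 kilobytes. As a rough cross-check, the sketch below uses the exact mean he reports (10342.3 bytes) instead, assuming 8 bits per byte, 1,000,000 bits per MBit, and no protocol overhead. The function name is invented for this example.

```python
MEAN_FILE_BYTES = 10342.3  # mean WebBench file size reported in the email above

def payload_mbits_per_sec(hits_per_sec, mean_file_bytes=MEAN_FILE_BYTES):
    """Payload bandwidth in MBits/sec for a given hit rate (no TCP/IP or HTTP overhead)."""
    return hits_per_sec * mean_file_bytes * 8 / 1_000_000

for label, hits in (("1 CPU Linux box", 1314), ("4 CPU Linux box", 1842)):
    print(f"{label}: {payload_mbits_per_sec(hits):.1f} MBits/sec")
```

The results come out slightly above the 107 and 150 MBits/sec quoted in the email, so the conclusion is unchanged: either box can fill more than two T3 lines.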
Christopher Lansdown
lansdoct@cs.alfred.edu