Grading the Windows-vs.-Unix Debate
It makes as little sense to compare a fully configured Sun 6900 against a dual Xeon server at 3.2 GHz as it does to compare the cost of service on a 10-year-old HP K-Class against that same dual Xeon running Linux. The thing to watch out for in these is carefully chosen interview subjects -- because it's easy to find clueless systems managers willing to say that Unix cost them more than Windows.
05/13/04 6:00 AM PT
My server has been getting a lot of hits lately from searches that look like the following:
184.108.40.206 - - [30/Apr/2004:13:44:24 -0400] "GET /article.html HTTP/1.1" 200 35654 "http://search.msn.com/results.aspx?q=windows+vs+unix+term+paper&FORM=SMCRT" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
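For readers who want to pull the search phrase out of a line like this one, here is a minimal Python sketch. It assumes the Apache "combined" log format and that the search engine carries the phrase in a `q` parameter, as the MSN referrer above does. The names `COMBINED` and `search_terms` are illustrative only and do not come from any tool mentioned in this column.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def search_terms(log_line, param="q"):
    """Return the search phrase found in the referrer's query string, or None."""
    match = COMBINED.match(log_line)
    if match is None:
        return None  # line does not follow the assumed format
    query = urlsplit(match.group("referer")).query
    values = parse_qs(query).get(param)  # parse_qs turns '+' into spaces
    return values[0] if values else None


# The log line quoted above, split across string literals for readability.
line = (
    '184.108.40.206 - - [30/Apr/2004:13:44:24 -0400] '
    '"GET /article.html HTTP/1.1" 200 35654 '
    '"http://search.msn.com/results.aspx?q=windows+vs+unix+term+paper&FORM=SMCRT" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"'
)

print(search_terms(line))  # windows vs unix term paper
```

Other search engines may use a different parameter name, which is why `param` is left as an argument rather than hard-coded.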
On the one hand, I suppose this reflects increased academic interest in Unix. On the other, it does make me wonder about the marking keys used to judge a student's success in differentiating the two environments. As a public service, therefore, I thought I'd provide an overview of what to look for -- and what to avoid -- in Unix-vs.-Windows comparisons.
Because there are an awful lot of Internet-accessible documents that offer, or purport to offer, some contribution to this discussion, it's important to start with some basic rules for winnowing out the most obviously dishonest.
The single most common indicator of dishonest comparisons is the presence of important but unannounced anachronisms -- meaning that technologies, costs or ideas from one time frame are being compared to those from another as if they belonged together.
The most pervasive and subtle form of this is usually expressed in studies nominally reporting, with every appearance of utter objectivity, the result of a survey or other interview process.
IDC, for example, has issued a series of reports in which the authors are able to conclude that Unix (as Linux) is much cheaper than Unix (mainly as HP-UX and Solaris) by comparing the results obtained from two series of interviews: one among older IT managers whose cost experience reflects their use of RISC-Unix ERP servers bought in the early 1990s, and the other among technology-savvy enthusiasts mainly running Apache on x86-Linux.
In its less-subtle forms, the failure to maintain cotemporaneity between the sides being compared can lead to much more obvious absurdities. If, for example, you find someone denouncing Windows 2003 Server because it still can't even match 1975's Sixth Edition Unix with respect to something as basic as manipulating a Bell 103 modem via a DEC DL-11W TTY controller, then you should forward the article to Information Week for stop-press publication but omit it from further consideration yourself.
The second-largest pile of discardable studies consists of those whose conclusions derive from a single "big lie" -- an absurd assertion treated as unquestionable truth. This is the foundation, for example, of much of the third-party work offered by IBM in support of mainframe Linux. Years of exaggeration about mainframe performance engendered a pervasive belief, and that belief set the expectations that led to Microsoft's apparent embarrassment at having to call IBM on z/VM performance [Paul Murphy, "Incredulity, Reverse Bias and Mainframe Linux," LinuxInsider, October 2, 2003].
The strength of this approach lies in the emotional commitment people have to maintaining the falsehood, since they'll loudly support almost any conclusion built on such a shared faith.
The Brutally Naive
Sometimes, of course, this approach can be so egregious that only the most brutally naive are taken in. If, for example, you come across papers arguing that code reviewed in secret by a few people who make their money selling it is more likely to prove bulletproof in use than code reviewed by thousands of independent experts, you might want to ask Green Hills Software for a finder's fee, but not waste time reading them yourself.
Once you drop analyses based either on big lies or on mismatched time frames, you can usefully group the remainder according to whether their primary focus is on differences in technology, cost or strategic consequences.
In many ways, the technology-focused comparisons are hardest to evaluate because you need to understand the histories and technologies on both sides of the discussion to decide what weight to give the author's arguments. This is particularly true, of course, for implementation issues where the same terms hardly ever mean the same things to both camps. For example, most of the comparative literature on kernel design is applicable to the Windows-vs.-Unix debate only if you can first track the changes in the meanings attributed to nominally well-understood terms like "thread" both over time and between camps.
Getting one level away from the detailed technology can alleviate this problem and allow some technology issues to be usefully discussed in the Windows-vs.-Linux context without too much risk of the vocabulary leading to false assumptions or misleading parallelisms.
For example, studies comparing the technical consequences of the single-user, single-process usage assumptions built into Microsoft's use of the Windows Registry against those of the multi-user, multiprocessing assumptions behind Unix's use of startup scripts and user-accessible application-configuration files can provide valuable insight to people deciding for the first time whether to go with Linux or BSD versus Windows.
Similarly, comparing the security implications of decisions to use Unix applications versus competitive Windows software can be both topical and technically interesting while offering real contributions to the Microsoft-vs.-open-source decision that go beyond simple technical analyses. For example, comparisons pitting Microsoft's peremptory approach to change against the somewhat austere consistency Sun imposes on the Java application model -- or the generally joyous abandon with which the PHP development community flings around ideas and implementations -- can contribute on both factual and judgmental levels to a strategic tools decision.
It's very important, however, in reading such comparisons, to distinguish genuine applications issues from those affecting either the operating system or the hardware. On the software side, for example, the SANS Institute gets lots of publicity for its annual top 10 lists matching any Unix application vulnerability, whether that application also runs on Windows or not, against a sample of Microsoft operating system exploits.
Similarly, on the hardware side, the PC security industry spends a lot of time finding and advertising buffer overflows in Unix utilities or applications without ever informing the public that actual exploits depend on design weaknesses in the x86 CPU architecture -- and are therefore almost impossible on SPARC or PowerPC hardware. That is why you read about a lot of such bugs being found but hardly ever about real Mac or Solaris users being affected by them.
Largest Group of Comparisons
The single largest group of Windows-vs.-Linux comparisons makes its case on cost. If you eschew contributions from the flat-Earth society -- which consists of consultants and others paid to find and present significantly distorted or one-sided cost "studies" -- you'll find the remainder can be grouped mainly by the depth or extent of their measures of cost [Paul Murphy, "Getting the Facts About Windows and Linux," LinuxInsider, February 5, 2004].
In the simplest cases, cost measures are restricted to the cost of acquisition, but even these are subject to serious distortions the reader needs to be aware of. It's possible, for example, to claim the marginal cost of a Windows Server 2003 license is zero to an academic institution with a site license and then compare that zero to the first copy of a fully supported Red Hat Enterprise suite at around US$4,000.
More commonly, however, such comparisons are included in "big lie" papers in which the authors silently assume the functional equivalence of a fully configured and supported Mac and a stripped, unsupported white-box PC to show that a $379 no-name without software (or a monitor) is less expensive than a $799 eMac with everything.
In more sophisticated cost studies, some lifetime operational costs are taken into account. These can be valuable if the cost information is supported and applies to products and services reasonably considered to be comparable in terms of both performance and cotemporaneity.
It makes as little sense, however, to compare a fully configured Sun 6900 against a dual Xeon server at 3.2 GHz as it does to compare the cost of service on a 10-year-old HP K-Class against that same dual Xeon running Linux. In general, the thing to watch out for in these is carefully chosen interview subjects -- because it's easy to find clueless systems managers willing to say that since Unix cost them more than Windows, it has to be generally more expensive.
What I consider the most sophisticated approach -- possibly because I've often tried to do this -- extends and combines both cost and technology arguments to look at the strategic consequences of the Windows and Unix decisions for entire organizations.
The trouble with these -- and thus the thing to be most cynical about when reading them -- is that they tend to reflect "gedanken experiments" because real measurements have never been done. In next week's column, for example, I plan to compare the management focus needed to succeed with the Unix business architecture (centralized servers, smart-display desktops, decentralized management and formal support for Linux and BSD at home) to that needed to succeed with the traditional data center, whether implemented with Wintel or the older mainframe technologies.
But you should be aware that the discussion is based on personal experience, not independent, quantified analysis of real operational data from a large number of organizations.
Paul Murphy, a LinuxInsider columnist, wrote and published The Unix Guide to Defenestration. Murphy is a 20-year veteran of the IT consulting industry, specializing in Unix and Unix-related management issues.