Incredulity, Reverse Bias and Mainframe Linux

Both VeriTest (formerly ZDlabs) on Microsoft’s behalf and IBM have recently issued reports on running Ziff Davis Media’s NetBench performance benchmark on mainframe Linux. The results are not directly comparable because IBM used a dedicated 16-CPU z900, while VeriTest was restricted to a two-CPU partition on a z900.

Overall, however, it appears that Microsoft reports better mainframe performance than IBM does. The IBM z900 two-processor LPAR achieved 14 percent less performance at nearly 20 times the cost when compared with an ordinary server with two 900-MHz Intel Xeon processors running Windows Server 2003.

These results might seem surprising enough to those unfamiliar with mainframe Linux, but the truly astonishing thing here is Microsoft’s apparent reticence in presenting results that dramatically favor the Windows platform. Look at the report carefully, and you’ll see Microsoft suffering from fear, uncertainty and doubt as it confronts the reality of its own results.

Reports Themselves

Here are several key excerpts from the VeriTest report, with Mbps converted to MB/second to match IBM’s language:

  • “On the NetBench Enterprise DiskMix suite for testing file serving, the z900 only achieved 68.25 MB/second maximum throughput compared to 79 MB/second maximum throughput the Windows server achieved in the VeriTest study. z/VM, which is required to run multiple virtual Linux servers on the mainframe, exerts a heavy penalty on mainframe performance for file serving.”
  • “Overall, the highest NetBench results for Linux on z/VM were 52.125 MB/second throughput with four Linux server images and sixty clients. Additionally, the z900 started generating read errors on the clients after twenty server images were reached, resulting in the benchmark software dropping clients. At twenty server images with 96 clients, the maximum throughput achieved was 36 MB/second.”
  • “At 96 server images, and 95 clients with one dropped client, the mainframe achieved 24.87 MB/second maximum throughput. This means that the maximum average throughput per server at 96 servers was only 0.26 MB/second, and that it would take one of these server images 38.62 seconds to serve one 10 MB file.”
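
The per-server arithmetic in that last excerpt is easy to check. The sketch below is illustrative only: the variable names are my own, and the inputs are simply the figures quoted from the VeriTest report.

```python
# Figures quoted from the VeriTest report (96 server images on the 2-CPU LPAR).
aggregate_mb_per_s = 24.87   # maximum aggregate throughput, MB/second
server_images = 96
file_size_mb = 10.0

# Average throughput available to each server image.
per_server_mb_per_s = aggregate_mb_per_s / server_images

# Time for one image to serve a single 10 MB file at that average rate.
seconds_per_file = file_size_mb / per_server_mb_per_s

print(f"per-server throughput: {per_server_mb_per_s:.2f} MB/second")
print(f"time to serve 10 MB:   {seconds_per_file:.1f} seconds")
```

The division gives about 0.26 MB/second per image and a little under 39 seconds per file, in line with the report's numbers allowing for rounding.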

    And here are excerpts from the IBM sizing analysis for the 16-CPU 64-GB machine:

  • “Between 50-70 concurrently active servers with an aggregate peak throughput of 105.8 MB/second could be supported. Testing with a number of concurrently active servers beyond 70 resulted in lower throughput and longer response times.”
  • “With a single GBE OSA card, up to 25 guests with one concurrent request each and an aggregate throughput of 13.37 MB/second could be supported. Maximum OSA card throughput was reached between 12-15 SMB processes.”
  • “With a single guest server and a single OSA card, it was possible to support up to 30 concurrent users at an aggregate throughput of 19.4 MB/second.”
  • “The cost in throughput between the 2.4.17 kernel in a native LPAR versus running the timer change version of this kernel on z/VM is most significant with small numbers of guests.”

    Cost and Performance

    The Microsoft-sponsored report has cost information. IBM’s does not. In summary, VeriTest’s estimates for the annual cost of operating the mainframe partition as benchmarked are based on numbers from Gartner and range from about US$252,000 to about $480,000, depending on factors like customer licensing and alternate workload.

    According to the report, these annual costs range from 10 to 20 times that of the more powerful 900-MHz dual Xeon machine running Windows Server 2003.

    According to VeriTest, the best throughput results obtained from the 2-CPU z900 were 68.25 MB/second with one Linux instance running without zVM and 52.12 MB/second with four Linux instances running with zVM. Scale this linearly to the full 16-CPU system, and you would predict that the machine should offer about 546 MB/second native and 420 MB/second with perhaps 16 Linux instances under zVM.
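
For readers who want to reproduce that projection, here is a minimal sketch. The function name `scale_linearly` is hypothetical, and the assumption that throughput grows in direct proportion to CPU count is exactly the naive premise under examination, not a claim about how the z900 behaves.

```python
def scale_linearly(measured_mb_per_s: float, measured_cpus: int, target_cpus: int) -> float:
    """Naive projection: throughput grows in proportion to CPU count."""
    return measured_mb_per_s * target_cpus / measured_cpus


# Best results VeriTest reported for the 2-CPU partition, in MB/second.
native_2cpu = 68.25   # one Linux instance, no zVM
zvm_2cpu = 52.12      # four Linux instances under zVM

native_16cpu = scale_linearly(native_2cpu, measured_cpus=2, target_cpus=16)
zvm_16cpu = scale_linearly(zvm_2cpu, measured_cpus=2, target_cpus=16)

print(f"projected native throughput: {native_16cpu:.0f} MB/second")
print(f"projected zVM throughput:    {zvm_16cpu:.0f} MB/second")
```

The native projection is 546 MB/second; the zVM projection comes out near 417 MB/second, which rounds to the "about 420" used here.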

    IBM’s results are nowhere near that. Using a single guest server in an eight-CPU partition, IBM achieved only about 19.4 MB/second on a single OSA/Express Gigabit Ethernet card, and aggregate throughput maxed out at 105.8 MB/second when running zVM on the full machine with “between 50 and 70” concurrent Linux instances and all 10 OSA/GBE boards.

    Scaling Is the Answer

    The z900s max out at 64 GB of memory. For the Microsoft configuration to scale up linearly, you would need three times the system maximum.

    Furthermore, the z900s used in the test have 24 bidirectional 1 Gbps ports on the main board, each of which connects to one or more external devices like disk controllers and the OSA/GBE boards. VeriTest used two dual-channel OSA/GBE cards to connect to the network and two fibre channel connectors running to a single 1.2 TB ESS/Shark array, while IBM used 10 OSA/GBE cards and 32 controllers on eight ports. For the Microsoft configuration to scale up linearly with unimpeded data flow on each device, you would need 32 ports — eight more than the system maximum.
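
The port count can be worked through the same way. This is a rough sketch under one stated assumption: each of the two OSA/GBE cards and each of the two fibre channel connectors in the VeriTest partition occupies one main-board port, four in all. That reading is mine; it is the one consistent with the 32-port figure.

```python
Z900_MAX_PORTS = 24   # main-board ports on the z900, per the text

# ASSUMPTION: one main-board port per OSA/GBE card and per fibre channel connector.
osa_cards = 2
fibre_channel_connectors = 2
ports_in_partition = osa_cards + fibre_channel_connectors

# Scaling the 2-CPU partition up to the full 16-CPU machine.
scale_factor = 16 // 2
ports_needed = ports_in_partition * scale_factor
shortfall = ports_needed - Z900_MAX_PORTS

print(f"ports needed at full scale: {ports_needed}")
print(f"ports beyond the maximum:   {shortfall}")
```

Under that assumption the scaled-up configuration needs 32 ports, eight more than the machine provides.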

    In effect, VeriTest got better than expected zVM/Linux results because the partition it used dramatically overstates the scalability of the machine. As a corollary, this also has the effect of seriously understating the cost of the fully scaled-up zVM system and thus hurts Microsoft’s case on both cost and performance.

    Blank or a Blink, Bill?

    There’s a more subtle question here. You don’t have to be unusually cynical to expect that a report paid for by Microsoft and executed by a spin-off from an organization founded on hyping Microsoft products would tend to favor Microsoft. Why, then, did this report seem to somewhat exaggerate the real-world viability of mainframe Linux?

    I have two ideas: one related to beliefs about how operating systems and hardware condition tester behavior, and one related more specifically to the report at hand. The results shown by Microsoft (both the performance advantage a dual 900-MHz Xeon with Windows Server 2003 has over the “Turbo” mainframe and the 20-to-1 cost advantage to Wintel) contradict the idea that enterprise-scale “big iron” must be as powerful as it is expensive.

    It is possible, therefore, to guess that VeriTest produced some initial results that were simply disbelieved, resulting in the generosity shown in the revised document as ultimately published. An explanation based on a reverse bias arising from incredulity is supported by the fear, uncertainty and doubt that has to underlie this paragraph from the report:

    The META Group, an independent consultancy, has audited the [test] plan, the facility, the tests and the final report. META Group was asked to verify that the benchmark configuration and procedures were appropriate but was not asked to endorse the results one way or the other. Also, neither VeriTest nor Ziff Davis Media, who provided the PC Magazine benchmark test suites, were approached about endorsing the results. Based on IBM’s marketing, expectations going in to the project were that mainframe Linux would produce results at the higher end of Windows server performance. The results turned out quite the opposite.

    Not only is this hesitancy uncharacteristic of Microsoft, but it’s totally missing from the report’s
