McNealy To Bet the Company on 'Corona' CPU
Apr 1, 2004 6:00 AM PT
The standard chip-making process, based on using lithography to etch circuits in silicon and other materials, has been in use since the mid-1970s, with change expressed mainly in increased manufacturing precision as decreases in the wavelengths used allowed the development of ever smaller components.
Pundits, of course, have been predicting the end of this shrinkage process for several years, but so far no limit has been reached on the ability to realize Moore's Law by periodically doubling the density of components etched on the chip. Indeed, most of today's advanced CPUs are made at 90 nanometers, with a ramp-up to 65-nanometer production processes underway and laboratories now demonstrating successful 10-nanometer processes.
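As a rough illustration of the arithmetic behind these shrinks, the sketch below assumes ideal scaling, in which component density grows with the inverse square of the feature size. That assumption and the function name are illustrative only; real processes do not scale this cleanly.

```python
def density_gain(old_nm: float, new_nm: float) -> float:
    """Ideal density multiplier when shrinking from old_nm to new_nm.

    Assumes the area taken by one component scales with the square of
    the feature size, which is a simplification.
    """
    if old_nm <= 0 or new_nm <= 0:
        raise ValueError("feature sizes must be positive")
    return (old_nm / new_nm) ** 2


# Process nodes named in the text: 90 nm, 65 nm and 10 nm.
for old, new in [(90, 65), (65, 10), (90, 10)]:
    gain = density_gain(old, new)
    print(f"{old} nm -> {new} nm: about {gain:.1f}x the density")
```

Under that assumption, the step from 90 to 65 nanometers works out to a little under one doubling, which is the sense in which each process generation roughly tracks Moore's Law.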
As early as 1987, however, work at IBM's Thomas J. Watson Research Center in New York demonstrated the feasibility of the opposite approach: using ion deposition to build the chip up from a substrate instead of cutting it into a surface laid down on that substrate.
Moving to the Sphere
In theory, that technology can be used to replace the usual flat-mat design with a three-dimensional spherical design in which all flow distances are minimized. Such a sphere would be honeycombed with cooling tunnels and use a single, high-bandwidth, network-style connector threaded through the sphere in lieu of the traditional edge connectors.
Deposition methods now exist that can build a chip quite literally one atom at a time with a theoretical density increase in the range of six orders of magnitude relative to X-ray lithography -- about 1,000 times more than the increase in component density from 1978's 8086 processor to today's P4E2. Despite this potential, however, two factors, one technical and one commercial, have kept theory from becoming reality.
The commercial factor is fairly simple: It is expected to take at least 10 years and a very large number of dollars to build a production-scale plant capable of volume output. Because that planning horizon exceeds current product-generation life, costs are largely unknown, and proven alternatives exist, no one has been willing to pioneer this technology for the realization of existing CPU designs despite the distance-cutting advantages of the spherical format.
The technical issue is far more complex and interlinked. The most difficult component has been that the Riemann equations describing interactions along the edges of the sub-nanoscale devices to be "grown" in this technology don't allow point solutions -- meaning that the information flows seem unpredictable mainly because, at that scale, quantum instability affects everything.
As a result, earlier designs have been limited to much larger, nanometer-scale devices, to which quantum considerations don't apply but which therefore don't offer enough of a performance gain to justify the additional manufacturing complexity.
Four years ago, however, a Russian mathematician, Igor Dimitrovich Turicheskiy, working at the MV Keldysh Institute of Applied Mathematics in Moscow, provided a breakthrough solution when he showed that the apparent unpredictability of flow directions at these edges could be resolved through relativity theory.
Although I don't begin to understand the math, his work apparently shows that the tendency of information flows crossing a quantum-scale device boundary to exit in a different order from the one in which they entered -- the so-called chaotic flow limit to quantum computing that stopped IBM's Josephson junction effort -- is actually a predictable consequence of relativistic distortions of their apparent crossing time.
Thus, knowing the order in which information flows are produced within such quantum assemblies, together with the electrical properties of the medium, allows the complete prediction of their arrival pattern at another component.
This, of course, strikes at the heart of current CPU design limitations, in which the time and voltage needed to drive electrons along internal connectors limit the physical size of the core and thereby give rise to attempts, like Sun's throughput-computing initiative, to bypass some of those limits through the use of multiple parallel cores.
Unfortunately, SMP-style "throughput computing" has its own downside: Memory requirements increase as a function of throughput. For example, Sun's present top end, the E25K, comes with up to 72 dual-core UltraSPARC-IV processors and needs up to half a terabyte of RAM to function efficiently.
For the "Jupiter" series planned around the future US-VI processor, that maximum will rise to 72 CPUs with up to 32 integrated cores each -- giving the machine the estimated throughput equivalent of a three-terahertz US3 machine but requiring something like 16 terabytes of RAM. With current memory technology, such a machine would need more than a mile of memory sockets -- a clear impracticality even in an ultradense packaging environment such as the one IBM plans for its Blue Gene series machines.
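To put the figures quoted above in perspective, here is a back-of-the-envelope sketch. The CPU, core and RAM totals come from the text; the module capacity and module length are assumptions chosen for illustration, and the helper names are hypothetical.

```python
# Figures taken from the text.
CPUS = 72
CORES_PER_CPU = 32
TOTAL_RAM_TB = 16

# Assumptions for illustration only (not from the text).
DIMM_CAPACITY_GB = 1       # hypothetical module size
DIMM_LENGTH_MM = 133.35    # nominal length of a full-size DIMM
METERS_PER_MILE = 1609.344


def total_cores(cpus: int, cores_per_cpu: int) -> int:
    """Number of cores in a fully populated machine."""
    return cpus * cores_per_cpu


def ram_per_core_gb(total_ram_tb: float, cores: int) -> float:
    """Average RAM per core, treating 1 TB as 1024 GB."""
    return total_ram_tb * 1024 / cores


def socket_run_miles(total_ram_tb: float, dimm_gb: float, dimm_mm: float) -> float:
    """Length of all modules laid end to end, in miles."""
    dimm_count = total_ram_tb * 1024 / dimm_gb
    return dimm_count * dimm_mm / 1000 / METERS_PER_MILE


cores = total_cores(CPUS, CORES_PER_CPU)
print(f"cores: {cores}")
print(f"RAM per core: {ram_per_core_gb(TOTAL_RAM_TB, cores):.1f} GB")
print(f"sockets end to end: "
      f"{socket_run_miles(TOTAL_RAM_TB, DIMM_CAPACITY_GB, DIMM_LENGTH_MM):.2f} miles")
```

With those assumed module dimensions, the end-to-end length does come out at more than a mile; larger modules would shorten it proportionally.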
Turicheskiy's mathematics offers a nearly miraculous "double whammy" solution to this problem. Not only would atomic scale system assembly enable Sun to place a full terabyte of memory directly within each core, but the time dilation effect experienced by data moving across those boundaries at very nearly light speed offers gigahertz multiplication as an apparently "free" side effect.
On the Edge of Paradox
This balances the mathematics of quantum interchange on the edge of paradox: A clock that registers 1 GHz internally appears to run at about 12.5 GHz when viewed from outside, providing a full order of magnitude in apparently "free" throughput improvement.
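For readers who want to check the 12.5-to-1 ratio, the sketch below inverts the standard Lorentz factor from special relativity to find the fraction of light speed at which the two clock rates would differ by that amount. It models only the size of the ratio, not the mechanism described here, and the function names are illustrative.

```python
import math


def lorentz_factor(beta: float) -> float:
    """Lorentz factor for a speed given as a fraction of light speed."""
    if not 0 <= beta < 1:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def beta_for_factor(gamma: float) -> float:
    """Fraction of light speed that yields the given Lorentz factor."""
    if gamma < 1:
        raise ValueError("gamma must be at least 1")
    return math.sqrt(1.0 - 1.0 / gamma ** 2)


INTERNAL_GHZ = 1.0    # from the text
EXTERNAL_GHZ = 12.5   # from the text

ratio = EXTERNAL_GHZ / INTERNAL_GHZ
beta = beta_for_factor(ratio)
print(f"clock ratio {ratio} corresponds to {beta:.4%} of light speed")
```

A ratio of 12.5 corresponds to a little under 99.7 percent of light speed, consistent with the phrase "very nearly light speed."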
Of course, in reality, physics does not allow for free energy, and such a machine would come to a stop about 12 seconds after start-up if the designers didn't provide an additional energy source. In this case, the free electrons needed will come from the use of a liquid superconductor as both coolant and network bus.
Circulated through the spherical CPU using pressure generated by a nanoscale Stirling engine powered by waste heat, this material remains electrically continuous and offers nearly infinite bandwidth as it flows through both the CPU and its external connectors to disk and network resources.
Sun Moves Toward Corona
At present, no one really knows what it will cost or how long it will take to develop the machines that will build the machines that will make these kinds of systems. Certainly DARPA's initial US$50 million contribution barely accounted for the cost of the design software needed for three-dimensional component layout.
The buzz among Sun board members is, nevertheless, that chairman and CEO Scott McNealy will ask the board to bet the company on this technology by approving what amounts to a blank check to partners Fujitsu and Texas Instruments for an attempt to build a laboratory prototype for a future Sun Chronosphere CPU series.
If, as seems unlikely given today's date, this rumor proves to be true, the effort will mark an enormous gamble for Sun and its always-mercurial chairman. But it could give the company control of the entire computing universe for eons to come.
Paul Murphy, a LinuxInsider columnist, wrote and published The Unix Guide to Defenestration. Murphy is a 20-year veteran of the IT consulting industry, specializing in Unix and Unix-related management issues.