Personal computers have become much more reliable over the last 10 years or so, mostly due to the introduction of advanced operating systems with memory protection and hardware abstraction. The hardware itself has gotten better too; uncorrectable random errors are rare in PCs and extraordinarily rare in server-class systems.
These and other improvements have largely eliminated machine crashes. Blue-screen errors on Windows and kernel panics in Linux and Mac OS X still occur, but much more rarely.
Error-reporting services have become common, helping software developers figure out what went wrong. Most large developers now issue regular patches to fix newly discovered bugs, making systems more reliable between major releases.
All this progress is wonderful, of course, but our PCs still aren't reliable in the way that other consumer products are reliable. Machine crashes are still possible, and any bug can bring down an individual application.
Automobiles, for example, can fail in many ways, but they are still much more reliable than PCs. The risks associated with vehicle failures have been greatly reduced by decades of design refinements. Would you feel safe if PC technology controlled the steering and brakes in your car? Conversely, wouldn't you be more confident in your PC if you knew it was as reliable as your vehicle?
Can you rely on your system to display this 370-megapixel image?
(Credit: European Southern Observatory (ESO))
PCs are also fragile in response to change. I know I'm always a little nervous the first time I install a new device driver or run a new application. Even without software changes, opening an unusually large image can induce some trepidation. Consider this 370-megapixel image of the Lagoon Nebula available from the European Southern Observatory Web site; how confident are you that all of your image-viewing programs would survive the attempt to open it?
And worst of all, PCs are fragile in response to attack. The kinds of problems that are sometimes created accidentally by software bugs are relatively easy to create on purpose.
Minimizing the frequency and consequences of these problems would require tremendous effort from everyone in the industry. Almost every bit of PC hardware and software would have to change. One part of the solution is an extension of the same techniques that make today's PCs more reliable than older models: more hardware-based isolation of one function from another.
The minimal isolation of today's systems is very convenient for software developers, making it easier to write code and achieve high levels of performance. More isolation means more complexity and more overhead, but it improves reliability.
Developers are taking the first steps in this direction already, for example, with the process isolation features of the Microsoft Internet Explorer 8 and Google Chrome browsers. But there's much more that can be done.
Another way to improve reliability is to verify that data and addresses are consistent in range and format with the original intent of the software developer before they are used by the program. Making these checks in software can help; the incidence of failures related to accidental and deliberate buffer-overflow conditions has been dramatically reduced in this way. There's plenty of room for new hardware to help in this process too.
There's also work to be done in making it easier to recover from failures, since true hardware failures are inevitable. This is another area where some high-end systems are way ahead of the PC. Fault-tolerant machine architectures have been around for a long time in the aerospace industry, for example.
Historically, fault tolerance has never been practical on the PC because PCs always had only one of each critical subsystem: one processor, one bank of memory, one display channel. Today, PC processors and graphics chips have multiple cores and multiple memory interfaces, creating the potential for redundant operation where it's most needed.
Recoverability also implies backups--not just of the contents of disk drives, but even of the live data in memory through checkpointing. And disk backups can be improved too, by making the backup process an integral part of all disk I/O. Modern file systems use journaling to increase reliability; this technique can be extended to allow recovering from errors long after they occur.
There will be a heavy price to be paid in complexity and performance for all of these techniques, but the currency for this payment is transistors, and Moore's Law gives us more of those in every new process generation. We need to consider how we want to allocate these transistors. Over time, I believe reliability should account for an increasing portion of them.
In part 1 and part 2 of this series, I claimed that there is apparently a secret rule in the microprocessor industry that determines the success--or failure--of new chip designs.
The failures included RISC processors, media processors, and intelligent RAM chips, which all sank in spite of clearly demonstrable advantages over alternative solutions. The great success is the programmable graphics processing unit (GPU), which has succeeded in spite of the sometimes wrenching shifts in programming methods and PC system architecture that have been required to support it.
So what's the secret? Simply this: a factor-of-two advantage, even if it's an inherent, persistent advantage, isn't enough to unseat an incumbent solution in the face of even the mildest competitive disadvantage. Without a factor of 10--a full order of magnitude--a new product won't even get a foot in the door.
That's why I call this rule the "factor factor." It isn't enough to be a few times faster than the existing alternatives. Given the performance consequences of Moore's Law, it's easier for your potential customers to wait a few years rather than spend a few years adapting to your "issues." You need to be much faster than the products you're trying to replace. The target factor is 10--no less.
Sometimes, even a tenfold advantage isn't enough. One order of magnitude is enough to overcome one disadvantage, such as a change of programming methods. Add another simultaneous disadvantage, however, like the serious constraint in local memory capacity imposed by the IRAM concept, and the new technology may need a factor of 100 in performance to win a place in the market.
Overall, a new product must deliver net benefits amounting to as much as a full order of magnitude in cost, performance, or productivity to compensate for each significant disadvantage. That's just what it takes to motivate customers to deal with the problems rather than waiting for Moore's Law to speed up the solutions that are already familiar to them.
The introduction of the AMD64 instruction set by Advanced Micro Devices (also known as EM64T or "Intel 64" on Intel processors, or generically as x86-64) represents the ultimate success case for the factor factor.
AMD's Athlon 64 debuted the AMD64 instruction-set architecture.
(Credit: Advanced Micro Devices)
This isn't immediately clear, I suppose. Adopting the AMD64 standard required a lot of work by operating system vendors and software developers, and the performance benefit was relatively mild in most cases. But still, AMD64 was an immediate success because the performance benefit in certain applications--those that simply wouldn't fit into a 32-bit address space--was practically infinite.
Although the factor factor seems obvious--or at least it should--it's still at the heart of many failed products and hundreds of millions of dollars of wasted investments every year.
In Silicon Valley, as in other chip-design centers around the world, projects rarely fail because of poor execution. In most projects, the engineers are good at their jobs, the managers are good at coordinating their work, and the investment is sufficient to get the work done.
Most projects fail at the conceptual level, before the detail design work even begins. The factor factor is only one of many reasons for these failures, of course, but it's the one that disturbs me the most because it's the easiest to anticipate.
This rule doesn't apply to all products. When a new chip for an existing market is architecturally compatible with previous products, a factor-of-two performance improvement is plenty. Even smaller benefits can justify the costs of developing a new product if there are few, if any, disadvantages associated with it.
Multicore CPUs are one of these products, at least for now. Process technology makes it pretty easy to double core counts. Dual-core CPUs were almost a drop-in replacement for single-core chips and caused no serious problems. Quad-core chips were the same thing again. Eight-core CPUs may be a lesson in diminishing returns, but I'm sure they'll be commercially successful.
Beyond that, we'll have to see how it goes. The critical advantage of the CPU over the GPU is high performance on inherently serial processing tasks (what we sometimes call "single-threaded applications"). On a typical PC, there are rarely more than a few of these tasks running at any given moment. It's always useful to have a few extra cores available for parallel tasks, but at some point (I'm thinking somewhere around the 16-core level), PC buyers are likely to stop paying extra for more cores.
Even mighty Intel could find itself on the wrong side of the factor factor. Given that quad-core chips became a mainstream product just this year, we can expect to see 16-core processors for ordinary desktop PCs in 2013 and laptops in 2015 or so. By that time, the GPU could be the incumbent solution for high-performance parallel processing, and multicore CPUs could be the technology looking for compelling performance advantages.
So...now you know the supposed secret. When you hear about a radical new microprocessor architecture, you can do what I do: imagine the numeral "1" followed by a "0" for each drawback you see in the proposal. Compare that figure with the claimed benefits and you'll know which way to bet.
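As a rough sketch of that mental arithmetic, the rule can be written as a comparison between the claimed benefit and ten raised to the number of drawbacks. The function names here are invented for illustration, and the rule itself is a rule of thumb rather than a precise formula.

```python
def required_factor(drawbacks: int) -> int:
    """Net benefit the rule of thumb demands: a "1" followed by one "0" per drawback."""
    if drawbacks < 0:
        raise ValueError("drawbacks must be non-negative")
    return 10 ** drawbacks


def worth_betting_on(claimed_factor: float, drawbacks: int) -> bool:
    """True when the claimed benefit meets or exceeds the required factor."""
    return claimed_factor >= required_factor(drawbacks)


# A 2x advantage with one significant drawback falls short of the 10x bar.
print(worth_betting_on(2, 1))    # False
# Two simultaneous drawbacks raise the bar to 100x.
print(required_factor(2))        # 100
```

With zero drawbacks the required factor is 1, which matches the case of an architecturally compatible chip where even a modest improvement is enough.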
By the way, kudos to CNET users divisionbyzero and TrinityTrident, who proved my point that this rule isn't really a secret by explaining it in their comments to the previous posts in this three-part series.
Now if someone could only explain why so many companies don't seem to know this rule!
Listen carefully. I am about to reveal one of the great apparent secrets of the microprocessor industry. This secret largely determines whether new products succeed or fail.
I don't know why it seems to be a secret. It's simple enough. I figured it out early, in my first job in the industry, and I've seen it demonstrated over and over since then. I'm hardly the only one who knows this secret; I've seen dozens of talks that allude to it, and a few that mentioned it specifically. I've talked about it myself in articles I wrote for Microprocessor Report and other publications.
Unfortunately, I've also seen hundreds of products brought to market in apparent ignorance of this simple rule, and they've all failed, wasting the billions of dollars invested in their development. Assuming the developers weren't throwing away their money on purpose, I conclude they must not have known the one basic fact that doomed their projects, which means it must be a secret.
The secret is...
Ready for a 250-watt notebook? Intel is helping its OEMs to design such extremes.
A presentation at the Intel Developer Forum last week discussed how to build notebooks around the Core i7-920XM Extreme Edition mobile processor, code-named Clarksfield XE.
It turns out that when I estimated the maximum power consumption of a 920XM-based laptop at 80 watts to 100 watts, I was way off! (A typical notebook, by the way, averages somewhere between 40 and 90 watts.)
My estimate was reasonable for the kind of typical 920XM laptop I had in mind, but Intel showed how to go so far beyond "typical" that the resulting machine could need a 250-watt power brick.
I looked around, and the biggest power adapter I could find belongs to the Dell Alienware M17x, which needs a 210-watt brick. (I trust someone will tell me if there's a bigger one out there somewhere...Just leave a comment below.)
Intel promotes the Turbo Boost technology in its new Core i7 Mobile processors as a way to adapt to the needs of the software and get more performance from the chip, but this isn't the real reason the technology exists.
The new "Clarksfield" Core i7 Mobile processors introduced at the Intel Developer Forum last week are certainly very impressive. They're huge high-performance quad-core chips with Hyper-Threading, support for two channels of DDR3-1333 DRAM, and an on-die PCI Express controller for the fastest possible connection to discrete graphics chips.
Intel VP Mooly Eden shows off the new Core i7 Mobile processor and its companion I/O controller at the Intel Developer Forum.
(Credit: Intel)
In his IDF session announcing these parts, Intel Vice President Mooly Eden said the best of these parts, the 2GHz Core i7-920XM Extreme Edition, is "the fastest quad-core processor, the fastest dual-core processor, and the fastest single-core processor"--all in one chip.
The key to this dramatic claim is a feature called Turbo Boost technology. Basically, if the current application workload isn't keeping all four cores fully busy and pushing right up against the chip's TDP (Thermal Design Power) limit, Turbo Boost can increase the clock speed of each core individually to get more performance out of the chip.
It's easy to see how this works when just one or two cores are being actively used; whatever power the other two or three cores would have consumed can be redirected over to the active cores, allowing them to run at higher speeds.
The quad-core mode of Turbo Boost is a little more subtle; it works when the four cores aren't running a worst-case workload--for example, integer-heavy processing, since it's generally floating-point calculations that consume the most power--so they aren't bumping into the TDP limit. Turbo Boost can increase the frequency of all four cores until they're running as fast as they can for the current workload.
Eden said that the Turbo Boost controller ...
The mysteries of the Lynnfield and Jasper Forest die photos (from last week's post titled "Investigating Intel's Lynnfield mysteries") were all cleared up at the Intel Developer Forum last week, and as expected, there was nothing sinister going on--just some confusion in Intel's graphic arts department.
With help from the always-helpful George Alfs of Intel's press relations department and Intel vice president Mooly Eden (general manager of Intel's PC Client Group), we got everything straightened out. Literally!
Here's the die photo of Intel's Lynnfield chip from my previous post:
Die photo of the Core i5/Core i7 processor code-named Lynnfield, with labels.
(Credit: Intel)
This is the newest (shipping) part based on the Nehalem microarchitecture, differing from the earlier Bloomfield by the addition of an on-die PCI Express controller. Both chips are made in Intel's 45nm process technology.
According to Eden, the Lynnfield chip design is shared with several other Intel chips that will be on the market soon, including ...
I have a few questions to ask at this week's Intel Developer Forum....
Why is Intel using a more expensive chip for the new Core i5 and cheaper Core i7 processors? Why does this new chip--code-named Lynnfield--appear to have features Intel isn't using? What's the connection between Lynnfield and a future Intel chip code-named Jasper Forest?
These questions arose as I've been getting ready for IDF by reviewing recent press releases and news stories about Intel's current and forthcoming products, and chatting with fellow analysts about what we're looking forward to seeing there.
The recent announcements of the Core i5 and new Core i7 processors seemed pretty straightforward. Consider Brooke Crothers' piece on CNET: "Out with the old: Intel makes Core 'i' chips cheap." As Crothers explains, the facts are simple: the new Core i7 800-series slots in under the existing 900-series and replaces some older parts. The Core i5 is a new line, clearly positioned below the Core i7. Features, performance, and prices are all lower. That's as it should be.
But in looking at the coverage on some enthusiast sites, a fact jumped out at me. The Lynnfield chip is 12.5 percent larger than the Bloomfield chip used in the higher-priced Core i7 900-series processors (296 square mm vs. 263 square mm), in spite of the fact that Lynnfield only has two memory interfaces and no QuickPath Interconnect (QPI) link.
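The 12.5 percent figure follows directly from the two die areas quoted above. A quick check, using only those two numbers:

```python
lynnfield_mm2 = 296   # die area quoted for Lynnfield
bloomfield_mm2 = 263  # die area quoted for Bloomfield

# Relative size difference, expressed as a percentage of the smaller die.
percent_larger = (lynnfield_mm2 - bloomfield_mm2) / bloomfield_mm2 * 100
print(f"Lynnfield is {percent_larger:.1f} percent larger than Bloomfield")
```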
The big difference between the chips is the addition of 16 lanes of PCI Express on Lynnfield, but that's only about 80 pins plus the control logic. The changes should have roughly canceled each other out. Maybe one chip would be a little bigger than the other, but not by this much.
Much has been made lately about the trend toward solid-state drives. Now a new Intel technology, code-named Braidwood, may delay that trend, blending the performance of solid-state drives with the economy of old-style hard drives.
Braidwood--like its predecessor, Intel's Turbo Memory technology (formerly code-named Robson)--is basically a solid-state cache for all the disks in the system.
I heard about Braidwood earlier this summer on CNET (see "Intel 'Braidwood' chip targets snappier software" by Brooke Crothers). But I shrugged it off, assuming it would be no better than Turbo Memory, which left a bad taste in the mouth of many PC makers, end users, and Microsoft execs. Turbo Memory (and Turbo Memory 2.0) wasn't cheap, and it definitely wasn't worth the cost. The PC industry operates on such slim margins that every dollar's worth of hardware has to earn its keep--and Robson didn't.
But then I read an EE Times article this week by Mark LePedus describing a new report from Jim Handy of analyst firm Objective Analysis.
The 62-page report is titled "Intel's Braidwood: Death to SSDs?"
Handy's report argues persuasively that Braidwood might actually be worthwhile, and that got my attention. I've known him a long time, and he's a very good analyst--he's been covering memory and caching technology a lot longer than I have. He wrote one of the standard references for computer system architects, "The Cache Memory Book."
So I sent Handy a note, and he sent me a copy of the report. And now that I've read it, I'm inclined to agree with his conclusions, assuming the information he's obtained about Braidwood is accurate. It does seem reasonable, at least.
The first thing to understand is why flash memory can be a good disk cache. This boils down to its much faster access times: microseconds, not milliseconds. Flash can actually take much longer to write than a hard disk. But for reads, it's really quick. So if you can be smart about putting the right hard-disk data in the cache, especially by choosing the right time to do those write operations, you can save huge amounts of time on future disk reads.
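To see why read hit rate matters so much, here is a minimal sketch of the average read time for a disk fronted by a flash cache. The latencies and hit rates are illustrative assumptions chosen only to reflect the "microseconds, not milliseconds" contrast; they are not figures from Handy's report or from Intel.

```python
def average_read_ms(hit_rate: float, flash_ms: float, disk_ms: float) -> float:
    """Expected read latency when a fraction `hit_rate` of reads hit the flash cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * flash_ms + (1.0 - hit_rate) * disk_ms


FLASH_READ_MS = 0.1   # assumption: about 100 microseconds per flash read
DISK_READ_MS = 10.0   # assumption: about 10 milliseconds per hard-disk read

for hit_rate in (0.0, 0.5, 0.9):
    avg = average_read_ms(hit_rate, FLASH_READ_MS, DISK_READ_MS)
    print(f"hit rate {hit_rate:.0%}: average read {avg:.2f} ms")
```

Under these assumed numbers, the misses dominate the average, which is why choosing the right data to cache matters more than the raw speed of the flash.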
How would you like a single-chip microprocessor with more than four times the performance (on some applications) of Intel's best Core i7?
Then consider that up to 32 of these chips can be directly connected to form a single server, achieving four times the built-in scalability of Intel's next-generation Nehalem-EX processor.
That's IBM's widely anticipated Power7, which it described at last week's Hot Chips conference. But if you're interested, you'd better be prepared to spend a lot more than four times as much per chip. IBM isn't talking about pricing, but large Power servers can cost more than $10,000 per processor.
IBM's forthcoming Power7 server processor has eight cores, manages 32 threads, and includes 32MB of on-chip embedded DRAM cache. Power7 also has the highest levels of off-chip bandwidth ever achieved by a microprocessor.
(Credit: IBM)
What makes the Power7 so powerful? Each chip has eight cores, and each core supports four-way multithreading. There's 32MB of level-3 cache on the chip, made using embedded DRAM (eDRAM) cells. Most CPUs use SRAM for cache because it's generally easier to combine with high-performance logic, but DRAMs--with only one transistor per bit--offer compelling density advantages. IBM spent years developing a new kind of eDRAM that would work with SOI (silicon on insulator) manufacturing processes, and the Power7 is the most advanced product to use the new technology.
Interestingly, the Power7 cores run much more slowly than those in the Power6 processor, which I wrote about here in 2007 ("Live from Hot Chips 19: Session 1, IBM's Power6"). The Power6 was designed to run very fast using a long CPU pipeline in order to deliver the highest possible performance on each thread of execution.
Maybe that strategy didn't work out as well as IBM hoped, because the Power7 returns to a more traditional microarchitecture with a shorter pipeline and much lower clock rates--though IBM didn't say exactly what those rates would be.
IBM did, however, promise that the Power7 would be roughly four times as fast as the Power6, chip for chip. Since it has four times as many cores, each of the new slower-clocked cores must still deliver about as much performance as those in the previous generation.
Chip-level performance must always be matched by off-chip connections lest the incoming data or outgoing results be bottlenecked by a too-slow channel. Accordingly, the Power7 is equipped with eight I/O channels for DRAM, each of which connects to an off-chip buffering device that splits the channel into two 64-bit DRAM interfaces. Altogether, IBM says the Power7 has 180 GBps of DRAM interconnect that can sustain over 100 GBps of effective memory bandwidth.
There's another 50 GBps of peak I/O bandwidth and a staggering 360 GBps of peak bandwidth used to let each Power7 chip communicate with others. The DRAM connected to each chip is thus shared across larger systems.
Combining these figures, IBM says a single Power7 has 590 GBps of total off-chip bandwidth. This isn't the real number, since many of those bytes are used for error-correcting codes and other overhead, but it's still pretty impressive.
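The 590 GBps total is simply the sum of the three peak figures quoted above, as this short check shows (the dictionary labels are just descriptive names):

```python
# Peak off-chip bandwidth figures quoted for Power7, in GBps.
power7_peak_gbps = {
    "DRAM interconnect": 180,
    "I/O": 50,
    "chip-to-chip links": 360,
}

total_gbps = sum(power7_peak_gbps.values())
print(f"Total off-chip bandwidth: {total_gbps} GBps")  # 590
```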
So is Power7's die size: 567 square millimeters for 1.2 billion transistors. That's nearly a square inch! IBM says that if the 32MB L3 cache had been manufactured using SRAM, the transistor count would have been 2.7 billion instead.
Still, Power7 wasn't the only high-end chip talked about at Hot Chips.
Rainbow Falls, a record for core count
Sun Microsystems was there to describe its forthcoming Rainbow Falls chip, which I assume will be marketed as the UltraSparc T3. The chip has 16 cores, each of which is reportedly able to manage 8 threads.
Sun's primary Rainbow Falls presentation focused on details of the chip's internal and external interconnects; a second talk described the cryptographic coprocessors present in each of the chip's cores. These coprocessors--one for modular arithmetic (commonly used in public-key cryptography) and a cipher/hash unit to accelerate bulk ciphers like AES and secure hash algorithms--provide many times the performance of pure software implementations.
Fujitsu was also at Hot Chips to describe its eight-core, 2GHz Sparc64 VIIIfx processor, the latest in a long series of impressive designs from the company. Fujitsu quoted a peak performance figure of 128 GFLOPS (billions of floating-point operations per second) with a typical power consumption of just 58 watts. It did not, however, provide sustained performance or worst-case power consumption figures.
AMD, Intel vie for high-volume servers
Few of us will have direct exposure to the IBM, Sun, and Fujitsu chips. A pair of presentations from Advanced Micro Devices and Intel described products that will be much more widely available.
AMD launched its six-core Opteron processor code-named "Istanbul" earlier this year (see Brooke Crothers' coverage from June). Next year the company will begin shipping a new Opteron model currently code-named Magny-Cours (after a racetrack in France). Magny-Cours will consist of two Istanbul chips in a single package, with twice as many DRAM interfaces to support the new processor's increased performance.
AMD also teased the audience with another mention of a new processor core design that has been under development there for several years: "Bulldozer," which is now targeted at 32nm process technology. This new core will incorporate new x86 instruction-set extensions that will probably not be adopted by Intel (a strategy that reminds me of AMD's old 3DNow extensions).
But saving the best for last--best, that is, from the perspective of anticipated sales--Intel's talk on Nehalem-EX showed just how far Intel has been able to push the technology envelope for high-volume servers.
Nehalem-EX is an eight-core version of the existing quad-core Nehalem design. The new chip also has 24MB of L3 cache done in old-school SRAM. By my calculations, about 60 percent of the chip's 2.3 billion transistors are in this cache alone.
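The article doesn't show how the 60 percent estimate was made. One plausible back-of-the-envelope route, sketched here, assumes conventional six-transistor SRAM cells and, optionally, 8 error-correction bits per 64 data bits. Both are assumptions for illustration; tag arrays and other cache overhead are ignored.

```python
CACHE_BYTES = 24 * 2**20       # 24MB of L3 cache, from the article
TOTAL_TRANSISTORS = 2.3e9      # chip transistor count, from the article
TRANSISTORS_PER_BIT = 6        # assumption: six-transistor SRAM cell
ECC_OVERHEAD = 72 / 64         # assumption: 8 check bits per 64 data bits

data_bits = CACHE_BYTES * 8
raw_fraction = data_bits * TRANSISTORS_PER_BIT / TOTAL_TRANSISTORS
ecc_fraction = raw_fraction * ECC_OVERHEAD

print(f"Data cells only:   {raw_fraction:.1%}")   # about 52.5%
print(f"With assumed ECC:  {ecc_fraction:.1%}")   # about 59.1%
```

Under these assumptions the cache accounts for somewhere between half and three-fifths of the transistors, which is in the same range as the estimate above.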
Nehalem-EX provides four links to external DRAM buffer chips supporting two DDR3 DRAM interfaces each (much like the Power7 solution) and four QuickPath Interconnect links that provide direct "glueless" connections for up to eight-processor systems (64 cores, 128 threads). Intel is also working on an external Node Controller chip for systems with up to 2,048 Nehalem-EX processors.
The aggregate bandwidth numbers for Nehalem-EX aren't as mind-boggling as those for Power7, but they're still far beyond anything available for PC-architecture servers today. Based on the presentation, I estimate Nehalem-EX could boast over 85 GBps of peak memory bandwidth and 100 GBps of chip-to-chip bandwidth, some of which must be allocated to I/O.
I expect the raw number-crunching performance of the Nehalem-EX cores to be roughly on the same level as Power7's cores. The lower ratio of bandwidth to processing power for Nehalem-EX reflects a different design target, not a design shortfall--and most importantly, a much lower selling price. There will presumably be versions of Nehalem-EX priced similarly to existing Xeon MP products, which currently top out at $2,301 each in small volumes, but that's a very reasonable price to pay for the market's most advanced x86 server processor.
Last week, I attended a press event in Los Angeles hosted by Hewlett-Packard's workstation business unit. Hewlett-Packard was preparing for this week's announcement of three new Z-series workstation models: the Z400, Z600, and Z800.
HP briefed the reporters and analysts with all the key details of the products (the speeds and feeds, as we say), took us to visit a couple of HP's key customers in the area, and hosted presentations by software partners and more customers.
The new HP Z-Series workstations.
(Credit: Hewlett-Packard)
The workstations are very nice, especially the Z600 and Z800: high-quality dual-processor systems based on Intel's newest Xeon 5500-series processors with specific adaptations to distinguish them from ordinary PCs. Even the Z400, though based on a more basic PC-like design, uses a single Xeon processor and provides two 16-lane PCI Express Gen2 slots.
The customer visits were well chosen: one at BMW Designworks and another at DreamWorks, the movie studio that just released Monsters vs. Aliens.
BMW Designworks actually assisted with the industrial design of the new HP workstations. They're handsome machines, but not exactly pretty--certainly not in the way Apple's Mac Pro is.
More importantly, however, the HP-BMW design is functionally superior. In about the same case size as the Mac Pro, HP's Z800 has room for more RAM, more expansion cards, and more disk drives. BMW also worked handles into the design, and they work better than Apple's.
The difference in RAM is quite substantial. It isn't just the slot count (eight in the Mac Pro, twelve in the Z800); more important is that HP supports 16GB dual in-line memory modules (DIMMs), while Apple's machine goes only up to 4GB per slot. That's 192GB for the HP and 32GB for the Mac.
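The capacity figures are just slot count multiplied by the largest supported module. A small sketch (the function name is invented for illustration):

```python
def max_ram_gb(slots: int, gb_per_dimm: int) -> int:
    """Maximum memory when every slot holds the largest supported DIMM."""
    return slots * gb_per_dimm


z800_gb = max_ram_gb(slots=12, gb_per_dimm=16)
mac_pro_gb = max_ram_gb(slots=8, gb_per_dimm=4)

print(f"HP Z800: {z800_gb}GB, Mac Pro: {mac_pro_gb}GB")  # 192GB vs. 32GB
print(f"Ratio: {z800_gb // mac_pro_gb}x")                # 6x
```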
To be fair, HP is merely promising to offer 16GB DIMMs by the end of 2009; you can't get them today. Apple rarely preannounces anything, so it's possible that the Mac Pro will support more RAM by then, but HP's advantage in slot count should keep it on top.
More RAM can often give more performance than a faster CPU, especially in memory-hungry engineering applications. If the software overflows the physical memory and must start using virtual memory, performance can plummet.
These are very nice machines. But they're also expensive. The Z800 starts at less than $2,000 (actually a good bit cheaper than the Mac Pro's entry price), but most buyers will aim higher. In fact, it's no big deal to spend $10,000 or more on a high-end workstation.
Does that seem like a lot of money to spend on a PC for business use at a time when many businesses are struggling? Quite the opposite, I think.
The truth is, the cost of a superior PC is almost trivial, compared with the value it can generate in the hands of a highly skilled designer.
HP tried to make this point in its presentations at the event, but it was very conservative in its figures. First, it assumed that the total cost per employee (including salary, benefits, office space, management overhead, etc.) was just $60 per hour, which is very low. Second, it shouldn't have been using a cost model at all!
The more useful basis for this analysis is revenue per employee, which can easily exceed $250 per hour for the kind of workers who can make effective use of a high-price workstation.
For an employee generating this kind of value, a $10,000 workstation justifies its purchase remarkably quickly. Even if the employee's productivity improves just 10 percent, the payback period is a mere 10 weeks.
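The payback arithmetic can be sketched as follows. The 40-hour work week is an assumption (the text doesn't state one), and the function name is made up for this example.

```python
def payback_weeks(price: float, revenue_per_hour: float,
                  productivity_gain: float, hours_per_week: float = 40.0) -> float:
    """Weeks until the extra revenue from a productivity gain covers the purchase price."""
    extra_revenue_per_hour = revenue_per_hour * productivity_gain
    hours_to_pay_back = price / extra_revenue_per_hour
    return hours_to_pay_back / hours_per_week


# $10,000 workstation, $250 per hour of revenue, 10 percent productivity gain.
weeks = payback_weeks(price=10_000, revenue_per_hour=250, productivity_gain=0.10)
print(f"Payback period: about {weeks:.0f} weeks")
```

A 10 percent gain on $250 per hour is $25 per hour, so the machine pays for itself after 400 working hours, or about 10 weeks under the assumed schedule.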
It's worth thinking about what it takes to generate a 10 percent improvement in overall productivity. It isn't just a matter of computer performance, but performance helps. These new HP workstations are much faster than the older models, due to the combination of the faster CPUs, faster and more RAM, and a new generation of professional graphics cards from Nvidia and Advanced Micro Devices' ATI.
Performance relates to productivity, in terms of how much time the user spends waiting for the computer, so that's what to look for. Assuming that the software is working as well as it can, and the user's work habits are reasonable, processing delays for engineering visualizations, animation previews, circuit simulations, and similar tasks can really add up.
So it's no surprise to me that there's still a market for pricey dual-processor workstations.
What does surprise me is that there aren't more companies trying to rebuild the market for super high-end workstations.
SGI, in its glory days, used to be able to sell some pretty amazing machines for professional users. I have an SGI Octane workstation that originally sold for over $50,000. That seems like crazy money, but even a $50,000 workstation in the right hands could still pay for itself in less than a year, a reasonable return on investment.
Alas, SGI went bankrupt again this week and then promptly sold itself to Rackable Systems for $25 million plus the assumption of SGI's debts.
I'm sad that SGI is gone, but it wasn't the workstation business that killed the company, and the numbers show that market niche still exists. HP could occupy that niche, if it chose, as could any company that makes four- and eight-processor servers, which share most of the same engineering issues.
Some small companies, such as Boxx Technologies (which I wrote about last summer in "Boxx fills in for a failing SGI") and HPC Systems, make bigger workstations, but both of these vendors' product lines are stuck with AMD Opteron processors at the moment, which are no longer performance-competitive with the new Xeons.
Later this year, new multiprocessor-capable Xeon processors will arrive that could reinvigorate the super-workstation market, and I hope that some of these companies step up to the challenge. I believe that there's some good money to be made there, and the rest of the world economy will benefit at the same time.






