September 8, 2006 4:00 AM PDT

Intel server revamp to follow AMD

Intel is getting ready to introduce a chip communications technology that mirrors an approach central to recent successes of rival Advanced Micro Devices.

If the newly competitive chips Intel has recently brought to market are the brains of a server, then the Common System Interface (CSI) is its nervous system. The technology, set for release in 2008, provides a new way for processors to communicate with each other and with the rest of a computer.

And alongside CSI, Intel plans to release an integrated memory controller, which is housed on the main processor rather than on a separate supporting chip. The company expects this to speed memory performance and dovetail with the new communications system.

Together, they could help Intel provide a much-needed counterpunch to AMD, which in 2003 introduced an integrated memory controller and a high-speed interconnect called HyperTransport in its Opteron and Athlon 64 processors. The two communication technologies, marketed together as "Direct Connect Architecture," deliver lower processor costs and chip performance advantages, which AMD has used to win a place in the designs of all of the big four server makers.

"Intel is hoping CSI will do for them in servers what 'CSI' did for CBS in ratings," said Insight 64 analyst Nathan Brookwood, referring to the hit TV series "CSI: Crime Scene Investigation."

Intel has been tight-lipped about CSI. However, Tom Kilroy, general manager of the company's Digital Enterprise Group, did confirm some details in a recent CNET News.com interview. Further glimpses have come from server makers, who are eager for CSI's debut in the "Tukwila" Itanium chip, due in 2008.

Tracking CSI
CSI brings two major changes. First, it will boost processor performance compared with Intel's current chip communication technology, the front-side bus.

"From a pure performance perspective, when we get to Tukwila and CSI, and we actually get some of the benefits of that protocol introduced into our systems, I think it's going to be really a big deal," said Rich Marcello, general manager of HP's Business Critical Server group.

CSI will be instrumental in helping double the performance of the Tukwila generation of servers, he noted.

Second, CSI will help Itanium server designers take advantage of mainstream Xeon server technology. Both chip families will use the interface, Kilroy said. That's particularly useful for companies such as Unisys, whose servers can use both processor types. It will make it possible for elements of a design to be used in both kinds of machine, reducing development costs and speeding development times.

"CSI allows us to continue to consolidate and standardize on fewer technologies," said Mark Feverston, Unisys' director of enterprise servers. "We can now go to a more common platform that allows us to build the same solutions in a more economical fashion."

CSI hasn't been easy to bring to market, though. In 2005, Intel dramatically altered the schedule for its introduction. Initially, the plan was for it to debut in 2007 with the Tukwila Itanium processor and the high-end "Whitefield" Xeon. But in October, Intel delayed Tukwila to 2008 and canceled Whitefield.

"Intel is hoping CSI will do for them in servers what 'CSI' did for CBS in ratings."
--Nathan Brookwood, analyst, Insight 64

Whitefield's replacement, "Tigerton," and a sequel called "Dunnington" both use the front-side bus for communications. That means CSI won't arrive in high-end Xeons until 2009.

In the meantime, Intel has used other methods to compete with AMD--speeding up the front-side bus and building in large amounts of cache memory, for example.

"We've taken a different road, but down the road we'll end up getting an integrated memory controller and CSI in our platform," Kilroy said. "It's just a matter of priority for us."

Why add CSI?
Memory communication speed is a major factor in computer design today. In particular, memory performance has not kept pace with processor performance, and the widening gap is causing problems. To compensate, computer designers have put special high-speed memory, called "cache," directly on the processor.

But in multiprocessor systems, cache poses a problem. If one processor changes a cache memory entry, but that change isn't reflected in the main memory, there's a risk that another processor might retrieve out-of-date information from that main memory. To keep caches synchronized--a requirement called "cache coherency"--processors must keep abreast of changes other processors make.
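
The stale-data risk is easy to see in a toy model. The sketch below is purely illustrative; it is not Intel's or AMD's actual coherency protocol, and the Processor class, addresses and values are invented for the example. Two processors keep private caches in front of a shared main memory: without any coherency step, the second processor reads stale data after the first one writes, while a simple write-invalidate step keeps the caches in sync.

# Toy cache-coherency sketch (illustrative only; not a real protocol).
class Processor:
    def __init__(self, name, memory):
        self.name = name
        self.memory = memory   # shared main memory (a dict)
        self.cache = {}        # private cache: address -> value

    def read(self, addr):
        if addr not in self.cache:            # cache miss: fetch from main memory
            self.cache[addr] = self.memory[addr]
        return self.cache[addr]               # cache hit: may be stale

    def write(self, addr, value, peers=()):
        self.cache[addr] = value              # update own cache
        self.memory[addr] = value             # write back to main memory
        for p in peers:                       # "coherency": invalidate stale copies
            p.cache.pop(addr, None)

memory = {0x10: 1}
cpu0, cpu1 = Processor("cpu0", memory), Processor("cpu1", memory)

cpu1.read(0x10)                    # cpu1 caches the old value, 1
cpu0.write(0x10, 2)                # no invalidation sent to cpu1
print(cpu1.read(0x10))             # prints 1 -- stale data

cpu0.write(0x10, 3, peers=[cpu1])  # write plus invalidation
print(cpu1.read(0x10))             # prints 3 -- caches stay coherent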

With Intel's current designs, an extra chip called the chipset coordinates such communications between processors via the front-side bus. In contrast, with HyperTransport and CSI, the processors communicate directly with each other.

Intel also relies on the chipset to help with communication between chips and main memory. But technology such as CSI, paired with the integrated memory controller, makes it easier for processors to communicate directly with memory: one processor can quickly retrieve data stored in memory connected to another chip.

"The biggest advantage CSI offers is performance and the fact that you basically get a direct connection between the processors. That results in reduced latency between the processors," said Craig Church, Unisys's director of server development. The integrated memory controllers, too, will reduce latency, or communication delays, when a chip is fetching data from its own memory, he added.

8 comments

Poor Cray
I think Jan Silverman's comments at the end of the article are a bit unwarranted. AMD has done their fair share of following too.

Cray has a contract with AMD, so they are not exactly a neutral source, and I know people tend to associate Cray with "supercomputers," but these days they hold only a 3.2% share (and falling) on the top500.org list. AMD systems make up three-quarters of that 3.2%.

Looking here: http://top500.org/stats/27 under "proc family," you can see that Intel now powers >60% of the systems on the top500 list.

Cray has every reason to skew Intel's image.
Posted by Dachi (797 comments)
AMD used to follow then Intel had to follow...
AMD, for the very reasons Silverman stated. Intel had a 32/64-bit project but scrapped it because they "knew" we all wanted Itanic instead of x86.
It turned out that they just flat-out bet on the wrong horse.
They "knew" we all wanted RAMBUS RDIMMs, so that's all they supported in their first P4 chipsets. Oops, they bet on the wrong horse again!
Even after AMD gained market share in multi-processor servers because they had HyperTransport, Intel "knew" we wanted the higher latency and wait states that come with its hub architecture. Oops, it looks like they are finally admitting that throwing cache at the problem is not the solution.
Also, the Intel spokesperson indicated that AMD didn't put more cache on their chips because they weren't at the same geometry yet. That's bunk; it's because they didn't need as much cache, since they weren't stuck in as many wait states waiting for the bus arbiter to let them talk.

I applaud Intel for finally figuring out that their FSB hub architecture was inferior in multi-processor systems, but just as in the preceding scenarios, they are late to the party.
This wasn't always the case, but they got "Not Invented Here" syndrome: if it wasn't invented there, then we (the consuming IT public) didn't need it. That is why AMD gained share and Intel lost share.
Also, Intel has always sold more processors than AMD, but that was never the issue; the issue was that AMD was gaining ground, Intel was starting to lose sales, and the informed IT public knew AMD's was the superior technology.
I will say that Intel has come a long way with the Core architecture for their single-processor systems; good for them.

But throwing 16MB of cache at a server processor is an admission in itself that their multi-processor architecture is nothing but a big bottleneck. What a waste of 65nm technology and die space. In the space that cache takes up, they could probably fit another two cores, L1 cache and arbitration logic.
Posted by fred dunn (793 comments)
big cache is still good
Locality of reference is a good thing, but it is harder to maintain when individual references consume more of the cache. To go to a 64-bit architecture with larger binaries and maintain similar levels of locality, the cache has to grow at the same rate as the binaries. Optimizations such as instruction bundling increase fetch sizes too, increasing the need for a larger cache to maintain locality.

Additional hardware, cores, etc. is a brute force mechanism for improving perf that typically comes with increased production cost, lower yields (due to complexity) and higher power consumption.

Frankly, I can't believe AMD has limped along with tiny little caches as long as they have. Once they transition their register file completely, they'll have to make this change. When they do, their codegen will be simpler and their perf will improve.
Posted by Hardrada (359 comments)