August 19, 2007 9:00 PM PDT

Chipmakers aim to unclog data paths

PALO ALTO, Calif.--If you are going to build processors with large numbers of cores, argues Anant Agarwal, you have to figure out how to connect them to each other, too.

A decade of research into that problem has resulted in Tilera, Agarwal's company, which has invented a 64-core processor with an embedded high-speed network that can pass up to 32 terabits of data a second between the various cores.

The company's Tile64--designed for networking equipment and video streaming servers--can provide 10 times the performance of an Intel Xeon chip while consuming far less power, or 40 times the performance of a digital signal processor from Texas Instruments, the company says.

And 64 cores is just the start.

Agarwal and other executives from the company will discuss the architecture further on Monday at the Hot Chips conference here at Stanford University. Researchers at Intel, IBM, Advanced Micro Devices, and the University of Texas, among others, also will present papers.

Promising chip companies with strong technical backgrounds rise and fade out on a regular basis in the semiconductor industry. Still, Tilera is trying to tackle one of the thorniest--and thus one of the potentially most lucrative--problems for computer designers today: slow, clogged data paths. Processor speed and transistor count have climbed at a rapid, steady pace for decades, but the buses and interconnects that carry data to and from processors get upgraded at a much slower rate.

HyperTransport, found inside processors from AMD, has probably been the most significant achievement in this regard in the last decade. HyperTransport accounted for a substantial percentage of the performance gains AMD achieved with the Athlon chip.

"The fundamental limitation of CPUs is no longer (core) performance but I/O (input/output)," Andy Bechtolsheim said in a presentation to reporters in June on Sun Microsystems' efforts in supercomputing. "You don't get more I/O just because you shrink the manufacturing process."

Sun has been working on a technology called proximity communication that allows different chips to talk to each other without wires by virtue of just being close. It's not ready yet.

Last September, Intel's Justin Rattner unveiled Intel's proposed answer: an 80-core chip in which the cores are linked through an embedded network.

The Intel chip is conceptually similar to the Tile64, Agarwal said. Intel, though, has given itself five years to come out with 80-core chips.

Tilera has already delivered samples to customers and will start shipping chips commercially in the fourth quarter. It has 12 customers including networking gear manufacturers 3Com and TopLayer.

Intel's 80-core chip, however, also contains Through Silicon Vias, which unclog the processor-to-memory pathways. The Tile64 employs conventional memory controllers.

Under the hood
Tilera's chips consist of small, individual building blocks, or tiles. Each tile sports a RISC processing core that runs at 600MHz to 1GHz as well as a switch that can send data in four directions: up, down, right and left. These switches form a mesh network, called iMesh, that lets the tiles communicate.
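The four-direction switch described above can be modeled as a two-dimensional grid. The sketch below is illustrative only: it assumes the 64 tiles sit in an 8-by-8 square and that data travels along a row first and then along a column. Neither the layout nor the routing rule comes from Tilera, and the names `GRID_SIDE`, `neighbors` and `route` are hypothetical.

```python
from typing import List, Tuple

Tile = Tuple[int, int]  # (column, row) position of a tile on the grid

GRID_SIDE = 8  # assumption: 64 tiles arranged as an 8 x 8 square


def neighbors(tile: Tile, side: int = GRID_SIDE) -> List[Tile]:
    """Return the tiles reachable in one hop: up, down, right and left."""
    x, y = tile
    candidates = [(x, y - 1), (x, y + 1), (x + 1, y), (x - 1, y)]
    # Tiles on an edge or corner have fewer than four neighbors.
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < side and 0 <= cy < side]


def route(src: Tile, dst: Tile) -> List[Tile]:
    """Build a hop-by-hop path, moving along the row first, then the column.

    This row-then-column rule is an assumption made for illustration,
    not a description of how iMesh actually routes data.
    """
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


if __name__ == "__main__":
    corner_to_corner = route((0, 0), (7, 7))
    print(len(corner_to_corner) - 1, "hops")
```

In this model a corner tile has two neighbors, an interior tile has four, and the longest trip, from one corner to the opposite corner, takes 14 hops.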

The mesh network itself is also divided up into five layers, depending on the type of transaction. One layer handles cache-to-cache transfers, while another handles streaming data.

Each tile contains two caches of memory for rapid data access. Although each tile has its own caches, a tile can also access the cache on any other tile, depending on how the chip is programmed.

Individual tiles consume a low 170 milliwatts to 300 milliwatts on average. Cores can also power down independently when not in use to cut power consumption.
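As a rough check on what those per-tile figures imply for a full chip, the sketch below multiplies them by the tile count. It is back-of-the-envelope arithmetic, assuming every tile is active and counting nothing but the tiles; the constant and function names are hypothetical.

```python
TILE_COUNT = 64      # tiles in the first product
MIN_TILE_MW = 170    # low end of the average per-tile power, in milliwatts
MAX_TILE_MW = 300    # high end of the average per-tile power, in milliwatts


def tile_power_range_mw(tiles: int = TILE_COUNT,
                        low_mw: int = MIN_TILE_MW,
                        high_mw: int = MAX_TILE_MW) -> tuple:
    """Return (low, high) combined power of all tiles, in milliwatts.

    Assumes every tile is running; memory controllers and other
    off-tile circuitry are not counted.
    """
    return tiles * low_mw, tiles * high_mw


if __name__ == "__main__":
    low, high = tile_power_range_mw()
    print(f"{low / 1000:.2f} W to {high / 1000:.2f} W")
    # prints "10.88 W to 19.20 W"
```

Under those assumptions, the 64 tiles together draw roughly 11 to 19 watts.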

The size of the chip, and its ultimate performance, depend on how many tiles are included. The first product will contain 64 tiles and a 5MB distributed cache. The company says it will come out with a less expensive 36-tile version next year and then a 120-tile version in, or close to, 2009. Tiles on a single chip can be grouped into virtual processors assigned to different computing tasks.
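One way to picture the grouping of tiles into virtual processors is as a map from each tile to the task that owns it. The sketch below is purely hypothetical: the 8-by-8 layout, the rectangular groups, the task names and the helpers `block` and `build_groups` are all assumptions for illustration, not Tilera's programming interface.

```python
from typing import Dict, Set, Tuple

Tile = Tuple[int, int]  # (column, row) position of a tile on the grid


def block(x0: int, y0: int, width: int, height: int) -> Set[Tile]:
    """Return the set of tiles in a rectangle with top-left corner (x0, y0)."""
    return {(x, y)
            for x in range(x0, x0 + width)
            for y in range(y0, y0 + height)}


def build_groups(regions: Dict[str, Set[Tile]], side: int = 8) -> Dict[Tile, str]:
    """Map each tile to the one group that owns it.

    Raises ValueError if a tile falls outside the grid or is claimed twice.
    The 8 x 8 default is an assumed layout for a 64-tile chip.
    """
    owner: Dict[Tile, str] = {}
    for name, tiles in regions.items():
        for tile in tiles:
            x, y = tile
            if not (0 <= x < side and 0 <= y < side):
                raise ValueError(f"tile {tile} is outside the {side}x{side} grid")
            if tile in owner:
                raise ValueError(f"tile {tile} is claimed by both "
                                 f"{owner[tile]} and {name}")
            owner[tile] = name
    return owner


# Hypothetical split: half the chip for one task, a quarter each for two others.
groups = build_groups({
    "packet_filter": block(0, 0, 8, 4),  # 32 tiles
    "video_stream": block(0, 4, 4, 4),   # 16 tiles
    "control": block(4, 4, 4, 4),        # 16 tiles
})
```

Here all 64 tiles end up owned by exactly one group, and an overlapping assignment is rejected rather than silently overwritten.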

Performance gains over conventional chips arise directly out of Tile64's design. A distributed network of slower processors can get jobs done quicker and with less overall energy than two or four larger, faster, more complex cores. Rather than powering a large bus, the chip can rely on shorter connections.

The Tile64 runs Linux and can be optimized for different applications.

Join the conversation!
but what about quantum microstructures?
My dog just pooped
Posted by cybervigilante (529 comments )
Future will be challenging
A whole new communication, data storage, and
software infrastructure will need to be created
to handle the data bandwidth.
Posted by grey_eminence (153 comments )
Tip of the Iceberg
And all of this is essentially done in two dimensions! Think of what happens to processor density when we can grow up and down as well as side to side.

Now the burden falls to the software jocks. We have two generations' worth of programmers who think sequentially, that one task follows another. There is a merry band of us who are forced to think in terms of multiple streams operating in real time. But even that ability is a pale imitation of the ingenuity we are going to have to conjure up to take advantage of the huge gift (and challenge) the chip designers are giving us.

Just wait until processors are thought of in terms we now apply to memory -- "How many tera-procs are you running"?
Posted by TomMariner (762 comments )