August 19, 2007 9:00 PM PDT
Chipmakers aim to unclog data paths
A decade of research into that problem has resulted in Tilera, Agarwal's company, which has invented a 64-core processor with an embedded high-speed network that can pass up to 32 terabits of data a second between the various cores.
The company's Tile64--designed for networking equipment and video streaming servers--can provide 10 times the performance of an Intel Xeon chip while consuming far less power, or 40 times the performance of a digital signal processor from Texas Instruments, the company says.
And 64 cores is just the start.
Agarwal and other executives from the company will discuss the architecture further on Monday at the Hot Chips conference here at Stanford University. Researchers at Intel, IBM, Advanced Micro Devices, and the University of Texas, among others, also will present papers.
Promising chip companies with strong technical backgrounds rise and fade out on a regular basis in the semiconductor industry. Still, Tilera is trying to tackle one of the thorniest--and thus one of the potentially most lucrative--problems for computer designers today: slow, clogged data paths. Processor speed and transistor count have climbed at a rapid, steady pace for decades, but the buses and interconnects between them get upgraded at a much slower rate.
HyperTransport, found inside processors from AMD, has probably been the most significant achievement in this regard in the last decade. HyperTransport accounted for a substantial percentage of the performance gains AMD achieved with the Athlon chip.
"The fundamental limitation of CPUs is no longer (core) performance but I/O (input/output)," Andy Bechtolsheim said in a presentation to reporters in June on Sun Microsystems' efforts in supercomputing. "You don't get more I/O just because you shrink the manufacturing process."
Sun has been working on a technology called proximity communication that allows different chips to talk to each other without wires by virtue of just being close. It's not ready yet.
Last September, Intel's Justin Rattner unveiled Intel's proposed answer: an 80-core chip in which the cores are linked through an embedded network.
The Intel chip is conceptually similar to the Tile64, Agarwal said. Intel, though, has given itself five years to come out with 80-core chips.
Tilera has already delivered samples to customers and will start shipping chips commercially in the fourth quarter. It has 12 customers including networking gear manufacturers 3Com and TopLayer.
Intel's 80-core chip, however, also contains Through Silicon Vias, which unclog the processor-to-memory pathways. The Tile64 employs conventional memory controllers.
Under the hood
Tilera's chips consist of small, individual building blocks, or tiles. Each tile sports a RISC processing core that runs at 600MHz to 1GHz as well as a switch that can send data in four directions: up, down, right and left. These switches form a mesh network, called iMesh, that lets the chips communicate.
The mesh network itself is also divided up into five layers, depending on the type of transaction. One layer handles cache-to-cache transfers, while another handles streaming data.
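To make the four-direction switching concrete, here is a minimal sketch of how a message might hop from tile to tile across such a mesh. It is not Tilera's routing algorithm. It assumes the 64 tiles are laid out as an 8-by-8 grid and that a message travels along its row first and then along its column, a common textbook scheme; the names `GRID`, `xy_route` and `hops` are hypothetical.

```python
from typing import List, Tuple

# Assumption: 64 tiles arranged as an 8x8 square. The article gives only
# the tile count, not the layout.
GRID = 8

Tile = Tuple[int, int]  # (row, column)


def xy_route(src: Tile, dst: Tile) -> List[Tile]:
    """List the tiles a message visits going from src to dst.

    Hypothetical routing rule: move left/right along the row until the
    column matches, then move up/down until the row matches. Each step
    goes to one of the four neighboring tiles.
    """
    for row, col in (src, dst):
        if not (0 <= row < GRID and 0 <= col < GRID):
            raise ValueError(f"tile {(row, col)} is outside the {GRID}x{GRID} grid")

    row, col = src
    path = [src]
    while col != dst[1]:          # horizontal leg: right or left
        col += 1 if dst[1] > col else -1
        path.append((row, col))
    while row != dst[0]:          # vertical leg: down or up
        row += 1 if dst[0] > row else -1
        path.append((row, col))
    return path


def hops(src: Tile, dst: Tile) -> int:
    """Number of switch-to-switch steps between two tiles."""
    return len(xy_route(src, dst)) - 1


if __name__ == "__main__":
    print(xy_route((0, 0), (1, 2)))
    print("corner to corner:", hops((0, 0), (7, 7)), "hops")
```

Under these assumptions, the longest trip on the grid, from one corner to the opposite corner, takes 14 hops, while neighboring tiles are a single hop apart.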
Each tile contains two caches of memory for rapid data access. Although each tile has its own caches, a tile can also access the caches of the other tiles, depending on how the chip is programmed.
Individual tiles consume a low 170 milliwatts to 300 milliwatts on average. Cores also power up and down independently when not in use to cut power consumption.
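As a rough back-of-the-envelope check, the per-tile figures can be multiplied out to a whole chip. The sketch below assumes that every tile is active and that total power is simply the tile count times the per-tile average; it ignores memory controllers, I/O and any tiles that are powered down, so it is an estimate and not a figure from the company. The function name `chip_power_range_watts` is hypothetical.

```python
from typing import Tuple

# Per-tile average power range quoted in the article, in milliwatts.
TILE_POWER_MW = (170, 300)


def chip_power_range_watts(tiles: int) -> Tuple[float, float]:
    """Estimate (low, high) power in watts for a chip with `tiles` active tiles.

    Assumption: power scales linearly with tile count and nothing else
    on the chip draws power. Real chips will differ.
    """
    if tiles < 0:
        raise ValueError("tile count cannot be negative")
    low_mw, high_mw = TILE_POWER_MW
    return tiles * low_mw / 1000, tiles * high_mw / 1000


if __name__ == "__main__":
    low, high = chip_power_range_watts(64)
    print(f"64 tiles: {low:.2f} W to {high:.2f} W")
```

For the 64-tile chip, that works out to roughly 11 to 19 watts for the tiles alone.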
The size of the chip, and its ultimate performance, depend on how many tiles are included. The first product will contain 64 tiles and a 5MB distributed cache. Next year, the company says it will come out with a less expensive 36-tile version and then a 120-tile version close to, or in, 2009. Tiles on a single chip can be grouped into virtual processors assigned to different computing tasks.
Performance gains over conventional chips arise directly out of Tile64's design. A distributed network of slower processors can get jobs done quicker and with less overall energy than two or four larger, faster, more complex cores. Rather than powering a large bus, the chip can rely on shorter connections.
The Tile64 runs Linux and can be optimized for different applications.