Intel details future graphics chip at GDC
On Friday, Intel engineers are detailing the inner workings of the company's first graphics chip in over a decade at the Game Developers Conference in San Francisco--sending a signal to the game industry that the world's largest chipmaker intends to be a player.
During a conference call that served as a preview to the GDC sessions, Tom Forsyth, a software and hardware architect at Intel working on the Larrabee graphics chip project, discussed the design of Larrabee, a chip aimed squarely at Nvidia and at Advanced Micro Devices' ATI unit.
And Nvidia and AMD will no doubt be watching the progress intently. Intel's extensive and deep relationships with computer makers could give it an inside track with customers and upset the graphics duopoly now enjoyed by Nvidia and AMD. In the last decade Intel has not competed in the standalone, or "discrete" graphics chip market where Nvidia and AMD dominate. Rather, it has been a supplier of integrated graphics, a low-performance technology built into its chipsets that offers only a minimal gaming experience. (In the 1990s, Intel introduced the i740 GPU which, in relative terms, was not a success.)
Forsyth said that there is not yet a Larrabee chip to work with--it's expected late this year or early next year--and that "a lot of key developers are still being consulted on the design of Larrabee." But Intel will offer ways for developers to test the processor, he said. "On the Intel Web site there will be a C++ prototype library. It doesn't have the speed of Larrabee but has the same functionality. Developers can get a feel for the language, get a feel for the power of the machine."
Beyond games, Intel is also trying to catch a building wave of applications that run on the many-core architectures inherent to graphics chips. Nvidia and AMD graphics chips pack hundreds of processing cores that can be tapped not only for accelerating sophisticated games like Crysis but also for scientific research and high-performance computing tasks.
One of the largest test sites for Larrabee is Dreamworks, which will use Larrabee for rendering and animation. To date, Dreamworks has had to wait overnight to get a rendering project completed. "Using (the) Nehalem (processor), Dreamworks can almost do it in real time and it is only going to (get) better with Larrabee," said Nick Knupffer, an Intel spokesperson.
Larrabee is "Intel's first many-core architecture," Forsyth said. "The first product will be very much like a GPU. It will look like a GPU. You will plug it into a machine and it will display graphics," he said. (GPU stands for graphics processing unit.)
"But at its heart are processor cores, not GPU cores. So it's bringing that x86 programmable goodness to developers," Forsyth said. Larrabee will carry the DNA of Intel's x86 architecture, the most widely used PC chip design in the world.
"It's based on a lot of small, efficient in-order cores. And we put a whole bunch of them on one bit of silicon. We join them together with very high bandwidth communication so they can talk to each other very fast and they can talk to off-chip memory very fast and they can talk to other various units on the chip very fast." In-order processing cores are used, for example, in the original Pentium design and in Intel's Atom processor.
"It's the same programming model they know from multicore systems already but there's a lot more of them," he said.
The centerpiece of the chip's core is the vector unit, used to process many operations simultaneously. "The interesting part of the programming model is the SIMD (single instruction, multiple data) vector unit and the instructions that go with it," Forsyth said. "We want to show off this big new vector unit and the instruction set."
Forsyth described what the vector unit can do and how it works with the scalar unit. "(The vector unit) can do 16 floating point operations every single clock. That's a lot of horsepower. Even in just one of these cores--and we have a lot of these cores. So it's a very high-throughput unit. The good thing is that it's independent of the scalar unit. You can issue instructions on the scalar unit and vector unit at the same time. The scalar unit is extremely useful for calculating addresses, doing flow control, doing housekeeping--and keeps all those miscellaneous tasks off the real powerhouse, which is the vector unit."
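To make the division of labor Forsyth describes more concrete, here is a minimal C++ sketch that emulates a 16-lane vector operation in ordinary portable code. It is not Intel's prototype library and uses no Larrabee instructions; the names Vec16, scaled_add and saxpy_16wide are hypothetical, chosen only for illustration. The outer loop plays the role of the scalar unit (addresses, flow control), while each call to scaled_add stands in for one vector instruction applied to 16 floats.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 16-lane vector type, for illustration only.
// This emulates the idea in portable C++; it is not a Larrabee API.
constexpr std::size_t kLanes = 16;
using Vec16 = std::array<float, kLanes>;

// Stand-in for a single vector instruction: the same operation
// (a * x + y) is applied to all 16 lanes.
Vec16 scaled_add(float a, const Vec16& x, const Vec16& y) {
    Vec16 r{};
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        r[lane] = a * x[lane] + y[lane];
    }
    return r;
}

// Computes y = a * x + y in chunks of 16 elements.
void saxpy_16wide(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    std::size_t i = 0;

    // "Scalar unit" work: loop control and address calculation.
    for (; i + kLanes <= n; i += kLanes) {
        Vec16 vx{}, vy{};
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            vx[lane] = x[i + lane];
            vy[lane] = y[i + lane];
        }
        // "Vector unit" work: one operation across 16 values.
        const Vec16 vr = scaled_add(a, vx, vy);
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            y[i + lane] = vr[lane];
        }
    }

    // Leftover elements when n is not a multiple of 16.
    for (; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Loops over independent array elements, as in this sketch, are the easiest place to find work that fills all 16 lanes. Whether a real compiler or real hardware would execute the inner loops as single instructions is outside what this emulation shows.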
At GDC, Intel is encouraging developers to experiment. "They're going to have questions about how do I find 16 things to do at once. But a lot of it is just getting in there and playing with the thing," according to Forsyth. The GDC sessions will be a tour around Larrabee's instructions--"how to actually use these new instructions," he said.
And what about markets beyond gaming? "A funny thing happened on the way to the architecture. We designed this architecture to be 100 percent graphics focused. Whatever we needed to do to get graphics good, we did. And then a year ago, we looked at what we had and said how much of this stuff is actually specific to graphics. It turns out, very little. Graphics workloads are increasingly similar to GPGPU (general-purpose computing on graphics processing units), increasingly similar to high-powered (high-performance) computing. So, we actually have very little that is specific to graphics. Most of the instruction set is very general-purpose."
Brooke Crothers has been an editor at large at CNET News, an analyst at IDC Japan, and an editor at The Asian Wall Street Journal Weekly, among other endeavors, including co-manager of an after-school math-and-reading center. He writes for the CNET Blog Network and is not a current employee of CNET.


It's called Phenom II, and they plan to release 32nm in Q4.
We have to wait for Core i5 to see if Intel will regain the lead in budget and midrange.
Definitely the future is tilting toward graphics across the computer landscape. Graphics is the future of PC and perhaps server hardware.
Wonder what took them so long?
Since that time, non-gaming applications have increasingly made use of graphics acceleration, and Nvidia and ATI have been tapped to make graphics processors for all three console platforms (Sony, Microsoft, Nintendo). Back in the '90s, that wasn't the case. And of course, 3D acceleration is also becoming an issue in mobile phones. Add it all up, and it's just too hard for Intel to ignore anymore, even though they were thoroughly embarrassed last time.
It's hard to tell if Intel is more serious this time or not so far. During the 90's when they tried competing with Nvidia and ATI, they made a lot of noise too...but when the chips became available, they weren't even close to on par with Nvidia and ATI.
OpenCL ( http://www.khronos.org/opencl/ ) is designed to handle this, although the remit is somewhat broader than just GPGPUs.
Intel's determination to stick with in-order cores means they are going to have to run them extremely fast to keep up with a more conventional GPU, which means much higher power and cooling requirements compared to a current video card (can you imagine an over-sized aftermarket CPU heatsink/fan stuck on your current video card? 'Cause that's the kind of cooling that would be required).
- by 3rdalbum March 29, 2009 1:02 AM PDT
- Larrabee is a joke. We're looking at about 64 threads of execution, versus over 200 that you'd get on a real GPU from Nvidia or ATI.
- by BigGuns149 March 29, 2009 6:11 PM PDT
- Agreed. I think that at best this will compete with the current low-end discrete graphics (e.g., Radeon 4350, GeForce 9500). The proof will be in a real-world demo. This will probably be a big improvement over their current integrated graphics, at least, although I don't expect it to cut much into the sales of discrete graphics.
- by texaslabrat March 29, 2009 9:02 PM PDT
- Don't forget about the SIMD portion of each core, which can run 16 operations per clock. Look at the performance of modern applications when they are SIMD-enabled and when they are not. I'm not going to predict the performance of Larrabee...but I think the comments here have been over-simplifying the issues and overlooking where the REAL performance in the chip will likely come from. At the end of the day, it's all about FLOPS and throughput. If a programmer can figure out how to keep the SIMD units busy, there's 16 GFLOPS per core available @ 1GHz clock, not counting the scalar units. I don't know what they are planning on clocking these chips at...but I'd be surprised if it was as low as 1GHz given the tech that Intel has developed for Nehalem. Some quick back-of-the-envelope guesstimation shows that Larrabee *should* be comparable to current video cards in raw power...and probably a LOT easier to program for in the GPGPU arena due to the x86 ISA it's based on. So, I wouldn't be so quick to dismiss them...better to see what Dreamworks thinks of the chips after they've played with them a while....
- by odubtaig April 2, 2009 7:03 AM PDT
- Also worth remembering that from SSE3 upwards there have been instructions available which perform different operations on different components of a vector; they don't all have to have the same operation performed on them just because it's one instruction. I'll be quite interested in what these new instructions are for the 'vector' unit.
How fast will each thread be? Not very fast, really; they are general-purpose CPU cores packed together. In terms of being able to immediately program them in the same way you've been programming CPUs, it's great. In terms of graphics speed, it's going to be disappointingly unimpressive.
Remember that there's only 32 cores with two threads each, and the second thread of each core only runs 30% of the time (hyperthreading).
If this is going to replace the integrated GPUs that Intel currently sells, then this will probably be a step up and worthy of some attention. But I really don't see how we can expect good things from this GPU.
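The back-of-the-envelope numbers traded in the comments can be checked with a few lines of C++. This is a sketch of theoretical peak throughput only: it assumes every vector lane completes one floating-point operation per clock, and both the 1 GHz clock and the 32-core count are commenters' guesses, not figures confirmed by Intel. The function name peak_gflops is hypothetical.

```cpp
#include <cassert>

// Idealized peak throughput in GFLOPS:
//   cores * lanes per core * clock in GHz
// Assumes one floating-point operation per lane per clock, which is an
// upper bound and says nothing about sustained, real-world performance.
inline double peak_gflops(int cores, int lanes_per_core, double clock_ghz) {
    assert(cores > 0 && lanes_per_core > 0 && clock_ghz > 0.0);
    return cores * lanes_per_core * clock_ghz;
}

// peak_gflops(1, 16, 1.0)  -> 16.0   (one core, 16 lanes, 1 GHz)
// peak_gflops(32, 16, 1.0) -> 512.0  (32 cores is a commenter's assumption)
```

Real throughput would depend on how often all 16 lanes can actually be kept busy, which is the point of contention in the thread.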