Nvidia and Advanced Micro Devices' ATI division are taking different approaches to graphics processing in the next generations of their products. Both strategies have strengths and weaknesses, and I think it's too soon to pick the eventual winner in this long-running fight.
Before I get into my analysis, I should say that Nvidia paid me to write a white paper on the implications of its new GPU architecture (code-named Fermi) for high-performance computing applications. The white paper was released as part of the Fermi launch event at Nvidia's GPU Technology Conference last week.
Nvidia also paid for white papers from two other well-known microprocessor analysts, Nathan Brookwood of Insight64 and my friend and former colleague Tom Halfhill of Microprocessor Report. UC Berkeley professor David Patterson wrote a fourth white paper, and Nvidia wrote one of its own. Each of these works takes a different approach to the subject; all are worth reading if you need to understand what Fermi is all about.
In short, I think the Fermi architecture has been more thoroughly white-papered than any graphics chip design in history. All five of these documents are available on the Fermi home page on Nvidia's Web site, and just in case that page is moved or changed, you're welcome to take advantage of my own mirror of my white paper.
I've spent much of the last several days reading these documents plus David Kanter's excellent article on Fermi over on his Real World Technologies site. David managed to get some details on Fermi that Nvidia didn't give to the rest of us.
I've also had time to go through the coverage of ATI's recent launch of the RV870, which is what Nvidia's Fermi-based chips will be competing against. The first of Nvidia's chips bears the internal code name of GF100, and it's huge. Here's a life-size photo:
Apple's Snow Leopard operating system, which hits the streets on Friday, has plenty of new technology--but one of its major new features will soon be available on Microsoft Windows, Linux, and other major platforms.
OpenCL, the Open Computing Language, was originally proposed by Apple to support parallel programming on GPUs. There are other GPU programming languages, such as Nvidia's CUDA (Compute Unified Device Architecture) extensions for C and the Brook stream programming language, which was developed at Stanford University and is included in Advanced Micro Devices' Stream Computing software development kit. Rather than choosing one of these languages, Apple chose to create a new standard independent of the big graphics vendors.
In fact, OpenCL is even independent of Apple. One of the first things Apple did was offer to hand it over to the Khronos Group, the same independent standards organization that manages the OpenGL standard for 3D rendering.
Supporters of the OpenCL standards effort at the Khronos Group include the biggest CPU and GPU makers in the industry. Apple is also involved but not shown here.
The members of the OpenCL working group turned Apple's draft specification into the released version 1.0 spec in just six months (see Brooke Crothers' "OpenCL goes beyond Apple" from last December)--and in the process, they created what may be the best solution so far to the general problem of parallel programming.
See, OpenCL isn't just for GPUs. It was designed from the beginning to get the most out of multicore processors too. After all, if you have a multicore CPU--and you probably do--why let it go to waste? OpenCL is flexible enough to support both CPU-optimized and GPU-optimized code, and smart enough to choose the right code, depending on what hardware is available in the system to run it. Most of the competing parallel-programming languages can't do that.
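To make the idea of choosing between CPU-optimized and GPU-optimized code concrete, here is a minimal sketch in plain Python. It is not OpenCL and does not call any OpenCL API; the names `saxpy_cpu`, `saxpy_gpu`, `KERNEL_VARIANTS`, and `pick_variant` are all hypothetical, and the "devices" are just strings. It only models the decision a host program might make after asking the system which kinds of devices are present.

```python
# Illustrative model only: plain Python, no real OpenCL calls.
# All names here are hypothetical and exist only for this sketch.

def saxpy_cpu(a, x, y):
    """Stand-in for a CPU-tuned variant: walks the data in small blocks."""
    out = []
    block = 4  # assumed block size, chosen arbitrarily for the sketch
    for start in range(0, len(x), block):
        for i in range(start, min(start + block, len(x))):
            out.append(a * x[i] + y[i])
    return out

def saxpy_gpu(a, x, y):
    """Stand-in for a GPU-tuned variant: one independent work item per element."""
    return [a * xi + yi for xi, yi in zip(x, y)]

# Map each device kind to the variant written for it.
KERNEL_VARIANTS = {"gpu": saxpy_gpu, "cpu": saxpy_cpu}

# Assumed policy for the sketch: prefer a GPU when one is present.
PREFERENCE = ("gpu", "cpu")

def pick_variant(available_devices):
    """Return (device_kind, function) for the first preferred device found."""
    for kind in PREFERENCE:
        if kind in available_devices:
            return kind, KERNEL_VARIANTS[kind]
    raise RuntimeError("no supported device found")

if __name__ == "__main__":
    kind, kernel = pick_variant({"cpu"})
    print(kind, kernel(2.0, [1.0, 2.0], [3.0, 4.0]))
```

Both variants compute the same result; only the way the work is organized differs, which is the point of keeping separately tuned code paths behind one selection step.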
OpenCL can take advantage of both task-level parallelism (running many tasks at once, whether different tasks or copies of the same task) and data-level parallelism (where a single instruction within a task is applied to multiple data items at once--also known as SIMD). Some parallel-programming languages can't do that, either.
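The difference between the two kinds of parallelism can be sketched in a few lines of Python using the standard library's thread pool. This is an analogy, not OpenCL: the function names are hypothetical, Python threads do not give true SIMD execution, and the interpreter may not actually run the work simultaneously. The sketch only shows how the work is divided in each case.

```python
from concurrent.futures import ThreadPoolExecutor

def scale(values, factor):
    """Apply the same operation to every item in a list."""
    return [v * factor for v in values]

def total(values):
    """A different, unrelated task on the same data."""
    return sum(values)

def run_task_parallel(data):
    """Task-level parallelism: two different tasks are submitted at once."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        scaled = pool.submit(scale, data, 2.0)
        summed = pool.submit(total, data)
        return scaled.result(), summed.result()

def run_data_parallel(data, chunks=4):
    """Data-level parallelism: one operation, applied to many chunks of data."""
    size = max(1, (len(data) + chunks - 1) // chunks)
    parts = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=chunks) as pool:
        # map() returns results in the same order as the input chunks.
        results = list(pool.map(lambda part: scale(part, 2.0), parts))
    return [x for part in results for x in part]

if __name__ == "__main__":
    print(run_task_parallel([1.0, 2.0, 3.0]))
    print(run_data_parallel(list(range(10))))
```

In the task-parallel function the two workers do different jobs; in the data-parallel function every worker does the same job on a different slice of the input.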
But OpenCL's biggest advantage isn't technical in nature: it's that no other parallel-programming language will be so widely supported. The support starts with Snow Leopard but will go well beyond that. AMD and Nvidia will have OpenCL drivers for their GPUs under Windows and Linux. AMD and Intel will support OpenCL on their CPUs (including Intel's Larrabee). And AMD has already shipped its first OpenCL implementation for its Athlon and Opteron processors.
Implementations for video game consoles and DSPs (digital signal processors) are also under development. I've even heard that future releases of OpenCL may be able to work with less common hardware, such as FPGAs (field-programmable gate arrays).
We had an excellent half-day OpenCL tutorial last weekend at Hot Chips 21. There were also some great OpenCL presentations at Siggraph 2009 earlier this month; if you'd like more detailed information, that's a good place to start.
All this support for OpenCL means that it should become the first choice of academic and commercial developers who want a good cross-platform way to develop parallel code. Expect to see OpenCL used in software for audio and video processing, cryptography, medical imaging, and many other applications--including, of course, gaming.
(Disclosure: I will be writing a technical white paper for Nvidia, one of the companies covered in this story.)
Welcome back to the ongoing Speeds and Feeds coverage of Hot Chips 19 at Stanford. They give us comfy chairs and free Wi-Fi, so blogging about it is the least I can do. By the way, Dean Takahashi of the San Jose Mercury News is also blogging from Hot Chips, so you can get another perspective on the event here.
Session 2 is the first of two sessions of "Multi-Core and Parallelism" presentations. This one happens to be all about Nvidia. Session 3, up next, will include presentations about AMD's ATI Radeon HD 2900, Intel's 80-core "Tera-Scale" processor, the TRIPS project at the University of Texas at Austin, and the Tile Processor from Tilera.
The first presentation in this session, "The Nvidia GeForce 8800 GPU," is an overview of that chip. As I mentioned in my Siggraph coverage, the 8800 includes 128 ...