• On MovieTome: The 10 worst movies of 2009 so far!

Speeds and Feeds

Read all 'CUDA' posts in Speeds and Feeds
October 7, 2009 8:07 AM PDT

ATI and Nvidia face off--obliquely

by Peter Glaskowsky
  • 7 comments

Nvidia and Advanced Micro Devices' ATI division are taking different approaches to graphics processing in the next generations of their products. Both strategies have strengths and weaknesses, and I think it's too soon to pick the eventual winner in this long-running fight.

Before I get into my analysis, I should say that Nvidia paid me to write a white paper on the implications of its new GPU architecture (code-named Fermi) for high-performance computing applications. The white paper was released as part of the Fermi launch event at Nvidia's GPU Technology Conference last week.

Nvidia also paid for white papers from two other well-known microprocessor analysts, Nathan Brookwood of Insight64 and my friend and former colleague Tom Halfhill of Microprocessor Report. UC Berkeley professor David Patterson wrote a fourth white paper, and Nvidia wrote one of its own. All of these works take a different approach to the subject; all are worth reading if you need to understand what Fermi is all about.

In short, I think the Fermi architecture has been more thoroughly white-papered than any graphics chip design in history. All five of these documents are available on the Fermi home page on Nvidia's Web site, and just in case that page is moved or changed, you're welcome to take advantage of my own mirror of my white paper.

I've spent much of the last several days reading these documents plus David Kanter's excellent article on Fermi over on his Real World Technologies site. David managed to get some details on Fermi that Nvidia didn't give to the rest of us.

I've also had time to go through the coverage of ATI's recent launch of the RV870, which is what Nvidia's Fermi-based chips will be competing against. The first of Nvidia's chips bears the internal code name of GF100, and it's huge. Here's a life-size photo:

... Read more
August 28, 2009 9:50 AM PDT

OpenCL: Parallel programmers' new best friend

by Peter Glaskowsky
  • 11 comments

Apple's Snow Leopard operating system, which hits the streets on Friday, has plenty of new technology--but one of its major new features will soon be available on Microsoft Windows, Linux, and other major platforms.

OpenCL, the Open Computing Language, was originally proposed by Apple to support parallel programming on GPUs. There are other GPU programming languages, such as Nvidia's CUDA (Compute Unified Device Architecture) extensions for C and the Brook stream program language developed at Stanford University and included in Advanced Micro Devices' Stream Computing software development kit, but rather than choosing one of these languages, Apple chose to create a new standard independent of the big graphics vendors.

In fact, OpenCL is even independent of Apple. One of the first things Apple did was offer to hand it over to the Khronos Group, the same independent standards organization that manages the OpenGL standard for 3D rendering.

OpenCL working group member logos

Supporters of the OpenCL standards effort at the Khronos Group include the biggest CPU and GPU makers in the industry. Apple is also involved but not shown here.

The members of the OpenCL working group turned Apple's draft specification into the released version 1.0 spec in just six months (see Brooke Crothers' "OpenCL goes beyond Apple" from last December)--and in the process, it created what may be the best solution so far to the general problem of parallel programming.

See, OpenCL isn't just for GPUs. It was designed from the beginning to get the most out of multicore processors too. After all, if you have a multicore CPU--and you probably do--why let it go to waste? OpenCL is flexible enough to support both CPU-optimized and GPU-optimized code, and smart enough to choose the right code, depending on what hardware is available in the system to run it. Most of the competing parallel-programming languages can't do that.

OpenCL can take advantage of both task-level parallelism (running many tasks at once, whether different tasks or copies of the same task) and data-level parallelism (where a single instruction within a task is applied to multiple data items at once--also known as SIMD). Some parallel-programming languages can't do that, either.

But OpenCL's biggest advantage isn't technical in nature: it's that no other parallel-programming language will be so widely supported. The support starts with Snow Leopard but will go well beyond that. AMD and Nvidia will have OpenCL drivers for their GPUs under Windows and Linux. AMD and Intel will support OpenCL on their CPUs (including Intel's Larrabee). And AMD has already shipped its first OpenCL implementation for its Athlon and Opteron processors.

Implementations for video game consoles and DSPs (digital signal processors) are also under development. I've even heard that future releases of OpenCL may be able to work with less common hardware, such as FPGAs (field-programmable gate arrays).

We had an excellent half-day OpenCL tutorial last weekend at Hot Chips 21. There were also some great OpenCL presentations at Siggraph 2009 earlier this month; if you'd like more detailed information, that's a good place to start.

All this support for OpenCL means that it should become the first choice of academic and commercial developers who want a good cross-platform way to develop parallel code. Expect to see OpenCL used in software for audio and video processing, cryptography, medical imaging, and many other applications--including, of course, gaming.

(Disclosure: I will be writing a technical white paper for Nvidia, one of the companies covered in this story.)

August 20, 2007 2:45 PM PDT

Live from Hot Chips 19: Session 2, Nvidia

by Peter Glaskowsky
  • Post a comment

Welcome back to the ongoing Speeds and Feeds coverage of Hot Chips 19 at Stanford. They give us comfy chairs and free Wi-Fi, so blogging about it is the least I can do. By the way, Dean Takahashi of the San Jose Mercury News is also blogging from Hot Chips, so you can get another perspective on the event here.

Session 2 is the first of two sessions of "Multi-Core and Parallelism" presentations. This one happens to be all about Nvidia. Session 3, up next, will include presentations about AMD's ATI Radeon HD 2900, Intel's 80-core "Tera-Scale" processor, the TRIPS project at the University of Texas at Austin, and the Tile Processor from Tilera.

The first presentation in this session, "The Nvidia GeForce 8800 GPU," is an overview of that chip. As I mentioned in my Siggraph coverage, the 8800 includes 128 ... Read more

  • prev
  • 1
  • next
advertisement

With eye to the future, try raw photos today

Raw photos are a hassle compared to JPEG. But if you like photography, the list of their image quality advantages is long and getting longer.

Inside the Apple, er, Microsoft Store

Although Redmond's foray into retail bears a big resemblance to Apple's approach, Microsoft has added some distinctive features to draw casual PC buyers and techies alike.

advertisement

About Speeds and Feeds

Silicon Valley-based computer architect and chip analyst Peter N. Glaskowsky attends a variety of industry conferences throughout the year to meet with industry thought leaders and dig into the future of computing technology. In Speeds and Feeds, he analyzes trends in system architecture and interface design, as well as market and political pressures surrounding those trends. He is a member of the CNET Blog Network and is not an employee of CNET. Disclosure.

Add this feed to your online news reader

Speeds and Feeds topics

Most Discussed

advertisement

Inside CNET News

Scroll Left Scroll Right