October 10, 2006 3:31 PM PDT
Quad-core Opteron faster at virtualization, AMD says
Virtualization, a hot subject today as companies try to make servers more efficient, lets a single server run multiple operating systems. But the virtualization software, called a hypervisor, which oversees the operating systems' access to the hardware, imposes a performance penalty compared with an operating system running directly on the hardware.
Barcelona has specific features to deal with some of those performance issues, Ben Sander, a principal member of AMD's technical design staff, said in a speech Tuesday at the Fall Processor Forum here.
AMD and Intel are vying for share in the x86 server market. Intel's Xeon chips were the first to provide some hardware support for virtualization, but AMD's newest processors now also support it.
But virtualization is just one arena; the companies also are racing to add more processor cores. Intel's "Clovertown" version of Xeon packs two dual-core Xeon 5100 "Woodcrest" processors into a single package to reach the quad-core goal this November. AMD's Barcelona puts all four cores on a single slice of silicon.
One performance problem arises because operating systems are accustomed to managing memory translation themselves, with help from a part of the chip called the translation lookaside buffer, or TLB, which caches the conversion of an operating system's virtual memory addresses into the actual addresses used by the hardware. But with a hypervisor actually in charge of memory, virtualization adds a second level of translation to the task.
To deal with the situation, hypervisors use a software technique called shadow paging. "It's complex to implement and can be fairly slow," Sander said. Barcelona technologies, including "nested page tables" and the caching of memory addresses, speed up memory handling.
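The two levels of translation can be illustrated with a toy model. The sketch below is not AMD's design or any hypervisor's actual code: it assumes 4KB pages and flat, single-level page tables stored as dictionaries, and every name in it (`translate`, `nested_translate`, `build_shadow_table`) is hypothetical. It shows only the general idea: a nested lookup walks the guest's table and then the hypervisor's, while shadow paging has the hypervisor precompute a combined table in software.

```python
PAGE_SIZE = 4096  # assumption: 4KB pages


def translate(page_table, addr):
    """Map an address through one flat page table (page number -> frame number)."""
    page, offset = divmod(addr, PAGE_SIZE)
    return page_table[page] * PAGE_SIZE + offset


def nested_translate(guest_table, host_table, guest_virtual):
    """Two-step lookup: guest virtual -> guest physical -> host physical."""
    guest_physical = translate(guest_table, guest_virtual)
    return translate(host_table, guest_physical)


def build_shadow_table(guest_table, host_table):
    """Shadow paging: the hypervisor composes both mappings into one table.

    In this toy model the table must be rebuilt whenever the guest changes
    its own page table, which hints at why the technique can be slow.
    """
    return {vpage: host_table[gframe] for vpage, gframe in guest_table.items()}


# Hypothetical mappings, chosen only for illustration.
guest_table = {0: 5, 1: 7}    # guest virtual page -> guest physical frame
host_table = {5: 42, 7: 13}   # guest physical frame -> host physical frame

addr = 1 * PAGE_SIZE + 0x10
shadow_table = build_shadow_table(guest_table, host_table)

# Both approaches arrive at the same hardware address.
assert nested_translate(guest_table, host_table, addr) == translate(shadow_table, addr)
```

Real page tables are multi-level structures walked by the processor, so this flat version understates the work involved in either approach.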
That's significant, given that such memory issues can occupy as much as 75 percent of the hypervisor's time, he said.
In addition, Barcelona has new instructions that shorten the chip's "world switch time," when it switches from guest operating system mode to hypervisor mode and back. Such a switch typically takes about 1,000 to 2,000 processor cycles, but the new instructions shorten that by about 25 percent, Sander said.
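As a rough back-of-the-envelope check on those figures, the snippet below applies a 25 percent reduction to the quoted range of 1,000 to 2,000 cycles. It treats "about 25 percent" as exactly 25 percent, which is an assumption, and the function name is hypothetical.

```python
def cycles_after_reduction(baseline_cycles, reduction=0.25):
    """Cycles remaining after cutting the baseline by the given fraction."""
    return baseline_cycles * (1 - reduction)


# Quoted range for a world switch today, in processor cycles.
for baseline in (1000, 2000):
    print(baseline, "->", cycles_after_reduction(baseline))
```

Under that assumption, a world switch would take roughly 750 to 1,500 cycles.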
Sander described other features of Barcelona, as well. For example, each processing core will have a 64KB level-one cache and 512KB level-two cache. All four cores share a 2MB level-three cache, but that can be made larger, he said.
Barcelona can handle a larger amount of physical memory than current Opterons. Today's maximum is 1 terabyte--which the lower-end AMD-based servers don't reach--but Barcelona will stretch that to 256 terabytes, he said.
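Those limits can be restated as address widths. The sketch below assumes binary terabytes (2 to the 40th power bytes) and byte-addressable memory; the article itself does not give bit counts, and the helper name is hypothetical.

```python
TERABYTE = 2 ** 40  # assumption: binary terabyte


def address_bits(memory_bytes):
    """Smallest number of address bits that can cover memory_bytes bytes."""
    return (memory_bytes - 1).bit_length()


current_limit = 1 * TERABYTE
barcelona_limit = 256 * TERABYTE

print(address_bits(current_limit))    # bits needed for 1 terabyte
print(address_bits(barcelona_limit))  # bits needed for 256 terabytes
```

Under those assumptions, 1 terabyte corresponds to 40 physical address bits and 256 terabytes to 48, so the increase amounts to eight more address bits.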
Intel has switched its current generation of dual-processor servers to a new memory technology called FB-DIMM (fully buffered dual inline memory modules), but AMD is passing on the technology for the time being, Sander said. Barcelona has FB-DIMM abilities built in, but they won't be used because FB-DIMM draws more power and has longer communication delays than standard DDR2 (double data rate 2) memory.
AMD will make the transition to FB-DIMM "at the appropriate time," but apparently that will be no earlier than a second generation called FB-DIMM 2, Sander said. "With FB-DIMM generation one, we decided it is not an appropriate time to transition. It is still supported by the memory controller," he added.
In addition, he said Barcelona has dual memory controllers to read data from and write data to memory. That's the same number as current Opterons, but with Barcelona, the memory controllers will be able to operate independently, he said.