Microsoft testing Excel for supercomputers
At a key supercomputing conference on Monday, Microsoft released a test version of its Excel spreadsheet redesigned to run on powerful clusters of servers.
By engineering Excel to run better on such clusters, Microsoft said, customers are seeing spreadsheets that would normally take weeks to calculate now run in a few hours.
The software maker also released a beta version of Windows HPC Server 2008 R2--the latest version of Windows Server designed to run in high-performance compute clusters. The announcements were made at the SC09 conference in Portland, Ore.
Microsoft has taken the standard version of Excel 2010 and combined it with new Windows HPC Server 2008 R2 technology, allowing Excel to run on the cluster. The final versions of Excel for compute clusters and Windows HPC Server 2008 R2 are expected to be ready in summer 2010. The capability has been in development for about 18 months.
The announcements are the latest in Microsoft's push over the last few years to better compete against Linux in the market for compute clusters--high-performance systems built by linking together large numbers of standard servers. Last year, for example, Microsoft managed to crack the upper echelons of the supercomputing ranks, landing in the top 25 rankings for the first time.
Microsoft also said the next version of its developer tools--Visual Studio 2010--will help ease the task of writing software that can run efficiently on such systems.
"Until now, the power of high-performance and parallel computing has largely been available to a limited subset of customers due to the complexity of environments and applications, as well as the challenges of parallel programming," Microsoft senior director Vince Mendillo said in a statement.
As for the new version of HPC Server, Microsoft said it offers the ability out-of-the-box to support clusters of up to 1,000 nodes as well as diskless boot and improved management and diagnostics abilities.
During her years at CNET News, Ina Fried has changed beats several times, changed genders once, and covered both of the Pirates of Silicon Valley. These days, most of her attention is focused on Microsoft.





I swear I thought this was an article from "The Onion". Who in their right mind would use Excel on a clustered server environment as a data and analytics package? Oh, that's right: those who drank the Kool-Aid long ago.
"compete against Linux in the market for compute clusters--high-performance systems built by linking together large numbers of standard servers"
Instead of using supercomputers to calculate your Excel spreadsheets, how about using a free database and a simple set of free reporting classes (python, php, perl, ruby)? I bet you would use less hardware, bandwidth, and electricity. Oh yeah, and not have to pay the $100,000 in licensing fees.
Excel is the workbench. People have hundreds of thousands of lines of macros written in Excel to do everything from formatting to custom actions when hitting OLAP stores, etc. These people aren't going to change because you have this python utility that you cobbled together that doesn't work with any other asset they have.
And regarding power and cost. The computation is being run on the cluster. The cost of the computation on the cluster is likely not much different (and the Excel calc engine is pretty optimized, so if they're running that on the server, that likely beats the code that you're running). The cost of transferring the data back and forth is probably the smallest portion (since usually this type of computation only makes sense when the communication:computation ratio is low).
I'm not too knowledgeable about resource consumption when it comes to languages, but I was under the impression that interpreted languages like all four of the ones you listed would perform much slower than compiled languages like C. The fact that they are interpreted might also deter parallelism. In Excel's defense, each entry in the spreadsheet could possibly be computed independently, which lends itself easily to being computed over a distributed system.
Bandwidth is not important if it isn't being used optimally. If you are only extracting useful information from a small portion of the bandwidth, then it is just being wasted. There is a reason that a bigger bandwidth hasn't translated to better performance in all cases.
Convicted felons would probably get to see their families more often than the sysadmins for that. :)
- by fdunn3 November 17, 2009 4:28 PM PST
- Very Cool Microsoft!