Hadoop breaks data-sorting world records
Yahoo's grid-computing team announced that Apache Hadoop set new world records in the Gray and Minute sorts of the annual GraySort contest, both in the general-purpose (Daytona) category.
Hadoop is the only open-source software ever to win the GraySort competition. The result builds on last year's win in the Terasort competition, where Hadoop sorted 1 terabyte of data in 209 seconds, beating the previous terabyte sort record of 297 seconds.
The team summarized the results this way: "Within the rules for the 2009 Gray sort, our 500 GB sort set a new record for the minute sort and the 100 TB sort set a new record of 0.578 TB/minute. The 1 PB sort ran after the 2009 deadline, but improves the speed to 1.03 TB/minute. The 62 second terabyte sort would have set a new record, but the terabyte benchmark that we won last year has been retired."
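To put those rates in perspective, here is a rough back-of-the-envelope sketch in Python that converts the quoted figures into elapsed times and comparable rates. The helper names are made up for illustration, and it assumes decimal units (1 PB = 1000 TB), which the article does not state explicitly.

```python
# Back-of-the-envelope arithmetic on the figures quoted above.
# Assumption (not stated in the article): decimal units, so 1 PB = 1000 TB.

def minutes_to_sort(size_tb, rate_tb_per_min):
    """Elapsed minutes implied by a data size and a sort rate."""
    return size_tb / rate_tb_per_min

def rate_tb_per_min(size_tb, seconds):
    """Sort rate in TB/minute implied by a data size and an elapsed time."""
    return size_tb / (seconds / 60.0)

# 100 TB Gray sort at the quoted 0.578 TB/minute
gray_100tb_minutes = minutes_to_sort(100, 0.578)

# 1 PB sort at the quoted 1.03 TB/minute (run after the 2009 deadline)
gray_1pb_minutes = minutes_to_sort(1000, 1.03)

# The 62-second terabyte sort, expressed in TB/minute
terabyte_62s_rate = rate_tb_per_min(1, 62)

# Last year's Terasort win: 209 seconds against the previous 297 seconds
terasort_speedup = 297 / 209

print(f"100 TB sort: about {gray_100tb_minutes:.0f} minutes")
print(f"1 PB sort:   about {gray_1pb_minutes / 60:.1f} hours")
print(f"62 s terabyte sort: {terabyte_62s_rate:.2f} TB/minute")
print(f"209 s vs. 297 s: {terasort_speedup:.2f}x faster")
```

Because the published rates are rounded, the results are approximate: the 100 TB sort works out to roughly 173 minutes and the 1 PB sort to a little over 16 hours.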
If you want to learn more about Hadoop, the Cloudera blog has a great post titled "5 Common Questions About Hadoop" that explains things pretty well.
Dave Rosenberg dishes up "Software, Interrupted" with nearly 15 years of technology and marketing experience that spans from Bell Labs to multiple start-up IPOs to open-source enterprise software companies. He is co-founder of MuleSource and currently serves as the general manager of Hardy Way. He is a member of the CNET Blog Network and is not an employee of CNET. You can contact Dave via e-mail at softwareinterrupted@gmail.com or follow him on Twitter @daveofdoom.




