June 10, 2009 4:54 PM PDT

Yahoo to distribute its version of Hadoop

by Tom Krazit

Yahoo announced plans Wednesday to release an open-source version of its take on Hadoop, a grid-computing framework used to run many parts of its business.

Yahoo is a major force in the development of Hadoop, which is principally overseen by the Apache Software Foundation. Hadoop is essentially an open-source version of the software Google uses to run its Web indexing servers, and Yahoo uses it for much the same purpose internally.

Hadoop runs on tens of thousands of servers inside Yahoo, said Nigel Daley, quality and release engineering manager for Yahoo Grid Technologies, in a blog post Wednesday. That's a much larger implementation than other companies and organizations might wish to deploy, but at the same time they would like to benefit from the reliability tweaks that Yahoo has made to Hadoop in order to support its enormous Web properties.

"This distribution is largely a response to the numerous requests that we have received to share Yahoo!'s internally tested and scale-proven releases," Daley wrote. The code is available for immediate download on Yahoo's site.

Tom Krazit writes about the ever-expanding world of Internet search, including Google, Yahoo, online advertising, and portals, as well as the evolution of mobile computing. He has written about traditional PC companies, chip manufacturers, and mobile computers, spending the last three years covering Apple.
Comments (4)
by June 10, 2009 10:24 PM PDT
Hi Tom, Yahoo should not distribute its own version of Hadoop. It will create a fork of the current Apache Hadoop distribution, and it undermines the work of Apache. A better way is for them to contribute whatever enhancements they have made back to the Apache Hadoop distro.
by ddesy June 11, 2009 5:50 AM PDT
I think you're right about this. Many OSS projects end up hurt by forking, and Hadoop might be joining those ranks now. Contributing changes to the main project might take a little longer, but in the end it would probably be more beneficial to everyone involved.
by Seaspray0 June 11, 2009 7:25 AM PDT
So let me get this right... Yahoo is giving out the software that allows people to run their own mini search engine? This could be useful for, say, an intranet site where the amount of data to be indexed is small. To index the whole Internet, like Yahoo has done, would take one huge database beyond the capability of anyone who doesn't have computing resources like Yahoo's.

Could this also be a sign that Yahoo doesn't want to make search the backbone of its services?
by lonestarState June 11, 2009 7:41 AM PDT
Yahoo has released Hadoop enhancements to the Apache foundation. Life is a fork, so why not release their own version? Case in point: Linux distros.