reduce

Big data in context

A few weeks back I attended venture firm Accel Partners' New Data Workshop event and learned quite a bit about the state of what we are now commonly referring to as "big data" and the challenges that await the vendors trying to target this new way of slicing and dicing vast amounts of information.

One of the big takeaways for me was the realization that even with all of the processing power available nowadays, the amount of data is growing at such a rapid pace that people are simply looking to cope with the problem, rather than facing it head on.

The issue of processing large amounts of data is not necessarily new--most developers and IT staff can tell you about having too much information to deal with--but the big difference is that there are new approaches, tools and technologies that can help alleviate the difficulty of processing it.

Over the last 30 years or so, the way that machines process transactions has changed, and so has the amount of data being collected and processed, now with an eye toward real-time analysis of information.

This has led to the advent of a number of technologies that allow for data processing to be offloaded and managed in both structured and unstructured ways--examples include open-source projects like Memcached and Hadoop as well as NoSQL data storage mechanisms like Cassandra.… Read more

Could open source abandon the Google train?

As arguably the world's largest open-source company, Google has a big stake in maintaining its place at the heart of the open-source ecosystem. Recent events, however, suggest that Google can't rest on its laurels if it wants to secure the hearts and minds of open-source developers.

Make no mistake: Google needs those developers. Android, Chrome (and Chrome OS), and other Google initiatives depend upon fostering vibrant open-source communities that can help it to surpass Microsoft and Apple.

Such communities may be ready to cut the Google umbilical cord, however, which should be worrying to Google.

There have been … Read more

Keep your burgers lean

There are just some days when nothing but a juicy burger will cut it. We all know exactly how healthy a big burger is, though. But there are ways to feel slightly better about your hamburgers. The Burger Buddy Fat Reducer offers a fast way to eliminate fat. While it may look like a standard burger mold, the Burger Buddy Fat Reducer is actually used in a different way. After you finish grilling your burger patty, you place it in the press, flush out the fat with hot water and press the patty to maintain the great grilled flavor. Grease … Read more

Car cost cutter

For many, their car is second only to house payments as their biggest expense. Add in "extras" like fuel, maintenance, and insurance, and the car sometimes edges ahead. Most car owners know they'd probably save money if they paid closer attention to where it was all going. The aptly named Reduce Car Costs helps you organize and track your vehicular expenditures so you can throttle back on the cash flow. It'll show you just what you're spending your money on, and when, where, and how, too, so that you can see where you can cut … Read more

MySpace to open source data processing

MySpace today announced a new open-source project called Qizmt, a distributed computation framework developed by its data mining team.

Qizmt is based on the MapReduce distributed processing framework, well-known as a core part of Google's search indexing infrastructure. Qizmt, however, runs on large clusters of Microsoft Windows servers, an interesting sidebar to a computing style we most commonly associate with commodity Linux machines.

MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, … Read more
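As a rough illustration of the programming model described above, here is a minimal single-process word-count sketch in Python. It is not Qizmt or Hadoop code; the names map_words, reduce_counts and run_mapreduce are hypothetical, and the reduce step follows the usual MapReduce convention of merging all intermediate values that share a key. A real framework would spread the map and reduce calls across many machines.

```python
from collections import defaultdict

def map_words(doc_id, text):
    """Map: turn one (doc_id, text) pair into intermediate (word, 1) pairs."""
    for word in text.lower().split():
        yield word, 1

def reduce_counts(word, counts):
    """Reduce: merge every intermediate value that shares the same key."""
    return word, sum(counts)

def run_mapreduce(records, mapper, reducer):
    """Hypothetical in-memory driver: map, group by key, then reduce."""
    groups = defaultdict(list)
    for key, value in records:
        for ikey, ivalue in mapper(key, value):
            groups[ikey].append(ivalue)  # the "shuffle" step, done locally
    return dict(reducer(k, v) for k, v in sorted(groups.items()))

# Illustrative input only, not data from the article.
docs = [("a", "big data big clusters"), ("b", "data everywhere")]
result = run_mapreduce(docs, map_words, reduce_counts)
print(result)
```

Because each map call sees only its own record and each reduce call sees only one key's values, the work can be split across machines without the functions knowing about each other, which is what makes the model attractive for large clusters.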

Hadoop buzz continues to excite the cloud

Hadoop is the popular open-source implementation of MapReduce, a powerful tool designed for deep analysis and transformation of very large data sets. It enables you to explore complex data, using custom analyses tailored to your information and questions. It's also one of the most buzz-worthy, talked about open-source projects around.

I spoke with Christophe Bisciglia, Hadoop World organizer and founder of Cloudera, to ask some questions about this inaugural event. And by the way, if you're interested in attending, click on the link in the answer to question No. 5. (My readers get a 25 percent discount if they register before September 15.)

Q: How can you explain the buzz around Hadoop? It's deafening. … Read more

No shrinking violet

PicShrink is a handy little image compression utility that does more than just resize digital pictures. It also has basic image-editing features, such as cropping, borders, and the ability to adjust contrast, brightness, and intensity. You can resample images and convert file formats, too. It even makes it easy to add digital watermarks to protect your images from unauthorized use.

Opening the fully functional trial version is a little balky because it cleverly appears as if your only options are to order or register the full version, but clicking anywhere on the startup screen opens the interface. The interface itself … Read more

More universities join Yahoo for Net-scale research

Yahoo has signed up three new universities to participate in Internet-scale computing research, the Internet pioneer said Thursday.

The University of California-Berkeley, Cornell University, and the University of Massachusetts-Amherst have joined an effort that already included Carnegie Mellon University, Yahoo said Thursday. The universities get access to a cluster of Yahoo computers called M45 that runs open-source software called Hadoop that can be used to process data rapidly.

Yahoo is a major contributor to Hadoop, a project within the Apache Software Foundation's collection, but Google created the underlying technology through its MapReduce algorithm. MapReduce and Hadoop can be used … Read more

Amazon launches Hadoop data-crunching service

This was originally posted at ZDNet's Between the Lines.

A correction has been made to this story. See details below.

Amazon on Thursday announced a new cloud computing service that uses Hadoop, a free software framework, to crunch tons of data.

The service, called Amazon Elastic MapReduce, is designed for businesses, researchers and analysts trying to conduct data-intensive number crunching (statement). Hadoop, which is used by companies like Yahoo, is being pushed into the enterprise data center by start-ups like Cloudera.

Correction, 7:15 a.m. PDT: This story initially miscast Google's connection to Hadoop. … Read more

Understanding MapReduce and Hadoop (Video)

For those of you interested in just how cloud computing (and I do mean computing) works, check out this video from a recent AWSome Atlanta Cloud Computing User's Group. Twitpay's Don Brown explains how MapReduce and the open-source Hadoop are used to process enormous amounts of data at Google and other large websites.

For more on MapReduce, check out these articles by Eugene Ciurana. For more on Hadoop (including support) check out Cloudera.

Via John M. Willis

You can follow me on Twitter @daveofdoom