Software, Interrupted

April 29, 2009 4:10 AM PDT

Is RAID storage living on borrowed time?

by Dave Rosenberg

The basic idea of RAID (Redundant Array of Inexpensive Disks) storage is to combine multiple small, cheap disk drives into an array that appears to the computer as a single logical storage unit and yields performance exceeding that of a SLED (Single Large Expensive Disk).

RAID offers many advantages over the use of single hard disks, including higher data security, fault tolerance, improved availability, and integrated capacity.

That said, RAID was invented more than 30 years ago and simply wasn't designed for the multi-terabyte systems that are commonplace today. At petabyte scale, RAID is clearly beyond its design limits.

I discussed the limits of RAID via e-mail with Cleversafe CEO Chris Gladwin, and here's the problem as he sees it: RAID is mathematically reaching a breaking point for data reliability with one-terabyte drives. RAID 6, which is based on parity, cannot recover from more than two simultaneous failures, or from two non-simultaneous failures plus an unrecoverable bit error. It also doesn't automatically protect data, which remains exposed to software, hardware, and user error.

Typical SATA drives have a published bit error rate (BER) of 1 in 10^14, meaning that about once every 100,000,000,000,000 bits read, there will be a bit that is unrecoverable. Although this failure rate seems insignificant, 10^14 bits is only 12.5 terabytes, so when reading 100 terabytes (note: 100 terabytes is 8 x 10^14 bits), it is nearly certain there will be an unreadable bit, and if this read happens to be during a rebuild, data will be lost.
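
The arithmetic behind this claim can be checked with a short Python sketch. It assumes a rate of one unrecoverable bit per 10^14 bits read, decimal terabytes (8 x 10^12 bits each), and independent bit errors, which is a simplification because real drive errors tend to cluster. The function names are made up for illustration.

```python
import math

BITS_PER_TERABYTE = 8 * 10**12  # decimal terabyte: 10^12 bytes of 8 bits


def expected_errors(terabytes_read, bit_error_rate=1e-14):
    """Average number of unrecoverable bits when reading this much data."""
    return terabytes_read * BITS_PER_TERABYTE * bit_error_rate


def p_unreadable(terabytes_read, bit_error_rate=1e-14):
    """Probability of at least one unrecoverable bit, assuming independence.

    Computes 1 - (1 - rate)^bits with log1p/expm1, because the naive
    form loses precision when the rate is this small.
    """
    bits = terabytes_read * BITS_PER_TERABYTE
    return -math.expm1(bits * math.log1p(-bit_error_rate))


if __name__ == "__main__":
    for tb in (1, 12.5, 100):
        print(f"{tb:>6} TB read: expect {expected_errors(tb):.2f} bad bits, "
              f"P(at least one) = {p_unreadable(tb):.4f}")
```

Under these assumptions the chance of hitting an unreadable bit is roughly 8 percent when reading 1 terabyte, about 63 percent at 12.5 terabytes (10^14 bits), and above 99.9 percent at 100 terabytes.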

There are still applications that can utilize RAID for increased I/O performance. For example, using RAID for a high I/O transactional system would be a good fit. Also, smaller storage applications, for example a terabyte or below, could still use RAID effectively.

Data continues to grow exponentially. Market researcher IDC estimated that the digital universe reached 281 exabytes in 2007 and will grow 10X by 2011. Enterprises in a number of industries, including media/entertainment, health care, and video surveillance, already have more than 100 terabytes of storage in use. Determining the appropriate long-term storage strategies for these industries will be a challenge as they run into the limitations of RAID.

The good news in addressing these data growth issues is the availability of low-cost processors and high-capacity drives. Combined, they provide great opportunities for disruptive innovations that will displace RAID.

January 22, 2009 5:44 PM PST

Chicago museum turns to open-source storage

by Dave Rosenberg

Chicago's Museum of Broadcast Communications (MBC) collects, preserves, and presents historic and contemporary radio and television content with the purpose of educating, informing, and entertaining the public through its archives, public programs, screenings, exhibits, publications and online access to its resources.

MBC also runs Museum.tv--which stores and delivers terabytes of digitized radio and television content. Currently, they are featuring a 1984 senatorial debate including Roland Burris--whom you may recognize as the senator just appointed to fill Barack Obama's vacancy (check out the protests at the start of the debate and how the moderator handles it).

Through the years, the primary challenge the not-for-profit MBC faced was the lack of resources to put together a storage infrastructure large enough to handle the massive amount of digital data it had and needed to present. At one point, when its single-server storage setup didn't fit the bill, it had to suspend its online services.

All told, MBC has more than 100,000 hours of content that it needs to store and distribute. Due to its size and lack of structured data, video remains inherently difficult and expensive to store. And let's not get started on reliability and security issues--achieving data reliability and security at the terabyte level using traditional storage methods based on replication requires significant hardware capacity at multiple sites.

When Cleversafe CEO and MBC member Chris Gladwin heard about this, he contacted MBC to introduce Cleversafe's Dispersed Storage technology, which could potentially solve the museum's problems.

Here's how Dispersed Storage works: instead of copying data, Cleversafe divides it into "slices" and disperses them across a secure network to different geographic locations. Each slice contains too little information to be useful on its own, but any subset of slices that meets a set threshold can be used to perfectly re-create the original data. Manageability? Yup. The combined size of all the slices is still smaller than that of multiple full copies of the original data.
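
To make the threshold idea concrete, here is a minimal Python sketch of a k-of-n dispersal scheme. It is not Cleversafe's algorithm. It is a generic textbook-style erasure code using polynomial interpolation modulo the prime 257, and the names `disperse` and `rebuild` are made up for illustration. It also does nothing to obscure the data (the first k slices hold the original bytes), so it shows only the "any k of n slices rebuild the data" property, not the security property. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
P = 257  # smallest prime above 255, so every byte value fits in the field


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through `points`, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def disperse(data: bytes, k: int, n: int) -> dict:
    """Split data into n slices (numbered 1..n); any k can rebuild it."""
    if not 1 <= k <= n < P:
        raise ValueError("need 1 <= k <= n < 257")
    padded = data + bytes(-len(data) % k)  # zero-pad to a multiple of k
    slices = {x: [] for x in range(1, n + 1)}
    for start in range(0, len(padded), k):
        block = padded[start:start + k]
        # The k bytes of a block are the polynomial's values at x = 1..k.
        points = list(zip(range(1, k + 1), block))
        for x in range(1, n + 1):
            value = block[x - 1] if x <= k else _interpolate(points, x)
            slices[x].append(value)  # redundant values may be 0..256
    return slices


def rebuild(available: dict, k: int, length: int) -> bytes:
    """Re-create the original data from any k surviving slices."""
    chosen = sorted(available.items())[:k]
    if len(chosen) < k:
        raise ValueError("not enough slices to meet the threshold")
    out = bytearray()
    for idx in range(len(chosen[0][1])):
        points = [(x, values[idx]) for x, values in chosen]
        out.extend(_interpolate(points, x) for x in range(1, k + 1))
    return bytes(out[:length])


if __name__ == "__main__":
    data = b"1984 senatorial debate"
    slices = disperse(data, k=3, n=5)
    survivors = {x: slices[x] for x in (2, 4, 5)}  # two slices lost
    print(rebuild(survivors, k=3, length=len(data)) == data)
```

With k=3 and n=5, each slice is about one-third the size of the original, so all five slices together take roughly 1.7 times the original size while tolerating the loss of any two. Keeping three full copies would take 3 times the size to tolerate the same two losses.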

One interesting sidebar to this--in addition to data storage, MBC also relies on Cleversafe for distribution instead of a separate content delivery network (CDN). When users view content on MBC's site, the data is pulled directly from Cleversafe and displayed via a media server in front of the Cleversafe hardware, saving MBC money and physical space without sacrificing performance or scalability for their end users.


About Software, Interrupted

In "Software, Interrupted," Dave Rosenberg discusses disruption in the software market, as well as the products and services that keep business technology norms in perpetual flux.

With nearly 15 years of technology and marketing experience spanning from Bell Labs to multiple start-up IPOs, Dave co-founded open-source software company MuleSource and now serves as general manager of Hardy Way. He also happens to be a U.S. patent holder and a workaholic. Technology is his best friend and mortal enemy.
