April 29, 2009 4:10 AM PDT

Is RAID storage living on borrowed time?

by Dave Rosenberg

The basic idea of RAID (Redundant Array of Inexpensive Disks) storage is to combine multiple small, cheap disk drives into an array that appears to the computer as a single logical storage unit and yields performance exceeding that of a SLED (Single Large Expensive Disk).

RAID offers many advantages over the use of single hard disks, including higher data security, fault tolerance, improved availability, and integrated capacity.

That said, RAID was invented more than 30 years ago and simply wasn't designed for the terabyte-scale systems that are commonplace today. In fact, at petabyte scale RAID is clearly beyond its design limits.

I discussed the limits of RAID via e-mail with Cleversafe CEO Chris Gladwin, and here's the problem as he sees it: RAID is mathematically reaching a breaking point for data reliability based on one-terabyte drives. RAID 6, based on parity, cannot recover from more than two simultaneous failures, or from two non-simultaneous failures plus an unrecoverable bit error. It also doesn't automatically protect data, which remains exposed to software, hardware, and user error.

Typical SATA drives have a published unrecoverable bit error rate (BER) of 1 in 10^14, meaning that roughly once in every 100,000,000,000,000 bits read, there will be a bit that is unrecoverable. Although this failure rate seems insignificant, 10^14 bits is only about 12.5 terabytes, so when reading 100 terabytes (8 x 10^14 bits) it is nearly certain there will be an unreadable bit, and if this read happens to be during a rebuild, data will be lost.
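The arithmetic behind that claim can be checked with a short sketch. It assumes bit errors are independent and occur at exactly the published rate, and it uses decimal terabytes (10^12 bytes); the function name `p_unrecoverable_read` is made up for this illustration, and real drives may behave better or worse than the spec sheet.

```python
import math

TB = 10**12  # decimal terabyte, in bytes (assumption)

def p_unrecoverable_read(bytes_read, ber=1e-14):
    """Probability of at least one unrecoverable bit error while reading
    bytes_read bytes, assuming independent errors at rate ber per bit."""
    bits = bytes_read * 8
    # 1 - (1 - ber)**bits, computed in a numerically stable way
    return -math.expm1(bits * math.log1p(-ber))

for tb in (1, 12.5, 100):
    print(f"{tb:>6} TB read: {p_unrecoverable_read(tb * TB):.4f}")
```

Under these assumptions, reading 12.5 terabytes (10^14 bits) hits an unreadable bit a little under two times in three, and reading 100 terabytes hits one almost every time.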

There are still applications that can utilize RAID for increased I/O performance. For example, using RAID for a high I/O transactional system would be a good fit. Also, smaller storage applications, for example a terabyte or below, could still use RAID effectively.

Data continues to grow exponentially. Market researcher IDC estimated that the digital universe reached 281 exabytes in 2007 and will grow tenfold by 2011. Enterprises in a number of industries, including media/entertainment, health care, and video surveillance, have already exceeded 100 terabytes of storage in use. Determining the appropriate long-term storage strategies for these industries will be a challenge as they realize the limitations of RAID.

The good news in addressing these data growth issues is the availability of low-cost processors and high-capacity drives. Combined, they provide great opportunities for disruptive innovations that will displace RAID.

Dave Rosenberg dishes up "Software, Interrupted" with nearly 15 years of technology and marketing experience that spans from Bell Labs to multiple start-up IPOs to open-source enterprise software companies. He is co-founder of MuleSource and currently serves as the general manager of Hardy Way. He is a member of the CNET Blog Network and is not an employee of CNET. Disclosure. You can contact Dave via e-mail at softwareinterrupted@gmail.com or follow him on Twitter @daveofdoom.
by amadensor April 29, 2009 6:21 AM PDT
I do not believe that the concept of RAID is over, but some of the algorithms used to do parity checking may have outlived their usefulness.
by Random_Walk April 29, 2009 11:29 AM PDT
Ditto. The concept of RAID itself is just that... a concept. The headline misleads just a tad there.

OTOH, sure - I can get that the parity algorithms are more than just a little outdated. I also get the concept that even in spite of on-the-fly de-duplication (and other data mass shrinking means), there's going to come a time when the amount of data being stored will exceed the ability of existing means to store it. Then again, it's not like we've not seen such a problem before.

I wonder if there's a corollary or a clone of sorts to Moore's Law, but only pertaining to storage?
by dcis_steve April 29, 2009 8:42 AM PDT
That's plenty about why RAID is going the way of the Dodo, but I was hoping to read more than two sentences about the new technologies that will replace it.... You left me hanging!
by JulieBella April 29, 2009 2:04 PM PDT
Dispersal is a viable method to displace RAID and replication. You can read Cleversafe's whitepaper comparing RAID and Dispersal along with models for when RAID 5 and 6 will fail, and why Dispersal continues to work into the petabyte range:
http://www.cleversafe.com/vision/download-whitepaper

Simply register, and download 2nd whitepaper.
by Seaspray0 April 29, 2009 2:31 PM PDT
Raid 6 isn't bad. Instead of 1 drive, you use 2 drives for parity in a raid. If two drives fail, you can still recover the data.
by danielwsmithee April 29, 2009 8:46 AM PDT
That's because there really is no technology available to replace it.
by odubtaig April 29, 2009 11:41 AM PDT
Follow the link.
by odubtaig April 29, 2009 11:58 AM PDT
Durr, that was meant to be in reply to Steve. Never mind.
by bonesonrong April 29, 2009 9:00 AM PDT
Wozniak had joined Fusion-io's advisory board last October to help the company ramp up adoption of its solid-state flash technology. Now as chief scientist he will "act as a key technical advisor to the Fusion-io research and development group" and help formulate a strategy to "accelerate the expansion of major global accounts," Fusion-io said in an announcement. A spokeswoman for Fusion-io said Wozniak's new position is full-time.
by bonesonrong September 21, 2009 2:14 PM PDT
http://messages.finance.yahoo.com/Stocks_%28A_to_Z%29/Stocks_B/threadview?m=tm&bn=73279&tid=1326&mid=1326&tof=1&frt=2 parting shots???
by bonesonrong April 29, 2009 9:05 AM PDT
Note: There is an easy means to reference OID descriptions from your web site or documents by way of the http://www.oid-info.com/get/ address followed by an OID (more in the FAQ).
by chrisfrary April 29, 2009 9:33 AM PDT
Exactly. RAID will continue on; the algorithms used will change, just like amadensor said. NAS can easily handle the processing required to correct the errors that will occur.
by Maarek Stele April 29, 2009 10:31 AM PDT
NAS has probably developed their own algorithm that replaces RAID, but will not distribute it since they don't want data traveling faster than they can monitor.
by odubtaig April 29, 2009 11:01 AM PDT
That's Network Attached Storage, not National Security Agency.
by bonesonrong May 1, 2009 2:29 PM PDT
Algorithms are far ahead of site use.
by Michichael April 29, 2009 11:16 AM PDT
Am I the only one amused by CONDOM ads on a site most people use for reviewing tech news at work? >.>
by odubtaig April 29, 2009 11:56 AM PDT
That is actually quite clever (I make no apologies).

What it looks like from here is that every file stored is split into equal chunks, and each chunk carries, e.g., 75% of the parity/checksum/compressed information required to rebuild the other chunks attached to it, so as long as three out of four chunks are available, the file is still recoverable.

These chunks are then distributed across a SAN which can be LAN or WAN based, the latter of which would be very useful in terms of offsite data recovery if you can afford the bandwidth.

A simple but very effective idea (of course, implementation's a different matter entirely).
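The "any three out of four chunks" idea can be sketched with plain XOR parity. This is only an illustration of the principle and not Cleversafe's actual dispersal algorithm, which is not described here; the function names `split_with_parity` and `recover` are invented for the example, and real systems use codes that tolerate more than one missing chunk.

```python
def split_with_parity(data: bytes, k: int = 3):
    """Split data into k equal-sized chunks plus one XOR parity chunk."""
    chunk_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]
    parity = bytes(chunk_len)
    for c in chunks:
        parity = bytes(a ^ b for a, b in zip(parity, c))
    return chunks + [parity], len(data)

def recover(chunks, original_len):
    """Rebuild the data when at most one chunk (marked None) is missing."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only rebuild one missing chunk")
    chunks = list(chunks)
    if missing:
        present = [c for c in chunks if c is not None]
        rebuilt = bytes(len(present[0]))
        # XOR of all surviving chunks equals the missing one
        for c in present:
            rebuilt = bytes(a ^ b for a, b in zip(rebuilt, c))
        chunks[missing[0]] = rebuilt
    return b"".join(chunks[:-1])[:original_len]

message = b"any three of four chunks will do"
pieces, length = split_with_parity(message)
pieces[1] = None  # pretend one site is unreachable
print(recover(pieces, length) == message)
```

With three data chunks and one parity chunk, the stored total is 4/3 of the original size, so the overhead stays well under the 200% that plain two-site replication would need.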
by bonesonrong May 1, 2009 3:55 PM PDT
Then there is BitTorrent
by odubtaig May 4, 2009 5:16 AM PDT
Bittorrent doesn't keep the entire replication bandwidth down to less than twice the amount of data to be stored, it still has to upload all the data for every site (6 sites, 600% storage and bandwidth utilisation). This distributes each file once across multiple sites.

While I fully expect them to put the best gloss on their own product with their claims of storage/bandwidth utilisation it's still not possible for storage/bandwidth to go over 200% no matter the number of sites.
by meh130 April 29, 2009 2:51 PM PDT
There are two problems here. One is data corruption caused by a bit error on a bad block. The other is a disk failure. RAID is designed to deal with disk failures, not bit errors. A RAID array might think all is well, and there could be corrupted data caused by a bad block. That is where a filesystem adds value. End-to-end data integrity using checksums has been implemented in filesystems (e.g., ZFS). RAID will continue as a means of dealing with disk failures. Terabyte-sized disks are an issue for RAID rebuild times, and RAID-6 may not be enough. Expect RAID-1+0 to return to deal with anything transactional, and RAID-6 to be used for archival and read-only storage. End-to-end data integrity using checksums will have to be built into storage systems, along with scrubbing algorithms to check disk integrity, intelligent migration, and disk aging monitoring.
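A minimal sketch of the checksum-and-scrub idea, assuming an in-memory dictionary stands in for the disk and SHA-256 stands in for whatever checksum a real storage system would use. The names `write_block` and `scrub` are hypothetical, and this is not how ZFS or any particular filesystem lays out its metadata.

```python
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def write_block(store: dict, sums: dict, key: str, data: bytes) -> None:
    """Store a block and record its checksum separately from the data."""
    store[key] = data
    sums[key] = checksum(data)

def scrub(store: dict, sums: dict) -> list:
    """Return the keys of blocks whose contents no longer match their checksum."""
    return sorted(k for k, d in store.items() if checksum(d) != sums[k])

store, sums = {}, {}
write_block(store, sums, "block-0", b"payroll records")
write_block(store, sums, "block-1", b"video frames")

# Simulate a silent single-bit flip that the array itself would not notice
damaged = bytearray(store["block-1"])
damaged[0] ^= 0x01
store["block-1"] = bytes(damaged)

print(scrub(store, sums))  # ['block-1']
```

A scrub like this only detects the damage; repairing it still needs a good copy or parity from somewhere else, which is where RAID or dispersal comes back in.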
by mulberrybush April 29, 2009 11:01 PM PDT
http://btrfs.wiki.kernel.org/index.php/Main_Page

A check-summing file-system designed for large data stores under development at the moment. Copy on write too...
by biffhenerson April 29, 2009 3:06 PM PDT
We need high-performance technology that will saturate the bus with data today and into the future. The solution has also got to be fault tolerant because our data is valuable. RAID has done a fair job at providing that service in the past. Thanks, RAID. Who knows what the future holds. I am anxious to try some RAM PCIx cards and perhaps the SLC solid state drive once they get the kinks worked out. NAS is a good idea when several computers are involved. There is so much wasted storage on an office full of desktops. And most of it is not being backed up. So perhaps some fancy RAM NAS box.
by ikramerica--2008 April 30, 2009 2:18 PM PDT
I think RAID is an effective way to increase the performance of SSD. Currently, it's used by Panasonic on P2 cards, and could easily be used in desktop computers, fitting two 2.5" SSDs into the space of one 3.5" drive in a RAID-0 via hardware fashion. The cost of doing redundant RAIDs with SSD may be too high, but in theory, using the right controller, the redundant drive could be a traditional spinning platter drive.

Again though, this is an end user usage of RAID, not a massive storage array usage, but RAID will continue to have its place in the end user world.
by bonesonrong May 1, 2009 2:30 PM PDT
I am more concerned about the infrastructure uses of that mass flash equivalent and GPS; actually, everything will change.
by idfubar May 12, 2009 10:31 PM PDT
Storage is hierarchical (slow/expansive/cheap at one end and fast/minimal/expensive at the other); RAID has its place regardless of the addition of zeroes to the numbers (it is, after all, parallelism).
by mscritsm July 27, 2009 11:23 PM PDT
If hard bit errors are approaching 1 in 100 TB, then disk drives themselves are destined to die as a technology. In a few years, a single disk drive will approach 100 TB of capacity, meaning the drive manufacturers will no longer be able to sell drives that are now guaranteed to come with at least one hard error on them at all times.

No, drive manufacturers will have to reduce hard bit errors even as they increase capacity or face extinction (in which case RAID becomes moot anyway). But if drive manufacturers do decrease the bit error rate in order to survive, the arguments against RAID based on bit errors will become invalid.

About Software, Interrupted

In "Software, Interrupted," Dave Rosenberg discusses disruption in the software market, as well as the products and services that keep business technology norms in perpetual flux.

With nearly 15 years of technology and marketing experience spanning from Bell Labs to multiple start-up IPOs, Dave co-founded open-source software company MuleSource and now serves as general manager of Hardy Way. He also happens to be a U.S. patent holder and a workaholic. Technology is his best friend and mortal enemy.
