August 13, 2009 1:31 PM PDT

How long is long-term storage?

by John Webster

There is a big disconnect between how long people think they should be storing data and how long they actually can. One group of vendors and academics is trying to change that.

Two years ago, the Storage Networking Industry Association's Data Management Forum reported the results of a landmark study that looked at the state of long-term storage, i.e. preserving a digital object for more than 10 years. Some disturbing results jumped out.

The study suggested that we live in a digital version of the Dark Ages. I'm talking about it now because I think the messages from the study are still very relevant to both IT administrators and consumers.

A whopping 80 percent of the 276 organizations included in the study reported a need to retain electronic records for more than 50 years, so let's start there. How many of you storage administrators out there actually think you can do 50 years of electronic records retention given current technology? Without data loss? OK, so you won't be doing the same job 50 years from now, so why care? Next question: How many of you think that you can do more than three migrations of archival data from one storage medium to the next without data loss? According to the study, the answer was very few of you.

Here's one for consumers: How many of you using Internet photo service sites think that your digitized images will still be there 50 years from now? You haven't thought about that, right? You and your spouse take pictures of the newborn today, you store them online, and maybe you store them at home, too. Here's a suggestion: Make sure to print them and preserve the prints for as long as you can. If the enterprise-level storage administrators who have been doing digital storage for decades have little confidence in their ability to do long-term digital preservation, you shouldn't have much confidence either.

So there's a big gap here. A group of concerned vendors and academic advisers has formed the 100 Year Archive Task Force under the auspices of the Storage Networking Industry Association's Data Management Forum and wants to start filling the gap. You can follow their progress or become involved yourself here.

One more result from the study still has me puzzled. Slightly more than half of the 276 organizations surveyed reported the need for "permanent" storage. What might fall into the permanent category? I thought of the Founding Fathers writing the U.S. Constitution and wondered what that process would have been like if they were all using a collaborative work-flow tool like Microsoft SharePoint. For sure, they'd print out the final version for all to see--on parchment maybe? But what about all the draft versions and messaging back and forth--in short, all the supporting documentation that clues us in on their state of mind and tells us what they really intended? Would they have printed out all of that, too? I dare say that insight would be gone forever.

We rarely, if ever, think of saving our digitized thoughts for the sake of posterity. But for the sake of historians, lawmakers, sociologists, and scientists yet to be born, we should--or people centuries from now might look back on this as the digital version of the Dark Ages.

John, a senior partner at Evaluator Group, has 30 years of experience in enterprise IT storage, spanning mainframe and open systems environments. He has served as principal IT adviser at Illuminata and has held analyst positions at IDC and Yankee Group Research. He also co-authored the book "Inescapable Data: Harnessing the Power of Convergence." John is a member of the CNET Blog Network and is not an employee of CNET.
27 Comments
by MyRightEye August 13, 2009 1:38 PM PDT
Holographic storage will solve this issue within 10 years.
by erock1974 August 13, 2009 6:24 PM PDT
how will yet another new storage medium permanently solve this problem?
by timber2005 August 13, 2009 8:16 PM PDT
Compact... no degradation... no overwriting.
by bluemist9999 August 14, 2009 8:10 AM PDT
Let's say it is compact, huge capacity, no degradation (ignoring the 2nd law of thermodynamics), no overwriting.

What about handling the constant changes in digital data formats? Even with perfect data preservation, could any modern programs read, say, a database stored on disk 40 years ago?

I'd argue that is a much larger problem than the finite lifespan of digital storage media.
by knowles2 August 14, 2009 2:25 PM PDT
bluemist9999, good point, but apparently the European Parliament, or someone there who likes to spend taxpayer money, has decided to start developing a universal program capable of opening any electronic file type, and they are planning to keep upgrading it as old formats die out and new ones replace them.
Also, there was something about keeping the hardware around, too.
by ikramerica--2008 August 13, 2009 2:00 PM PDT
Until then, transfer all stored data to a new medium every 5 years. Create a 6-month schedule to do this with all your data. Luckily, storage prices will have gone down eightfold by each 5-year refresh, so the cost of doing so is not nearly as much as the initial cost.
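
A rough sketch of the arithmetic behind that refresh plan, in Python. The 1,000-unit starting cost and the `refresh_costs` helper are hypothetical, and the model assumes the amount of data stays constant and ignores labor; only the 5-year interval and the eightfold price drop come from the comment above.

```python
def refresh_costs(initial_cost, years, refresh_every=5, price_drop=8.0):
    """Return the cost of each purchase: the initial one plus every refresh.

    Assumption: the same amount of data is re-bought at each refresh, and
    the price falls by `price_drop` between consecutive refreshes.
    """
    refreshes = years // refresh_every
    return [initial_cost / price_drop ** k for k in range(refreshes + 1)]


# Hypothetical starting cost of 1000 units, tracked over 50 years.
costs = refresh_costs(initial_cost=1000.0, years=50)
total = sum(costs)

print(f"purchases (initial + refreshes): {len(costs)}")
print(f"cost of first refresh: {costs[1]:.2f}")
print(f"total over 50 years: {total:.2f}")
```

Under those assumptions, the ten refreshes together add roughly 14 percent to the initial outlay, because the series approaches 8/7 of the first purchase. Growing data volumes or migration labor would change the picture.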
by ti99_forever August 13, 2009 2:13 PM PDT
I once mirrored two computers hard drives. One drive failed. Before it was replaced, so did the other. A coincidence like this can ruin your day...

I can attest to this being an important issue. Remember when CDs became all the rage? Store your "long term" data there... I've got quite a few CDs from that era, and many are unreadable, especially the "blue-tint" ones.

Nice. And my 5 1/4" disks from 1982 are still readable today.

Hmm.

And if I lose one of those, I lost one disk of data, typically 360k. Lose a hard drive, you lose everything on it. 50gb for my current computer. Lose a CD, you lose what, 700mb? All at once.
by Nataku4ca August 14, 2009 3:25 PM PDT
Just a note: that's why you make multiple copies and keep at least 3 of the latest ones, i.e. backup Aug 01, Aug 02, Aug 03.

You may lose a bit, but that's better than nothing, and the older the data, the better the chance of recovering it. At least I'm planning on going Blu-ray with this method: with 10 GB of pictures alone, plus a couple of other things, DVD started to look too small for the task.
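
The keep-the-latest-three scheme can be sketched in a few lines of Python. The `split_backups` helper and the sample dates are hypothetical; a real script would also have to delete or archive the pruned copies.

```python
from datetime import date


def split_backups(backup_dates, keep=3):
    """Return (kept, pruned): the `keep` newest dates and the older rest."""
    ordered = sorted(backup_dates, reverse=True)  # newest first
    return ordered[:keep], ordered[keep:]


# Hypothetical set of dated backups, listed in no particular order.
dates = [
    date(2009, 8, 1),
    date(2009, 7, 31),
    date(2009, 8, 3),
    date(2009, 8, 2),
]

kept, pruned = split_backups(dates)
print("keep: ", [d.isoformat() for d in kept])
print("prune:", [d.isoformat() for d in pruned])
```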
by HobbesDoo August 13, 2009 2:16 PM PDT
Any storage that's susceptible to damage due to the environment is not good enough. If we have any cataclysm that wipes out any information stored in electronic media we're back to the dark ages.

We need some media that's permanent once written.

Same situation would happen with paper documents in the case of floods.
by ti99_forever August 13, 2009 2:28 PM PDT
Very true. Consider how ancient documents that survived thousands of years were written.
Paper documents appear in fragments and require much work to salvage what can be saved.
Stone tablets were better.

Another issue is *what* to save. As the article alludes to, we have copies of the US Constitution, but not all the letters or dialogue exchanged that went into it. Is that important? I claim not as important as the end document.

Today, there seems to be a tendency to save *everything*. I'm being overloaded with information, from paper privacy notices to electronic info.
by sdf0013 August 13, 2009 2:51 PM PDT
The incremental data is a great question. I've wrestled with that one at the office for years. I always kept iterative or incremental works in case there was a problem. But I had never considered it in this manner. That'll get ya thinking for sure. That would certainly add to backup amount. Boy, try to write corporate policy on that.
by Hunnter2k3 August 13, 2009 3:08 PM PDT
>We need some media that's permanent once written.
There was actually a company who modded a writer to actually physically etch a pattern on to a round disc instead of just changing some chemicals.

Things such as this would be a godsend, but the actual burning part might need to go through some safety tests first, along with the whole process of dealing with any residue and by-products it creates.

And for the sake of being future proof (even after humans might be long gone), the data on the discs could be printed large at first, then get progressively smaller down to CD pit sizes.
There could also be some sort of pictogram that shows something like a magnifying glass that shows some letters larger.
Any race smart enough should understand there is hidden data on the drive.

The only problem is this data would be stored in something like binary, so unless some robots find it, I doubt they'd understand it.
And even then, they would have to have knowledge of ASCII/Unicode and hexadecimal to even know what it means. (and image formats, video, audio, etc)
Screw the future...
by EvanSei August 13, 2009 2:26 PM PDT
For this reason I store my important data on a portable hard drive, one of my laptops, and online, while really important data goes on all three of my computers, portable hard drive and online. And records someone could steal my identity with are never present in digital form on my machines.
by Jourdy288 August 13, 2009 2:48 PM PDT
I think that for long term storage, the Internet is a pretty safe bet, why not use it for something along the lines of its original purpose? Like ARPANET? Spread the info everywhere for quick retrieval?
by Random_Walk August 13, 2009 5:00 PM PDT
Yes and no.

Quick, what does www.beenz.com look like? Nine years (or so) ago, it was a huge to-do as a site for alternate forms of money. Whoopi Goldberg was on TV pimping the site for all she was worth.

Now? It's a parked domain with no original content. The best you can hope for is the Wayback Machine, which only holds partial content (whatever it was allowed to crawl). Here's that link from 2000:

http://web.archive.org/web/20001019084446/www.beenz.com/splash.html

Poke around in there awhile - you'll find half the images missing, none of the data working, and even the Flash games are broken.

Here's one even better - pets.com was a site that got bought by PetSmart, and it now contains none of its original documentation or content. Maybe PetSmart has it, maybe they don't... but if you had anything on it, you probably don't now.

:)
by amontoya6 August 13, 2009 3:30 PM PDT
Well... History has shown that written documents (papyrus, parchment, paper, ceramics, etching and others) are the best solution for the ages... (We can still read 6000 yr. old writings...), but we cannot read disk tapes, vinyl disks or even floppy disks and a few "recently" formatted CDs, and this is technology less than 20 years old...
Of course, electronic archiving saves space and resources, but with the continually changing technology (CD, DVD, DVDR, Blu-ray, etc.) we cannot be on the cutting edge of technology every time a change occurs... What shall we do?
by billd888 August 13, 2009 4:12 PM PDT
So what happens if we have a major EMP event? Pretty much all electronic media will stop working unless they have been shielded in some way. A company I know of has years' worth of tapes in offsite storage, but many of them cannot be read because the original drives are long gone; even the physical form of the tapes is no longer used.
by dunlap_michael August 13, 2009 4:46 PM PDT
Author wrote "Slightly more than half of the 276 organizations surveyed reported the need for "permanent" storage. What might fall into the permanent category?"

How about medical data such as imaging? Most diagnostic imaging has gone completely digital; instead of film, images are now stored on PACS servers, and this storage is considered indefinite, at least in the sense of the records not being purged.
by josephmartins August 14, 2009 12:21 AM PDT
Folks, your selection of a digital medium is important from a byte preservation perspective, but it has little to do with long term information preservation. Preserving bytes is not the same as preserving information.

I wrote about so-called "forward-compatibility" back in 2004:

http://www.datamobilitygroup.com/saltworks/archives/27#more-27

And it's still a problem today as you can see from John Webster's article.

I maintain that paper (esp. archival quality) is still one of the best forms of long-term storage for your most important personal documentation: treasured photos, birth certificates, marriage licenses, etc. Yes, it comes with its own issues, but it will not require nearly as much babysitting as it would in digital form.
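
The gap between preserving bytes and preserving information can be illustrated with a small Python sketch. The sample string and the choice of the two encodings are assumptions made for the demonstration: the same stored bytes read back as different text depending on which encoding the reader assumes.

```python
# The preserved bytes: the word "café" as written by a Latin-1 system.
raw = "caf\u00e9".encode("latin-1")

# Two later readers, each assuming a different encoding for the same bytes.
as_latin1 = raw.decode("latin-1")
as_cp437 = raw.decode("cp437")

print(raw.hex())   # the bytes themselves are intact
print(as_latin1)   # the text the writer intended
print(as_cp437)    # same bytes, different text

# Byte-for-byte preservation succeeded, yet the two readings disagree.
print(as_latin1 == as_cp437)
```

Nothing was lost at the storage level in this sketch; what went missing is the knowledge of how to interpret the bytes, which is the part a storage medium alone cannot carry forward.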
by ghaff August 14, 2009 9:47 AM PDT
Agreed. There are certainly ways that physical media can be lost or destroyed as well, but it's less likely to happen in an "oops" sort of way than in the case of digital. I certainly do my best to back up my photographs, for example--multiple on-site backups plus a network-based service (Mozy). However, although I didn't initially do it for this reason, I'm glad that both I and family members have a lot of my favorite photos printed out in a photobook. I would certainly hate to lose the original versions, but if worst came to worst, it's nice to know there's at least a reasonable-fidelity copy in analog form.
by jsjohnson August 14, 2009 7:24 AM PDT
Um, this issue is just now being dealt with. Researchers have developed a long-term (think 1000 years) medium that's available TODAY (well, in the next few months). Based on an obsidian type of ceramic material from what I recall, it's really pretty amazing. The next issue is to make the plastic live longer. You just don't get any better than carving into rocks. Now this wouldn't be quite as practical for enterprises since even the highest spec for Blu-ray is only 100GB, but I can't see why we couldn't create a large, maybe 10" disc for enterprise use (like 1TB). No question this technology is going to be very successful since National Archives is supporting it.

http://www.millenniata.com/index.html
by Heebee Jeebies August 14, 2009 8:43 AM PDT
I have been saying this for over a decade. When I got my first digital camera in 1990, I started to think that it was only a matter of time before things like photos, videos, music and the like were all computerized. I had hoped that since we had things like tape backup, CDs, DVDs and the like, those would be used for backup. But hard drive sizes started to far outpace the possible backup options, to the point that having one TB of storage and backing it up costs twice the price (one drive for use and one to back it up). It became clear that in less than 30 years we will have a massive loss of records: things like people's family photos, videos, music libraries and more. I started to wonder what, if anything, the industries will do about it.

I have always felt that the National Archive should be expanded and opened to the public, and that it should be their job, using taxpayer money, to back up and archive the data of the citizens of the United States. In a hundred years, give or take, that information will be important. Imagine if today was like Ancient Egypt 5,000 years ago, what it would be like to have all of the information today. Imagine 1,000 years from now, if our people could have access to all of today's digital files, what they would know and learn. This information must not be lost. If it is, then we might as well do what the ancient Egyptians did and chisel it into stone, because that, I fear, will last longer than our digital files, and there is something wrong when chiseled stone lasts longer than digital data. Stone degrades, digital doesn't.
by August 14, 2009 11:43 AM PDT
Yes, yes, yes. I have a box of 3.5" floppy diskettes from a Mac 128K with lots of cool stuff on it. No practical way to read them now, without cutting major backflips to find a working machine. Recall fondly that one of the peeps who introduced me to Mac waved around his newly purchased box of 10 disks @ 400K storage each saying "This is all the storage media I'll ever need!" (Giggle.)
by SergeM256 August 14, 2009 12:38 PM PDT
Having problems with 3.5"? Do you remember 8" and 5.25" floppies? I still have 5.25" floppies somewhere. I think they have floppy and tape drives somewhere in the Smithsonian.
by fdunn3 August 14, 2009 12:01 PM PDT
Since stone tablets survived from BC, why not use a straightforward well etching of standard alphanumeric characters on silicon at, say, 80 nm? Sure, it will take microscopy to view it, but it will still be there. And I'm not talking about making it smaller than the head of a pin, but maybe the size of a standard business card.

Given a standard format, this could even be read electronically by optical devices with OCR capabilities.

If you really wanted it to last longer then encapsulate it with a quartz window.
by skyscraperjim August 17, 2009 10:02 AM PDT
Books and microfilm are still the best solutions for long term data storage. They can last well over 100 years and they do not require electricity.
by fazalmajid September 22, 2009 5:22 PM PDT
The limiting factor for the longevity of data is not technical, it is curatorial. Data can survive storage technology and even format changes (with some loss of fidelity, e.g. downconverting word processing formats to plain text) but it has a hard time recovering from the turnover of people who maintained it.

In most cases, this is a good thing. There are 150,000 books published each year in the US alone. I doubt more than a handful a year are truly worth preserving for posterity. The sooner the dross gets purged, the less it will clutter future efforts to unearth useful literature.