At Storage Networking World in Phoenix this week, there was a buzz in the hallways and over breakfast tables about the T-Mobile Sidekick outage that was due, according to Microsoft, to "a system failure that created data loss in the core database and the back-up." And why not? There are about 800 enterprise-level storage administrators here. The backup process is squarely in their space as is data recovery and data integrity. Some of their colleagues and some vendors represented on the show floor were at Sidekick ground zero pulling data from the wreckage.
SNW attendees knew that there were many fingers pointing in many different directions over this and my finger shan't be one. However, I will go out on a limb and say that my understanding of the situation is that it was not a result of sabotage as was once rumored. Rather it was due to failure of two coincidental processes, in this case a data migration failure that was preceded by a backup failure.
Microsoft now says that little if any data was actually lost. T-Mobile Sidekicks are being restored and all or almost all will be made right again. Life will go on normally as if nothing really happened.
Really? Put yourself in the shoes of a Sidekick user for a minute or two. Do you know where your data is and I mean all of it? And, to borrow a line from an old Dustin Hoffman movie: is it safe?
Take an inventory. You have data on your desktop, laptop, Palm device, smartphone, entertainment center, home network...Then ask yourself: how much of this data could you lose without caring whether you ever used it again? Certainly some, perhaps a lot of it, will fall into the data dumpster category. But the T-Mobile scare is yet another reminder that each of us owns data that has become critical to our daily activities. Could you function if someone grabbed your smartphone and ran away? For an increasing number of us, the answer is yes, but with ever greater difficulty. Some other data about us, our medical records for example, are life-critical.
Next, try to figure out how much of that critical data you actually have control over and then back it up. Immediately. Don't trust others to do it for you. Take control and make copies locally and/or using one of the many online backup services.
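For readers who want to act on the "make copies locally" advice, here is a minimal sketch of copying a file and then checking that the copy actually matches the original. The function names `sha256_of` and `backup_file` and the directory layout are hypothetical, chosen only for illustration; this is one simple way to do it, not a substitute for a real backup product.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and verify the copy byte for byte.

    Hypothetical helper: raises OSError if the copy's checksum differs
    from the original's, so a silent bad copy is not mistaken for a backup.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies contents plus timestamps
    if sha256_of(source) != sha256_of(target):
        raise OSError(f"backup of {source} does not match the original")
    return target
```

The checksum comparison is the point of the sketch: a backup you have never verified is, as the Sidekick episode suggests, only a hope.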
As one of my Twitter compatriots SEPATONjay observed over breakfast this week, if the service level agreement between T-Mobile and Microsoft couldn't prevent this failure, how good are the SLAs between the rest of us consumers and our service providers? Take an inventory of the service providers that hold your data, then read their contracts (assuming you can find them). I'll bet all of them indemnify the vendor against the loss of your data. If you can't protect that data, don't assume they will. Use a service that offers you a way to protect the data you deem critical.
Think your patient record is beyond your control? News item: you own your patient record no matter what your health care provider might say to the contrary. Get a copy and keep that copy up to date. I'm even going to go so far as to suggest that sometime in the next five years you have your genome sequenced. Store a copy of that in a safe place as well.
We are the mistresses and masters of our data domains. We can cry foul when someone else loses our data. We can even sue. But when data is destroyed--as in gone forever--no outcry no matter how loud will get it back. Protect it and win applause from the storage administrators who assembled here in Phoenix this week.
For the last two decades, RAID (redundant array of inexpensive disks) controllers have ruled the storage world. RAID has been required for data protection in disk arrays. RAID schemes (RAID 0, 1, 6, 10, etc.) reside on RAID controllers baked into disk arrays with many billions sold to date. But perhaps more important from the standpoint of making money, the RAID controller has also delivered differentiated value for storage vendors. Data copy and migration, snapshot, deduplication, and the list of controller-based functions goes on--all have been loaded onto the RAID controller.
It's becoming increasingly clear that the traditional RAID controller is coming to the end of its life cycle, at least within the enterprise data center. Types of applications now common to the Web 2.0 community are now populating the enterprise data center--applications that require scalability into the petabyte range. Traditional RAID controllers start to show their shortcomings at this scale level. Drive rebuild times elongate to the point where RAID data protection is no longer protection.
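To see why rebuild times stretch as drives grow, a rough back-of-the-envelope calculation helps. The sketch below assumes the rebuild is limited only by how fast data can be written to the replacement drive; the drive sizes and the 50 MB/s rebuild rate are illustrative assumptions, not measurements from any particular array, and `rebuild_hours` is a hypothetical helper name.

```python
def rebuild_hours(capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Hours needed to rewrite a whole drive at a sustained rebuild rate.

    Assumes decimal units (1 TB = 1,000,000 MB) and that nothing but the
    replacement drive's write throughput limits the rebuild.
    """
    if rebuild_mb_per_s <= 0:
        raise ValueError("rebuild rate must be positive")
    seconds = capacity_tb * 1_000_000 / rebuild_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical drive sizes at a hypothetical 50 MB/s effective rebuild rate
    for capacity_tb in (0.5, 1.0, 2.0):
        hours = rebuild_hours(capacity_tb, 50)
        print(f"{capacity_tb} TB drive: about {hours:.1f} hours to rebuild")
```

Under these assumptions the rebuild window grows in step with drive capacity, and in practice it tends to be longer still because the array keeps serving production I/O during the rebuild. The longer that window, the greater the chance of a second failure before protection is restored.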
We can argue (and I have) over how much longer the RAID controller will survive. For sure, it's nowhere near dead and will continue on as the workhorse of the storage industry for some time. But its shortcomings are becoming increasingly obvious and are driving the creation of the next generation of storage devices. Indeed one of those devices is no "device" at all. Rather, it's software running on a collection of commodity servers and server-attached disk, both traditional and solid state disk. Think of this new "device" as software-defined storage where all of the functionality is defined and delivered in software. So as a user, when you buy a software-defined storage device, you're simply buying code. What you run it on is up to you.
MaxiScale is an interesting example of software-defined storage. MaxiScale's FLEX storage platform runs on standard servers with SATA disk, and uses standard Ethernet interconnections. It is implemented as clustered nodes--servers plus disk. I/O performance and capacity scale linearly as processing nodes and disk drives are added to the cluster.
So the storage value-delivery model is decidedly different here. You as the user buy software and essentially roll your own array. But what else is different here? First, while the RAID controller is gone, the absolute requirement to preserve data is not. Data protection is also implemented in software.
Second, the system assumes that individual nodes within the cluster will go off line or fail for one reason or another. That's OK. The FLEX storage cluster continues to function, perhaps at some degraded state for some period of time until the full cluster is restored. But the point is that once you power up the cluster, you can keep it running for years--decades if you want. Hardware is added and replaced without disruption. Software is upgraded without disruption. It's perpetual storage.
Third, FLEX is an expression of the state of the art in single or global namespace file system technology. It's this core technology that delivers the value-added storage services rather than the RAID controller.
MaxiScale is not alone in this emerging space. Other software-defined storage solutions include ParaScale's cloud storage software and Symantec's FileStore. Other traditional hardware and software players will follow with software-defined storage offerings in the coming months. Include database vendors in this space as well. Some will position their solutions as cloud storage, others as data protection and archival storage.
Will software-defined storage replace traditional RAID storage? Not immediately. Not dramatically. But to me a new model is emerging. Scalability, hardware independence, and system longevity are the more compelling features when compared to traditional RAID-based storage arrays. But perhaps the most compelling feature will be an ability to buy big array performance and scalability at a fraction of the cost of big array RAID.
For a storage guy, last week's VMworld 2009 in San Francisco was a great show. All the familiar storage vendors were there and then some. Walking the show floor, I found them to be uniformly positive about traffic and the response they were getting from attendees.
Digging a bit deeper I found that storage vendors were getting attention from a broad range of IT specialists including server, network, architecture, and of course, storage administrators.
Wait a minute. VMworld isn't supposed to be a storage show. And yet storage vendors were, in general, more impressed with VMworld 2009 than with many of the storage-focused shows they have attended in the recent past.
Server virtualization is now reordering the IT landscape, and the ground storage vendors have stood on for years is moving under their feet.
At varying levels, storage vendors feel the motion. They know the server virtualization thing is a huge opportunity. Said another way, they fear that they could eventually disappear if they don't position themselves properly in the eyes of IT buyers now driving toward near complete if not total virtualization of the enterprise IT function.
Decades ago, storage was a mere peripheral, a feature of the server as Scott McNealy once famously quipped. But as he made that pronouncement, storage was getting connected to its own network and creeping out from behind the shadow of the server into a limelight all its own. EMC perhaps said it best: Storage--Where Information Lives.
Now that networked storage is mature, the ground is moving once again. Data and storage management is heading back toward the server running VMware. Data replication and storage provisioning functions are now features of the VMware server with more to come.
Beyond IT architecture, the architecture of the virtualized IT operations department is undergoing perhaps an even more profound change. The boundaries that once defined operational "silos"--server, network, and storage administration--are breaking down as vCenter becomes the focal point for VMware-managed IT. Hence, storage vendors here at VMworld 2009 get visitors from all walks of VMware operational life.
What am I taking away from VMworld 2009?
- VMware needs to resist playing favorites with storage vendors, especially the one that owns them. To VMware's credit, I saw much evidence that they get this imperative. VMware is democratically exposing APIs and vCenter plug-in opportunities to any storage vendor that wants to use them.
- VMware administrators will be increasingly challenged to choose between data and storage management functions that reside on the VMware platform or live within the storage environment. Larger IT environments will likely settle on a combination of both. Smaller shops may well opt to manage data and storage from the vantage point of the VMware platform.
- Storage-focused shows may no longer be able to support themselves. The nature of the storage buyer is changing. The nature of the storage environment is changing. Both are becoming more diverse and less narrowly focused on issues that only pertain to storage.
VMworld 2010 will likely be another great storage show.
NetApp's new CEO is Tom Georgens. Georgens steps in as Dan Warmenhoven, NetApp's CEO since 1994, moves on to the position of chairman of the board and a partnership development role under the direction of Georgens.
Warmenhoven's accomplishments were many, but he may be remembered most for turning the small niche-market opportunity that NAS once was as a dedicated file server attached to a LAN into the major networked storage platform NAS has become. Along the way, he built NetApp up to a 3.4 billion dollar company with 8,000-plus employees focused on storage.
NetApp co-founder Dave Hitz tells us that one of Warmenhoven's personal goals has been to retire at age 60. He's one year away from that milestone. Rather than continue to lead NetApp into a new phase that will be focused on scalable NAS, virtualization, and cloud computing, Warmenhoven has decided that Georgens' time has come.
Georgens' storage roots go back to the early to mid-1990s at EMC, where he was tapped to develop a midrange storage product to complement the Symmetrix line and exploit the growing Windows storage opportunity. That project was torpedoed internally, and Georgens went on to take on the storage business at LSI. EMC subsequently bought Data General, jettisoned DG's server business, but propelled Clariion to its current position of dominance in the midrange.
At LSI, Georgens surrounded himself with some very able executives who helped him establish the Engenio storage brand as the dominant OEM storage play, selling to the likes of IBM, STK, and Sun. He attempted to take Engenio public, but pulled back when both he and the executives at LSI decided that they couldn't get what they believed to be the true value of Engenio via an IPO. Not long thereafter, NetApp came calling. Georgens stepped in and later took on the position of COO, a move many analysts interpreted as one that placed him next in line for the CEO spot.
Now is a pivotal time in NetApp's history. NetApp has successfully transitioned from NAS-only to a broader range of storage and data management software products. And it is the only major independent and publicly held storage company left standing. STK was acquired by Sun. EMC has diversified to the point where it now calls itself an IT infrastructure player. That singular position in the eyes of some makes NetApp a takeover target. Here's why I think a takeover of NetApp is now less likely.
Georgens hates to lose. Selling out now would be tantamount to losing.
How do I know? This may sound a bit odd but Georgens and I both participate in a not well-known activity called radiosport. Radiosport is practiced by ham radio operators worldwide. On certain weekends during the year, ham radio contestants try to make as many contacts with other hams in as many countries as they can during a 48-hour period. I do it because I've been a ham since my teen years and it's still fun to copy Morse code at something like 35 words per minute. Georgens probably enjoys this, too, but he's in radiosport to take all the marbles. Unlike me, Georgens is a world-class competitor. He has won numerous worldwide competitions, often from a station on the island of Barbados, and holds several North American records. In addition, he has represented the United States in the World Radiosport Team Championships.
So what, you say? Try to send and receive high-speed code for 48 hours with only occasional short breaks and maybe an hour of sleep in between. It takes dedication and an absolute desire to win to match Georgens' achievements.
Georgens didn't go to NetApp to sell the company. He went, I believe, because he wanted to continue on NetApp's growth trajectory established years ago by Warmenhoven, Tom Mendoza, and Hitz. Selling would be letting someone else win. That's not in character for Georgens.
There is a big disconnect between how long people think they should be storing data and how long they actually can. One group of vendors and academics is trying to change that.
Two years ago, the Storage Networking Industry Association's Data Management Forum reported the results of a landmark study that looked at the state of long-term storage, i.e. preserving a digital object for more than 10 years. Some disturbing results jumped out.
The study suggested that we live in a digital version of the Dark Ages. I'm talking about it now because I think the messages from the study are still very relevant to both IT administrators and consumers.
A whopping 80 percent of the 276 organizations included in the study reported a need to retain electronic records for more than 50 years, so let's start there. How many of you storage administrators out there actually think you can do 50 years of electronic records retention given current technology? Without data loss? OK, so you won't be doing the same job 50 years from now, so why care? Next question: How many of you think that you can do more than three migrations of archival data from one storage medium to the next without data loss? According to the study, the answer was very few of you.
Here's one for consumers: How many of you using Internet photo services sites think that your digitized images will still be there 50 years from now? You haven't thought about that, right? You and your spouse take pictures of the newborn today, you store them online, and maybe you store them at home, too. Here's a suggestion: make sure to print them and preserve the prints for as long as you can because if the enterprise-level storage administrators who have been doing digital storage for decades have little confidence in their ability to do long-term digital preservation, you shouldn't have much confidence either.
So there's a big gap here. A group of concerned vendors and academic advisers has formed the 100 Year Archive Task Force under the auspices of the Storage Networking Industry Association's Data Management Forum, and it wants to start filling the gap. You can follow their progress or become involved yourself here.
One more result from the study still has me puzzled. Slightly more than half of the 276 organizations surveyed reported the need for "permanent" storage. What might fall into the permanent category? I thought of the Founding Fathers writing the U.S. Constitution and wondered what that process would have been like if they were all using a collaborative work-flow tool like Microsoft SharePoint. For sure, they'd print out the final version for all to see--on parchment maybe? But what about all the draft versions and messaging back and forth--in short, all the supporting documentation that clues us in on their state of mind and tells us what they really intended? Would they have printed out all of that, too? I dare say that insight would be gone forever.
We rarely, if ever, think of saving our digitized thoughts for the sake of posterity. But for the sake of historians, lawmakers, sociologists, and scientists yet to be born, we should--or people centuries from now might look back on this as the digital version of the Dark Ages.
During the last two weeks we saw two acquisitions of relatively small purveyors of scalable file systems by big storage players. First, HP finally pulled in its partner IBRIX. Only days later, LSI made a surprise acquisition of ONStor. If both IBRIX and ONStor offer platforms upon which one can build scalable network attached storage (NAS), do these back-to-back deals indicate some sort of emerging trend? Yes and no. Yes, in that if you're a major NAS vendor and want to compete with NetApp, which is readying GX8, scalability is now a must-have. No, in that IBRIX extends capabilities HP already has, whereas for LSI, ONStor represents its first-ever venture into the NAS world.
Amazing is the amount of blogosphere and Twitter chatter that was generated by HP's announcement that it intended to acquire IBRIX for an undisclosed sum. No offense IBRIX people (all 53 of you), but you're not exactly a household name. It looks like HP is about to make you one however. And yes, you deserve all the attention you are getting, finally. You had a "next-gen" parallel file system before many knew they would even need one. You knew that Big Data users needed a file system that was system-agnostic and that would scale to the petabyte range. At the time however, they were in a niche-y place called high-performance computing (HPC). Now, Big Data users are cropping up everywhere. You count AOL, Caterpillar, Dreamworks, JP Morgan Chase, and Pixar among your 175 customers. Who knows where this cloud thing will take you.
Big is a relative term. In the storage world, what is big today will be table stakes tomorrow. The petabyte-scale file system is becoming a must-have for storage vendors. NetApp bought Spinnaker a while back. Sun developed ZFS. IBM has GPFS, and HP bought PolyServe two years ago but has chosen to position it in the Windows SQL Server space where it gets the most traction. IBRIX, with its many performance and data management capabilities, represents a much larger market opportunity to HP. And LSI has chosen to enter the NAS market as scalable from the get-go.
IBRIX is headquartered in what was once a Honeywell Bull facility in Billerica, Mass. When they appeared in 2000 with a unique parallel file system called Fusion, the question was how to bring this to market? Who buys a parallel, scalable file system when file systems normally come bundled with or embedded in something else? IBRIX answered that question by forming remarketing relationships with big names: Dell, EMC, HP, and IBM, who bundled/embedded IBRIX with their servers and storage. Dell and EMC packaged Fusion with PowerEdge servers and Clariion storage, presenting the package to high-performance computing (HPC) customers. HP embedded Fusion in HP Blade and ProLiant server racks.
So what exactly does HP have planned for IBRIX? According to HP's Paul Perez, "HP will put the U in unified storage." OK, but that's a bit cryptic. Short term, HP will keep on keepin'-on with blade server/blade storage and scalable ProLiant/IBRIX NAS implementations. Longer term we may well see HP use IBRIX to approach cloud computing and archival storage opportunities.
Unified storage with a capital "U" is a bit more of a challenge to understand. Typically the term has been applied to disk arrays that support Fibre Channel and Ethernet connectivity. HP likely means that kind of unification plus something more. IBRIX is typically used by its partners to create scale-out NAS subsystems using Fusion as the software engine that powers a NAS platform consisting of industry standard servers as the NAS front end, and SAN or direct-attached (DAS) RAID storage on the backend. As such, the combination presents scalable file storage to applications but uses block-based SAN or DAS storage. NAS is typically characterized as file storage, while SAN is block storage. It's a distinction that traditionally has had many application implications and ramifications. What HP's big U for Unified message may also be signaling is the introduction of a file/block converged storage product bundled with new hardware form factors sometime in the near future. For HP that likely means some combination of HP StorageWorks SANs, ProLiant rack-mount and blade servers, and ProCurve Ethernet switches powered by Fusion.
It's interesting that HP has chosen to announce a marriage now. After all, they've been dating for at least four years. But NetApp, after fussing like forever with the scalable file system it acquired from Spinnaker, is finally ready to go mainstream with it as ONTAP GX8. IBM is making more noise about GPFS. Then there's ZFS and its new owner--Oracle.
Which brings us to LSI and ONStor. LSI's Engenio Storage Group wasn't in NAS until now. It is in RAID arrays and storage virtualization. Now it's in scale-out NAS and NAS/SAN gateways too. Why? LSI/Engenio sells exclusively through original equipment manufacturers. IBM is a major reseller as is (was?) Sun. Dell is also in the mix. IBM's DS3000, 4000, and 5000 series arrays are all originally produced by LSI/Engenio.
But there is much repositioning going on among the big IT vendors these days. The future of Sun's hardware business is still a matter of debate in spite of Larry Ellison's assurance that Oracle will sell hardware too, and Dell is on record as in the hunt for companies worth buying. Is another storage acquisition possible for them? I think so. As a result, the number of large OEMs that LSI/Engenio can sell through is not growing and the future is unclear with regard to OEM sales of traditional RAID arrays via the big names in IT.
So NAS to LSI/Engenio represents new growth and possibly substantial growth if they can compete effectively. HP's acquisition of IBRIX potentially leaves something of a hole in the scale-out NAS product lines of Dell, IBM, and EMC that LSI/Engenio could fill. Dell and SGI may also need NAS/SAN gateways. IBM might like to have a second source for the NAS boxes they get from NetApp because ONStor is both scale-out and scale-up. And let's not count out Sun/Oracle either. Whereas IBRIX had established prominence in HPC, ONStor went after more mainstream applications and could be a better fit in that space for the big OEM partners.
So put LSI/Engenio on the list of buyers looking for things to buy as opposed to the other way around. They didn't just suddenly decide to write a $25 million check to the owners of ONStor after learning of HP/IBRIX. Their executives have assured me that they've been looking to add NAS to the portfolio for months. Market conditions and a desire to make a "we're here to stay" statement drove the timing of the ONStor deal.
Suddenly a somewhat dormant space in the storage world is erupting with activity. Why? The Big Data apps are here and they generate big system opportunities as well as big Unified storage opportunities.
IBM's storage group is now in the habit of making smorgasbord announcements. They'll take a look at their storage lineup--one that includes everything from SSD to tape, storage-related software and services--select the new stuff going on within each product development cycle they think is significant and therefore want to publicize, then bundle all of these separate announcements up in a wrapper ("Information Infrastructure" begets "Smart Planet")--and step up to the microphone.
And so it is with IBM's most recent storage table selection. They're now offering replication and deduplication for ProtecTIER, faster hardware and SSD support for SAN Volume Controller, Thin Provisioning for DS8000 arrays, a new version of Tivoli Storage Manager, and numerous enhancements for XIV storage. Don't get me wrong. I'm not trying to belittle what they're doing. I would, however, like to observe that some of the more significant and interesting things can tend to get lost in the shuffle. XIV is a case in point.
XIV could well be a piece of computer history in the making because its guiding light, when at EMC, once took on and beat IBM at its own game. XIV was founded in 2002 and emerged in 2005 with its first product called Nextra. It is an Israel-based start-up and the place where Moshe Yanai landed after he left EMC. Yanai, known in the storage industry by just his first name ("mo-shay"), is the father of EMC's Symmetrix/DMX, the longest running disk array product family ever. He and his team created the MOSAIC 2000 storage architecture, which allowed EMC to update Symmetrix' disk, controller, and connectivity technologies more or less independently of one another. MOSAIC 2000 helped in a big way to establish the financial foundation that supported EMC's future expansions into content management, security, and virtualization.
As mentioned, XIV announced its first product, Nextra, four years ago--a next-generation disk array composed of clustered storage nodes. Shortly thereafter, XIV--and Moshe--were acquired by IBM. So here we have the father of Symmetrix, a product that allowed EMC to supplant IBM as the king of enterprise storage, now carrying the banner for the IBM storage team. Could EMC's former benefactor and acknowledged storage maven now become its biggest enterprise storage headache? Quite possibly.
By all accounts, Moshe doesn't kid around. Lore has it that he was once (and may still be) equipped with an Israeli fighter jet, and that after EMC bought the Clariion array along with the rest of Data General, which he saw as an internal competitor, he and his team built a small Symmetrix, stood it up outside his office door, and attached a sign to it that read "Clariion killer." Lore also has it that his departure from EMC was literally an executive office glass-shattering event. In fact, there is a whole body of Moshe lore known mostly to storage industry cognoscenti--stories traded over beers. Who knows how much of it is fact? But one thing we all agree on: when the world of enterprise storage feels like a snake pit as it often does, you want Moshe on your side. Now he's on IBM's side.
IBM's acquisition of XIV raised more than a few industry eyebrows. A fiercely independent storage genius goes to work for the Big Blue marketing machine known as IBM? A clash of titans could be in the making. Well, so far so good. IBM dropped the Nextra label in favor of calling both the product and the company XIV, but has allowed XIV to field its own salesforce, as well as manage its own R&D budget and product development efforts. IBM is also promoting the XIV brand as an enterprise storage play, in spite of the fact that it also has its own internally-developed enterprise storage array line, the DS8000 series. IBM also allowed XIV to announce that it recently sold its 1,000th array, and that many of its new customers are former EMC Symmetrix/DMX customers.
Now that IBM has two enterprise disk arrays in the product portfolio, and two sales teams selling enterprise arrays to the same big systems customers, one could well wonder how IBM will differentiate going forward. Look to future announcements for clues. When the XIV acquisition was announced to storage analysts, IBM positioned XIV in "Web 2.0 storage"--that is, as something distinct from traditional data center storage where the DS8000 lived.
Well, ahem. Guys, nice try. We know a bit about Moshe and we don't think he's about to confine himself to a market subsegment. We now note that the most recent XIV announcement drops the Web 2.0 distinction and moves the DS8000 closer to the System z mainframe world--a place where XIV doesn't play because it doesn't support CKD disk formatting. It's not that XIV's engineers don't know how to do that. Moshe's Symmetrix started life as a mainframe-attached box. But there have to be some distinctions going forward. IBM mainframe customers get the DS8000 exclusively for now, but maybe not forever. And IBM is rumored to have one more DS8000 model to release later this year.
Watch future announcements for subtle shifts in messaging, though, as IBM transitions from DS8000 to XIV as its flagship enterprise storage array. The DS8000 is now called the "flagship mainframe array," while the XIV array has been promoted as "next generation storage" in IBM's most recent storage smorgasbord.
One more piece of Moshe lore--the name of the company XIV or the Roman numeral fourteen? What's the significance? Ask around and you may get conflicting answers. That's the nature of a legendary figure. The one I've settled on is this one: XIV stands for the fourteenth graduating class of Talpiot, an elite Israel Defense Forces training program, of which Moshe and three other XIV executives were members.
As yet, Twitter is likely not on anyone's list of the top 10 most-critical applications. But has the U.S. government given Twitter a big push toward critical application status? This week the U.S. Department of State told Twitter it could not shut down for system maintenance because it had become a lifeline for thousands of protesters in Iran.
That should change the way IT vendors (particularly infrastructure vendors) view social-networking sites such as Twitter, Facebook, YouTube, etc.
Generally speaking, social-networking sites offer no guarantees to users. You post your content, you take your chances. And, while there is no sign at the entrance that says "Caution. You are about to enter a service-level-free zone," the sign is virtually there.
To infrastructure vendors, that spells commodity play. They look at the social-networking providers as great places to earn their Web 2.0 stripes, but tough places to make money. So, they are most likely to sell them least common denominator servers and storage--no frills, no value adds. In storage, for instance, they may only be asked to supply JBOD (just a bunch of disks) storage without RAID-based data protection, snapshots, or other quality of service enhancers. But that's OK they figure, no guarantees equals no risk.
Time out guys, there is risk. Think back 10 years ago today. eBay outages were in the news on an almost daily basis and Sun Microsystems wound up wearing the blame. Yes eBay was charting territory in a brave new world and therefore offered no service or availability guarantees. And yes there were more vendors in the mix at the time (Oracle and Veritas to name two). But the outages were very visible and Sun's image suffered disproportionately. While not explicitly stated, eBay users nonetheless had an implicit expectation of quality of service from eBay, a level that was never formally agreed to, but understood and expected.
Fast forward 10 years. Twitter is in uncharted territory, too. The temporary and periodic "system busy" messages are tolerated by users, but not without complaint. Jokes about Twitter's Fail Whale are common. Hey, it's not a critical app. We're all just having fun here, right? However, the elections in Iran have changed that perception. Twitter and other social-networking sites have become windows on a pivotal event with worldwide implications. The world wants to watch. Indeed, what the State Department's request says is that the whole world needs to watch. As an infrastructure vendor in this new and uncharted environment, do you now want to be blamed for an outage? For data loss? For a security breach?
This all adds up to the Twitter Conundrum. The owners of Twitter and other social-networking sites aren't likely to buy highly available, highly secure, redundant systems and storage of the type common to 24-by-7 production data centers. Their business models simply won't support big enterprise gear. But does that stop the federal government from stepping in and saying "sorry, you can't go down right now, not even for a few hours"? No. Twitter, YouTube, and Facebook have created windows on the world, windows that could in fact change the world for the better. You can't fail (whale).
Here's the conundrum: No one presently pays a fee for posting to these sites. You get what you pay for or, in this case, you don't get what you don't pay for. You don't pay for and therefore don't get guaranteed availability or data integrity. Is the federal government now willing to subsidize Twitter so that it can function like a production data center? Probably not. Are users willing to pay a fee to get a guaranteed level of service? Again, probably not, at least not in the near future.
Owners of the social-networking sites have managed this conundrum by rolling their own. They get cheap, or even better, free infrastructure and make it work. The power implicit in what they do with the scarcest of resources is truly awesome. Now, as their sites become embedded in the fabric of society, can they keep that model going? Perhaps, but they will likely need our help. Remember, e-mail was once a frivolous application.
Much drama has ensued since NetApp announced the intended acquisition of Data Domain on May 20 for the whopping sum of $1.5 billion.
EMC countered with a $30-per-share offer valued at $1.8 billion. NetApp then raised its offer to $30 a share, valued at $1.9 billion. Data Domain essentially said, "Thank you, EMC, but we like the new NetApp offer more than yours." EMC then claimed that it had been unfairly shut out of the bidding process and appealed directly to Data Domain employees.
NetApp countered with a claim that EMC's potential acquisition of Data Domain would fail a federal regulatory review, a claim that EMC has rebutted as it considers shoveling more cash into the fire to make its proposal more attractive.
To its suitors, Data Domain is now reportedly worth $1.9 billion. To give you some perspective on that figure, Oracle recently agreed to acquire Sun Microsystems for $7.4 billion. A $1.9 billion acquisition would mean that Data Domain is now worth about 26 percent of that number, yet its 2008 revenues of $274 million are a tiny fraction of the $13 billion Sun took in sales revenue during 2008. Here's another relevant data point: EMC acquired VMware for a mere $635 million.
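For readers who want to check the math, here is a quick back-of-the-envelope sketch using only the figures quoted above. The variable names and the `price_to_revenue` helper are mine, and a price-to-revenue multiple is just one rough lens, not a formal valuation.

```python
# Figures as quoted in the text, in billions of U.S. dollars.
DATA_DOMAIN_PRICE = 1.9        # latest reported offer
DATA_DOMAIN_REVENUE = 0.274    # 2008 revenue
SUN_PRICE = 7.4                # Oracle's agreed price
SUN_REVENUE = 13.0             # 2008 sales revenue


def price_to_revenue(price, revenue):
    """Return how many dollars of purchase price are paid per dollar of annual revenue."""
    return price / revenue


dd_multiple = price_to_revenue(DATA_DOMAIN_PRICE, DATA_DOMAIN_REVENUE)
sun_multiple = price_to_revenue(SUN_PRICE, SUN_REVENUE)

print(f"Data Domain: {dd_multiple:.2f}x revenue")   # 6.93x
print(f"Sun:         {sun_multiple:.2f}x revenue")  # 0.57x
print(f"Gap:         {dd_multiple / sun_multiple:.1f}x")  # 12.2x
```

By this crude yardstick, the bidders are paying roughly 12 times more per dollar of revenue for Data Domain than Oracle agreed to pay for Sun.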
Deduplication is the storage world's new killer app. It's the great shrinking machine. Think of the old Steve Martin "let's get small" routine. It shrinks big data down to a small fraction of its original size--way more than is possible with the more common data compression routines. Why is that process now worth billions of dollars?
Most IT shops are moving away from using tape as their primary backup medium in favor of disks. Deduping makes this migration economically viable by greatly reducing the backup data footprint on disk arrays by a factor of 20 to 1, on average. You can't do that with tape. Nor can you get the input/output performance of disks from tape.
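To make the shrinking act concrete, here is a toy sketch of the general idea behind deduplication: cut the backup stream into chunks, fingerprint each chunk, and store any given chunk only once. Everything in it is an assumption for illustration (the `DedupStore` class, the fixed 4 KB chunk size, the synthetic data); it is not how Data Domain or any other vendor actually implements the feature, and commercial products typically use more sophisticated, often variable-size, chunking.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


class DedupStore:
    """Toy content-addressed store: each unique chunk is kept exactly once."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = {}         # fingerprint -> chunk bytes
        self.logical_bytes = 0   # total bytes the backup jobs sent

    def write(self, data):
        """Store one backup stream and return the recipe needed to rebuild it."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # Only the first copy of a chunk consumes disk space.
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        self.logical_bytes += len(data)
        return recipe

    def read(self, recipe):
        """Rebuild a backup stream from its list of fingerprints."""
        return b"".join(self.chunks[fp] for fp in recipe)

    def physical_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())

    def ratio(self):
        return self.logical_bytes / self.physical_bytes()


# Synthetic "full backup": 100 distinct 4 KB chunks (32-byte digest repeated 128 times).
nightly_backup = b"".join(
    hashlib.sha256(str(i).encode()).digest() * 128 for i in range(100)
)

store = DedupStore()
# Twenty nightly full backups of data that did not change at all.
recipes = [store.write(nightly_backup) for _ in range(20)]

print(f"logical:  {store.logical_bytes} bytes")
print(f"physical: {store.physical_bytes()} bytes")
print(f"ratio:    {store.ratio():.0f} to 1")  # 20 to 1
```

The 20-to-1 result here is rigged by backing up identical data 20 times. Real-world ratios depend on how much the data changes between backups and how long backups are retained.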
But that's not all that deduping does. It can be run against primary data storage streams to reduce the data footprint within expensive primary storage arrays. NetApp, among other vendors, supports this. Running it here may amount to the functional equivalent of buying another array, given the capacity that's saved as a result. When IT budgets are constrained, and storage is one of your top budget priorities, that's a big deal.
One can also dedupe archival storage, making the disk a repository for archival data that may need fast accessibility on a periodic basis--like when your corporate attorney needs to find exculpatory e-mails from three years ago and needs them yesterday.
So now everyone has to dedupe. Every major storage vendor, from EMC to Hewlett-Packard to IBM, now offers at least one dedupe option of the many that are now available, including the in-line and post-process variants. IBM, for example, offers four options.
In spite of all its high-profile competition, Data Domain has been the acknowledged leader in integrating deduplication into the backup process. It offers disk-based deduplicated storage arrays for heterogeneous backup environments, and it leads all contenders in this space, in terms of market share, by a wide margin.
Does a leading position in a killer app justify a $1.9 billion valuation for a relatively unknown company mining a niche storage opportunity? Stay tuned. The executives at EMC and NetApp hate to lose, and EMC may yet win the heart of the fair maid named Data Domain.





