A serious Amazon Web Services outage has extended well into its second day, but Amazon said Friday the end is in sight for most affected customers of the cloud-computing infrastructure.
"We continue to see progress in recovering volumes, and have heard many additional customers confirm that they're recovering. Our current estimate is that the majority of volumes will be recovered over the next 5 to 6 hours," Amazon said on its AWS status dashboard at 8:49 a.m. today. Volumes are areas of Amazon's Elastic Block Store (EBS) service that store data.
But for some customers, the news isn't so good. In some cases, Amazon has to restore data from backups made yesterday, a time-consuming process. "We anticipate that those will take longer to recover," Amazon said, without making any predictions about just how long.
AWS is a flagship example of one facet of cloud computing: a flexible collection of online computing services that can ramp up and down according to varying needs, with customers paying only for what they consume. At the same time, though, when a widely used service goes down, many suffer. In AWS' case, the problems with some services in the East Coast region laid low many Internet operations, including the Web sites of Quora, Sencha, Reddit, and Foursquare, and services that relied on Heroku.
Amazon's Elastic Compute Cloud (EC2) service, Relational Database Service, and Elastic Beanstalk service have been affected by the outage. The problem was first logged at 1:41 a.m. PT yesterday, the result of a "networking event" that triggered a cascade of other problems.
Restoring the service has clearly been taxing for Amazon. "The team continues to be all-hands on deck trying to add capacity to the affected Availability Zone to re-mirror stuck volumes. It's taking us longer than we anticipated to add capacity to this fleet," Amazon said late last night.