The Google Apps status dashboard indicates which services are working.
(Credit: screenshot by Stephen Shankland/CNET Networks)
The day after a 2.5-hour Gmail outage, Google has launched a promised Google Apps status dashboard to better inform customers whether their online applications are up and running.
When a needed service fails, people can be mollified--and can better plan what to do--if they hear what's going on, what went wrong, and when the service will return. To this end, sites such as Salesforce.com and Amazon Web Services offer dashboards that show how well their services are functioning. Now Google has followed suit.
"The Google Apps Status Dashboard represents an additional layer of transparency that we believe will be particularly useful for our business users, and it's also relevant to users of our consumer products," said Tessa Prescott of the Google Apps sales team in a blog post Wednesday. "Customers can use this Status Dashboard to check on the current service status of individual services such as Gmail, Google Calendar, Google Talk, Google Docs, Google Sites and Google Video for business."
In the case of the recent Gmail outage, Google offers information about when the problem was discovered, the status of its repair, and a detailed postmortem of what went wrong.
If your business used Gmail and the service went out for two and a half hours, do you think you lost $2.05 per user in productivity?
That's the monetary equivalent of what Google offered to compensate Google Apps Premier Edition customers after Gmail was unavailable for about two and a half hours on Tuesday. And it was being generous: All it had to offer was the equivalent of 41 cents per user.
(Credit: CNET Networks / Josh Lowensohn)
For customers who pay the $50 per user per year price for the Google Apps service, Google strives to keep it up and running 99.9 percent of the time each month. According to the Google Apps service level agreement (SLA), Google promises three extra days of service if availability slips down to the 99 to 99.9 percent range.
According to a Gmail outage blog posting by Gmail site reliability manager Acacio Cruz, the outage lasted "approximately two and a half hours." By my math, assuming there were no other outages in February, that means uptime of 99.63 percent for the month.
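The uptime figure can be checked with a short calculation. This is a minimal sketch, assuming a 28-day February and a single 2.5-hour outage with no other downtime; the function and constant names are illustrative only.

```python
# Assumptions: a 28-day February and one 2.5-hour outage, nothing else.
DAYS_IN_MONTH = 28
OUTAGE_HOURS = 2.5


def monthly_uptime_percent(outage_hours: float, days_in_month: int) -> float:
    """Return the share of the month the service was up, as a percentage."""
    total_hours = days_in_month * 24
    return 100.0 * (total_hours - outage_hours) / total_hours


uptime = monthly_uptime_percent(OUTAGE_HOURS, DAYS_IN_MONTH)
print(f"{uptime:.2f} percent")  # 99.63 percent
```

That result lands in the 99 to 99.9 percent range, the tier for which the SLA promises three extra days of service.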
However, Google decided to extend affected customers' service by more than the three days the SLA required. "Given the extent of the outage and as a gesture of goodwill, we are extending their service for 15-days," spokesman Andrew Kovacs said in a statement. Ordinarily, uptime has to slip below 95 percent before Google provides a 15-day extension.
So how does that math work out exactly? Well, at $50 a year, Google charges a rate of 0.57 cents per hour. So a three-day extension is the equivalent of 41 cents of revenue per user, and a 15-day extension is worth $2.05.
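For readers who want to reproduce the arithmetic, here is a rough sketch. It assumes a 365-day year and a flat hourly rate derived from the $50 annual price; the names are hypothetical and not anything Google publishes.

```python
PRICE_PER_USER_PER_YEAR = 50.00  # dollars, as quoted in the article
HOURS_PER_YEAR = 365 * 24        # 8,760 hours; assumes a non-leap year


def extension_value(days: int) -> float:
    """Dollar value per user of a service extension lasting `days` days."""
    hourly_rate = PRICE_PER_USER_PER_YEAR / HOURS_PER_YEAR
    return hourly_rate * days * 24


hourly_cents = 100 * PRICE_PER_USER_PER_YEAR / HOURS_PER_YEAR
print(f"{hourly_cents:.2f} cents per hour")           # 0.57 cents per hour
print(f"3-day extension:  ${extension_value(3):.2f}")   # $0.41
print(f"15-day extension: ${extension_value(15):.2f}")  # $2.05
```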
Before you judge, bear in mind some of the factors at play--how essential e-mail is to a company, how common Gmail outages actually are, the time of day of the outage, and whether e-mail was available through other software such as Outlook even though Gmail's Web interface was down. Another relevant comparison is how reliable your own company's e-mail servers are. If you have worse uptime than Gmail, you're in effect valuing your employees' e-mail productivity lower than Google does.
However, whenever Google is apologizing for outages, the company takes pains to mention it feels the pain acutely given how the company uses Gmail internally. My suspicion is that the company values its employees' time a bit more highly than what it grants Google Apps Premier customers.
Google also offered an explanation of what happened in another blog post.
"This morning, there was a routine maintenance event in one of our European data centers. This typically causes no disruption because accounts are simply served out of another data center," Cruz said. "Unexpected side effects of some new code that tries to keep data geographically close to its owner caused another data center in Europe to become overloaded, and that caused cascading problems from one data center to another. It took us about an hour to get it all back under control."
Correction, 4:05 p.m. PST: The name of the senior product manager for Google Apps was misspelled. It is Rajen Sheth. Also, Pingdom had an incorrect number for total downtime in its "more likely" scenario. It is 55 minutes.
Pingdom argues Google can get away with more outages because smaller ones fall between the service level agreement gaps.
(Credit: Pingdom)
Pingdom, a company that monitors Web site availability, has concluded that Google gives itself a lot of wiggle room in its service level agreement for its Google Apps service.
The service level agreement (SLA) gives credit to paying customers if the service falls short of promised availability--99.9 percent measured monthly for Google Apps. Pingdom points out that because Google only counts downtime periods that last at least 10 minutes, the company could get away with intermittent problems that are shorter.
"What if Google Apps was down for 9 minutes, up for 1 minute, down 9 minutes, etc.? That would mean 54 minutes of downtime each hour, but Google still wouldn't count it because none of the individual downtimes lasted 10 minutes (or) more," according to a blog entry Thursday. In a "more likely" scenario with outages lasting 3, 8, 12, 5, 9, 14, and 4 minutes, the total of 55 minutes of actual downtime would only be counted as 26 minutes for purposes of the SLA.
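Pingdom's two scenarios can be sketched in a few lines. This is only an illustration of the counting rule as described above (downtime periods shorter than 10 minutes are ignored), not Google's actual measurement code; the function name and threshold constant are assumptions.

```python
MIN_COUNTED_MINUTES = 10  # per the article: only outages of 10+ minutes count


def counted_downtime(outages, threshold=MIN_COUNTED_MINUTES):
    """Sum only the outages long enough to count toward the SLA."""
    return sum(minutes for minutes in outages if minutes >= threshold)


# Pingdom's "more likely" scenario, outage lengths in minutes.
more_likely = [3, 8, 12, 5, 9, 14, 4]
print(sum(more_likely), counted_downtime(more_likely))  # 55 26

# Worst case: down 9 minutes, up 1 minute, repeated for an hour.
worst_case_hour = [9] * 6
print(sum(worst_case_hour), counted_downtime(worst_case_hour))  # 54 0
```

In the first case only the 12- and 14-minute outages are counted; in the second, none of the 54 minutes is.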
Google, while concerned about uptime, isn't as concerned about the SLA terms or what it called Pingdom's "hypothetical scenario," though.
"If you look at our SLA and compare to others' in the industry, it's identical," said Rajen Sheth, senior product manager for Google Apps, pointing as an example to Microsoft's hosted Exchange service. Service providers need to set a threshold somewhere "to distinguish between a real outage and intermittent errors," he said, and Google is trying to be transparent about where it sets that threshold.
That may sound like dodging the question about an accumulation of small outages, but the company does have a point that a blip probably shouldn't count as much as a catastrophe. Realistically, shortening the interval would probably squeeze Google on the other end to lower its 99.9 percent uptime commitment or perhaps raise its $50 per user per year price. There's no free lunch here for customers.
And after all, although SLAs are important, customers will rapidly abandon ship if a service breaks, credit or no credit.
Notably, Google monitors uptime not only for each customer account but also for each user of that account. It also gives credits even if only part of the service goes down while other parts are available, Sheth said. And though only some customers were affected by a significant Gmail outage in August, Google offered SLA credits to all Google Apps customers.
Google has promised a better dashboard to inform customers about outages. "During the times when we've seen outages, the No. 1 thing we need to do is communicate with our customers," Sheth said.
Web application specialist Zoho has joined the growing ranks of companies willing to share detailed information on how well their online services are holding up.
This move toward transparency is increasingly important as potential customers consider relying on such services.
The Zoho Status page shows summary and more detailed information about the availability of its Web-based services for e-mail, word processing, spreadsheets, invoices, meetings, and other applications. Clicking a "more" button shows how the service performed in recent days.
Publishing the performance measurements for online services is catching on as cloud computing grows more serious. Going hand in hand with that is offering service level agreements (SLAs) with specific uptime commitments.
Zoho competes with Google Apps, Yahoo Zimbra, and other services. Meanwhile, Microsoft is bringing Office online. Zoho is a subsidiary of AdventNet.
(Via Mashable)




