BlackBerry outage: The day after
Update 2:15 p.m. PST: No sooner do I post this than RIM goes and issues an explanation for the outage. Read on for the details...
In the immortal words of Cinderella's Tom Keifer, you don't know what you got till it's gone.
Monday's widespread BlackBerry outage--the second major one in the past 12 months--left Research In Motion customers stranded and cut off from the rest of the world, sort of like what happened to the '80s glam metal band after Long Cold Winter. The Internet's equivalent of a snow day left reams of e-mail messages undelivered for about three hours Monday, according to RIM, which either still hasn't figured out exactly what caused the problem, or isn't willing to disclose the cause just yet.
Representatives for AT&T and Verizon told several media outlets Monday that from what they understood, all wireless carriers in North America that work with RIM were affected. The last time an outage of this magnitude occurred, in April, RIM blamed a database problem that snowballed when the backup "failover" process didn't work as planned.
Users of BlackBerrys such as this 8820 model couldn't get their precious, precious e-mail for about three hours yesterday. (Credit: Research In Motion)
It's amazing how dependent people have become on their mobile devices. CrackBerry addiction is an old story, but it keeps surfacing every time people are forced to go more than 10 minutes without access to their e-mail. Local television stations in San Francisco all teased the BlackBerry outage on their 11 p.m. newscasts as a near-disaster, since we don't have weather events out here to keep people watching the local news.
While coverage of the outage just goes to show how mobile devices like the BlackBerry really are becoming the next wave of personal computing, it also points out that the entire system has a single point of failure: RIM itself.
All e-mail messages sent to or from a BlackBerry in North America must at some point in their journey travel through RIM's network operations center (NOC) in Canada. The company tried to use that to its advantage in its patent dispute with NTP, noting that since such a critical part of the service lies in Canada, RIM should be exempt from U.S. patent claims. That didn't take.
The Wall Street Journal reported Tuesday that expansion efforts at RIM's NOC may have been to blame for the outage. The problem isn't that the servers are in Canada; they could be anywhere. It's just that everything has to go through the one location. In theory, as long as you have enough redundant backup systems and plans, that shouldn't be a problem. But every now and then, it is.
Frank Gilman, the chief technology officer for Los Angeles law firm Allen Matkins, was forced to deal with the outage Monday afternoon. "What surprised me was the apparent lack of a solid business continuity plan on RIM's part to ensure reasonable connectivity," he said via e-mail, of course. "A company that is marketing devices that increase the mobility of professionals should have systems and contingencies in effect to avoid an outage of that size and duration."
I'm sure that far more BlackBerry-related disasters are averted than ever come to light. But RIM has an advantage over other service providers in that few people sign service-level agreements (SLAs) with RIM for the BlackBerry service. SLAs are basically promises from hosted service providers to maintain a certain level of uptime, which is usually 99.999 percent or so.
Those promises are usually only worth the paper they're printed on, however, as the process of actually accounting for and proving damages as a result of an outage can be extremely difficult. Given the degree to which many large businesses--not to mention U.S. government staffers--rely on the BlackBerry service, perhaps RIM's larger customers will start thinking about negotiating such an agreement when it comes time to renew the service.
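To put the 99.999 percent figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a 365-day year and treats Monday's outage as a single three-hour event (RIM's own estimate); the function names are made up for illustration and have nothing to do with how RIM or any carrier actually measures uptime.

```python
# Back-of-the-envelope uptime arithmetic. All names here are illustrative.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(uptime_percent: float) -> float:
    """Minutes of downtime per year allowed by a given uptime promise."""
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


def achieved_uptime_percent(outage_minutes: float) -> float:
    """Yearly uptime if the only downtime is one outage of this length."""
    return 100 * (1 - outage_minutes / MINUTES_PER_YEAR)


# The "five nines" figure cited above, and a three-hour outage.
budget = downtime_budget_minutes(99.999)
outage = 3 * 60
years_of_budget = outage / budget

print(f"Allowed downtime at 99.999%: {budget:.2f} minutes per year")
print(f"A 3-hour outage uses about {years_of_budget:.0f} years of that budget")
print(f"Uptime for a year with one such outage: {achieved_uptime_percent(outage):.3f}%")
```

Under those assumptions, a five-nines promise leaves a little over five minutes of downtime a year, so a single three-hour outage would burn through decades of allowance.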
As frustrating as the outage may have been, it's not like the U.S. economy ground to a halt Monday afternoon as millions of e-mails about sales presentations, and reminders to the people on the fourth floor to empty the refrigerator on alternate Fridays, went undelivered.
Still, RIM needs to come clean about what caused the problem if it wants to keep people hooked on its service. I've seen the thumb wheel and the damage done.
Update 2:15 p.m. PST: RIM sent out a statement after waiting for me to post this blog, just to make sure we could test our own update procedures.
The company is blaming "a problem with an internal data routing system within the BlackBerry service infrastructure that had been recently upgraded," according to the statement. RIM has been upgrading its capacity as demand for the BlackBerry continues to grow, and usually there isn't much of a problem during one of those upgrades. This time, something apparently went wrong.
"Once again, RIM apologizes to its customers for any inconvenience." The company said it would share further details once a more in-depth investigation is completed.
Tom Krazit writes about the ever-expanding world of Internet search, including Google, Yahoo, online advertising, and portals, as well as the evolution of mobile computing. He has written about traditional PC companies, chip manufacturers, and mobile computers, spending the last three years covering Apple.



Is everyone else REALLY that glued to their damned BlackBerries? May I gently suggest a vacation -- one where you leave the blasted black device in a drawer at home?
Get a phone with windows mobile on it and you will find that you never have these problems again.
As an IT professional and a technology consultant I always recommend to my clients that they stay away from blackberrys. Every one of them that has not listened to my advice has had problems and has regretted the decision.
For some people, the blackberry is all they know. I suggest stepping outside and looking around at all that has changed since you became stuck in your office syncing your crapberry.
Personally I've had more problems with Windows Mobile devices than Blackberrys.
If you ever leave Redmond you'll find the sun is out!
The reliance on e-mail for so much business collaboration isn't a good thing.
I wonder how someone would react to an outage compared to losing or damaging the Blackberry?
I think that's a lyric from Big Yellow Taxi by Joni Mitchell.
LJP
And, as for RIM being the single point of failure...would the carrier's network qualify as well?
A BB, just like a Windows mobile device, is really an extension of the desktop...any way you look at it. Convenience...yes. Critical...shouldn't be.
As for an SLA...I'd think that would be hard to handle. Even with an SLA, you'll get outages. Let the SLA be between the carrier and RIM and then let the carrier pass out CREDITS when this sort of thing happens. [dreaming again!]
Somebody, somewhere, is planning to litigate because of it. It's the way of the world these days.
SLA....please....even with our carrier, enjoy the $1 credit. the truth is you aren't paying so much per month that the credit is even worth the hassle/time of trying to call the carrier and wait on hold to get it. if we want to start pissing and moaning about something how about the complete decline of customer/technical service!
- Something is very charming about these outages
- by fredtheviking February 13, 2008 7:54 AM PST
- Actually, I am surprised BlackBerry continues to get away with these outages. The fact that people put up with it and don't move their service elsewhere says a lot about the service they have. It must be something special indeed. But the news coverage of these events is even more charming.