After experiencing an outage that started on Sunday afternoon and stretched through most of the day yesterday, Tumblr has explained what happened.
"Yesterday afternoon, during planned maintenance that was not intended to interrupt service, an issue arose that took down a critical database cluster," Tumblr founder David Karp said on the company's blog last night. "This brought down our entire network while our engineers worked feverishly to restore these databases and bring your blogs back online."
Karp also acknowledged that the site has been experiencing some "errors" lately, and admitted that those issues are "absolutely unacceptable."
By Karp's account, the downtime Tumblr has been experiencing is a byproduct of its success. The site is now generating over 500 million page views each month, Karp said, and "keeping up with growth has presented more work than our small team has prepared for." So far this month, Tumblr has "quadrupled" its engineering staff, and it plans to get its "infrastructure well ahead of capacity as quickly as possible."
Tumblr's troubles with success sound awfully similar to those experienced by Twitter during its own growth spurt. The company was displaying its "fail whale" far more often than users would have liked, and it was having trouble keeping up with so much content being added to the service. Over time, it has addressed those problems, and although downtime still occurs, it isn't as big an issue on the service as it once was.