June 30, 2008 10:14 AM PDT

How Facebook stays afloat adding 250,000 users per day

by Dan Farber

A few weeks ago I talked with Jonathan Heiliger, vice president of technical operations at Facebook, about the challenge of innovating quickly and building stable infrastructure while 250,000 new members are added to the social network every day. Check out the video on ZDNet.

Jonathan Heiliger

(Credit: CNET News)

Q: You've been at Facebook, I think, for about a year and it's been quite a ride I guess, scaling up from zero in 2004 to over 80 million today. How do you keep up with that hyper growth?
Heiliger: You're absolutely right--we've had a lot of growth. We add over 250,000 users every day, and that means a lot of infrastructure, a lot of servers, and constantly looking at new processes and looking at how we're doing things and ensuring that we're doing things the most efficient way possible, not just for delivering all the content to our users but to stay on top of what it costs to run the site.

How do you stay on top of the cost in terms of the kind of equipment you buy and how you work with the vendors? How do you prioritize those things?
Heiliger: One of the things we recently did was run an RFP process for the servers we buy from vendors and essentially did a bake-off with a number of different people looking at building servers on our own. What we concluded from that process was to continue to buy servers from a couple of major OEMs (original equipment manufacturers), but through that process we were able to lock in prices today and carry those prices forward as commodity component costs drop.

When you're buying those servers, and I assume you're doing just a huge scale out of commodity servers, what do they look like? How are they configured?
Heiliger: We're pretty lucky in that we run a wide variety of applications, literally tens of applications on our own and hundreds of applications for our platform developers that use Facebook as a distribution mechanism, as a way of interacting with their users. But one of the reasons we're very lucky is our engineering team has selected to use PHP as the primary development language. That allows us to use a fairly generic server type. So we, with a couple of exceptions, have three main server types and run a fairly homogeneous environment, which allows us to then consolidate our buying power.

You're different from Google in the kinds of applications that you run. They are mostly running search queries, and you're running all kinds of queries and bringing back all kinds of data from the social graph. How is it different in terms of the way you build out your data center from the inside?
Heiliger: Google has a tremendous amount of information that they index and archive and present to users, but fundamentally if you go to Google and type in a search for a "tiger" and I go to Google and type in a search for a "tiger" we're going to see generally the same results, so they're presenting that same information to both of us. Facebook is a little different in that the context for our data is all social. When you look at your friends and their status updates and their photos and the notes they may have written, you're going to see one set of data versus if I look at my friends and their photos and their notes and status updates, and those tend to be non-intersecting sets of data.

So it's much more dynamic?
Heiliger: Much more dynamic data set--and what that means is it's caused us to do a bunch of different things relative to caching and relative to federating all of that data up amongst thousands of different databases so that as a user requests all of that information we're not using one particular server every time for different data.
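
The caching and federation Heiliger describes can be illustrated with a small sketch. This is not Facebook's implementation: the shard count, the hash-based placement, and the names `shard_for` and `ShardedStore` are all assumptions made for illustration, and in-memory dictionaries stand in for real databases and a real cache tier.

```python
import hashlib

# Hypothetical shard count; the interview only says "thousands of different databases".
NUM_SHARDS = 8


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index with a stable hash (an assumed placement rule)."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Toy federated store: one dict per 'database', plus a read-through cache."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        self.shards = [dict() for _ in range(num_shards)]
        self.cache = {}
        self.db_reads = 0  # counts reads that had to reach a shard

    def put(self, user_id: int, data: dict) -> None:
        # Write to the owning shard and drop any stale cached copy.
        self.shards[shard_for(user_id, len(self.shards))][user_id] = data
        self.cache.pop(user_id, None)

    def get(self, user_id: int):
        # Serve from cache when possible; otherwise read the owning shard.
        if user_id in self.cache:
            return self.cache[user_id]
        self.db_reads += 1
        value = self.shards[shard_for(user_id, len(self.shards))].get(user_id)
        if value is not None:
            self.cache[user_id] = value
        return value


store = ShardedStore()
store.put(42, {"status": "hello"})
store.get(42)  # first read goes to the shard that owns user 42
store.get(42)  # second read is answered from the cache
```

Because each user's data lives on the shard chosen by the hash, requests for different users spread across servers rather than landing on one machine, and repeat reads of the same user are absorbed by the cache.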

You recently introduced a chat application on Facebook, and it seems like it took a lot of time to test it to make sure it could scale having all those simultaneous conversations going on. Could you give us a little background and color on how that came to be?
Heiliger: Chat is actually one of our most recent launches. It started as a hack-a-thon project, which is one of the things we do about every other month. People get together and work all night and pick a project they don't necessarily have time to do during the day. From the time it really germinated as an idea to the time it launched and was available for our entire user base, it became a more formal development project. One of the things we did as part of that was actually build a new back-end service to be able to deal with all of the millions of simultaneous connections that we persist for users.

One other thing: I was reading up on some of the work you've been doing--you say that clouds don't solve single points of failure in your stack. What are those single points of failure?
Heiliger: Interesting question. The notion you are referring to was part of a talk I gave arguing that cloud computing isn't a panacea and, for a start-up or even a more mature start-up like Facebook, isn't the answer to solving failure points in an application. By that I mean the underlying infrastructure that powers an application is typically the result of, or the outcome of, how the application is originally designed and how users interact with that application. If an application is poorly designed or designed to constantly reference a single set of data, the underlying infrastructure is going to be the victim of that. Guys like myself in the infrastructure world have to figure out how to best make that work.

As someone who is in operations how much impact do you have on the application development to make sure that once it gets into the data center that it can work properly and scale and not have the kind of failures we're seeing with some of the new applications?
Heiliger: I think it's a constant challenge in any organization, particularly a fast-moving one like Facebook, where we want to iterate quickly and get product out in our customers' hands so we can get feedback on that product and continue to tweak and enhance it over time. We have one force that's moving in that direction, and we have another force that says we want to keep the site up, we want the site to be reliable, and we want the site to be fast.

So there's a fine balancing act, where everyone in management and everyone in both the engineering and operations departments constantly just sort of works, interacts, goes back and forth, and figures out just how to make those trade-offs. Sometimes we err too aggressively on the side of innovation and iteration, and put things out on the site in perhaps a small quantity that may break the site or cause the site to slow temporarily. Other times we err on the side of conservatism, of not releasing new functionality or new features, and that then delays the sort of user gratification of having that feature or fixing that bug.

What are the challenges that you see--let's say you're at 80 million unique users per month, 250,000 being added per day and 50,000 transactions per second. What happens when you get to 500 million or a billion if you ever get there?
Heiliger: Hopefully, tremendous things. I think we can only look forward to those days.

But what are some of the bottlenecks or barriers you have to overcome to get to that kind of scale?
Heiliger: Some of the bottlenecks we're facing are how we scale this extremely distributed set of data. One of the challenges we have is figuring out how to make that replicated such that it can exist in multiple places around the world and we don't always have to bring users back to the U.S. or back to one of our data centers. I think it's a challenge that most Web sites tend to face as they scale, which is you start in one location with a single database and then you have to figure out how to grow from there, primarily driven by the amount of latency or the amount of time it takes to reach the site and interact with the site. Being able to replicate the data across multiple data centers and across multiple geographies allows users to not just read their data from a local version but write that data as well. That is one of our key challenges over the next 12 months.

As you learn more about building up this very large scale infrastructure do you ever see the possibility that a Facebook could be a service provider?
Heiliger: What do you mean by service provider?

In the sense that right now you're just running the Facebook application but what if a developer or user wanted to do something similar to what Amazon is doing, using your infrastructure to run their applications in the cloud?
Heiliger: Gotcha. So one of the values of Facebook is the Facebook platform. We have over 100,000 developers and several hundred applications that have over a million users using them. We've talked about perhaps opening up or further opening up the platform by offering compute power for those application developers. One of the steps we've already taken to improve that development environment and improve the experience for our developers is just to open-source our platform, which we announced just a couple of weeks ago as well.

Dan Farber is editor in chief of CBS Interactive News, which includes CBSNews.com and CNET News. He has more than 25 years of experience as an editor and journalist covering technology.
by hunter_jc June 30, 2008 12:22 PM PDT
Facebook is pretty crappy. I doubt the company has any QA in it. They never fixed how to list the result of people browsed in the groups or results. Using the website to search the network is a hell of a trouble if you don't use that feature every single day. I don't get it. I seriously wonder how it get all those VC money.
by jeffhesser June 30, 2008 12:42 PM PDT
ya.... i mean how can a company adding 250,000 users a day find people willing to invest? Too bad the true geniuses like yourself aren't in charge of handling all that cash! that 'crappy' web site will probably be gone tomorrow. No point in spending any time using it or understanding it. let's just keep our blinders on and toss a few stones. Hope that works out for you.
by deathstar2000 June 30, 2008 1:54 PM PDT
Facebook is extremely popular with college age individuals and recent grads. I'm in my thirties and everyone I know that signed up and took it for a spin wondered what the big deal was and moved on. The average American has no use for Facebook.
Facebook is simply another Friendster with better marketing hype. Mr. "I"m the CEO, *****" should take his money and retire. Let Microsoft buy it, waste money on it for a few years (ala eBay and skype) and then pull the plug.
by billybob75 June 30, 2008 4:04 PM PDT
The big question is where will they be in 5-10 years when those college kids become working professionals. Will they still be on Facebook or will they move to a business networking site like Linkedin or Schmoozii?
by honorable1 June 30, 2008 4:29 PM PDT
Is it just me or do Mr. Jonathan Heiliger's comments seem rambling , disjointed and repetitive? It could be incorrect quotes but I assume CNET is quoting directly and without error.

I'm sorry but Mr. Heiliger looks like, and sounds like, he's on Cocaine??? Anyone care to comment on this analysis?
by alan.white1 July 2, 2008 8:15 AM PDT
So THAT'S how he does it! Thanks for the laugh. :)
About Outside the Lines

Dan Farber is the editor in chief of CNET News. He has covered technology for more than two decades, and he previously served as editor in chief of ZDNet, PC Week and MacWeek. Outside the Lines explores the intersection of business and technology.
