Update 2 p.m. PDT: I added more detail and examples of searches that stumped Google.
SAN FRANCISCO--Udi Manber sums up Google's core challenge with this description of people's expectations: "Here's what I say, now give me what I need."
In other words, the company must use computers to comprehend humans, said Manber, the vice president of engineering in charge of Google search, in a speech at the Gilbane Conference here Wednesday.
"Ideally, we would understand your question, we would understand all knowledge, and match the two," Manber said.
That's not possible today, though, so Google takes a shortcut: it tries to analyze and summarize all content, expand a user's query into a comparable summary form, and then match the two.
That sounds like a pretty long shortcut, but clearly Google has set its standards and goals very high. "We strive to answer every question, in every language, in a personalized fashion, in less than 100 milliseconds, for free," Manber said.
In Manber's view, humans are a puzzle only beginning to be unlocked. "The 20th century was about conquering nature. The 21st will be about understanding people," he said, and computing is following suit. "The largest computing clusters in operation today are doing search, e-mail, social networking."
Google starts opening up
Google is notoriously secretive about exactly how it decides which results to show in response to a particular query--a subject of high interest to companies counting on high placement or people hoping embarrassing Web pages will fade away--but the company has begun opening up. Manber promised in a blog posting in May to shed more light on search quality in coming months.
Manber shared several details about Google's search quality process in his speech. For one thing, he said, there are more than 100 "signals" the company uses to determine the order of search results. Signals can be anything from language to location to a person's previous search behavior--the last only if the user has enabled Google's search history feature that personalizes results.
He also said the company has a team of "dozens" who do nothing but analyze the quality of search results, where quality is measured by hundreds of charts. These employees support the engineers who try to improve the search results, and Google wants those engineers to experiment with new search quality methods, Manber said.
"The basic idea is to remove friction from engineers...An engineer with an idea does not ask for permission," he said. Instead, the engineer tries the experiment, and Google meets once or twice a week to judge by the data whether the changes should be incorporated into Google's main search results.
These experiments take place on a dedicated cluster of servers, Manber said.
"My group at Google has at its disposal many thousands of machines, with storage measured in petabytes," Manber said. "This is just for our own use, not for satisfying your queries."
Google also tests search algorithm changes on users, different groups of whom receive different search results through a comparison process called split A/B testing.
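For readers unfamiliar with the term, here is a minimal sketch of how split A/B testing can work in general. It is not Google's implementation, which Manber did not describe. The function names, the experiment label, and the choice of click-through rate as the comparison metric are all assumptions made for illustration.

```python
import hashlib


def assign_bucket(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically place a user in 'control' or 'treatment'.

    Hypothetical helper: hashing the experiment name together with the user ID
    means the same user always sees the same variant of a given experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 8 hex digits (32 bits) to a number in [0, 1).
    fraction = int(digest[:8], 16) / 0x100000000
    return "treatment" if fraction < treatment_share else "control"


def click_through_rate(clicks: int, impressions: int) -> float:
    """Illustrative metric: share of result pages that led to a click."""
    return clicks / impressions if impressions else 0.0


if __name__ == "__main__":
    # "ranking-tweak-1" is a made-up experiment label.
    for user in ("alice", "bob", "carol"):
        print(user, assign_bucket(user, "ranking-tweak-1"))
```

Each group's metric would then be compared to judge whether the change helped, which fits Manber's description of decisions being made "by the data."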
The end result: Google adopts search changes quickly and frequently. Google made 450 search algorithm changes in 2007, for example.
"We opened the way for any engineer to go improve things. Mostly because it's based on data," Manber said. "There is no separation of research and development. Everyone does both."
Tough nuts to crack
Manber appears to take a perverse pleasure in difficult searches, relishing the fact that expectations for search match the rising capability and size of Google's infrastructure.
He cited as examples a series of searches whose intent generally seemed clear enough to a human: southeast utah news-airplane crash 10/25/06, hairstyles for ears that stick out, inflammation and pain under my rib, what is answer to this math problem 6x/10x, how many calories in a pound, if real number show else error blank excel.
Of that collection, Google provided good answers only to the inflamed rib query, he said.
Straightforward queries also can be tricky. Google uses context to gauge what exactly "GM" stands for: General Motors in the query "GM cars" but genetically modified in the query "GM foods."
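A toy sketch can make the "GM" example concrete. This is not how Google does it; it only illustrates the idea of letting neighboring words pick the meaning. The hint words and the expand_gm function are invented for this example.

```python
from typing import Optional

# Hypothetical context words for each sense of "GM" (assumed, not from Google).
CONTEXT_HINTS = {
    "General Motors": {"car", "cars", "trucks", "dealers"},
    "genetically modified": {"food", "foods", "crops", "seeds"},
}


def expand_gm(query: str) -> Optional[str]:
    """Guess what 'GM' means from the other words in the query.

    Returns None when the query has no 'GM' or no hint word matches.
    """
    words = set(query.lower().split())
    if "gm" not in words:
        return None
    # Count how many hint words for each sense appear in the query.
    scores = {sense: len(words & hints) for sense, hints in CONTEXT_HINTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(expand_gm("GM cars"))
    print(expand_gm("GM foods"))
```

A real system would presumably learn such associations from data rather than from a hand-written list, but the principle of using surrounding words as evidence is the same.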
Google offers various advanced search options, but its general policy is to use its single search box for everything.
"We have to understand as much as we can user intent and give them the answer they need," Manber said.