Before too long, expect to find anything that anyone puts on the Internet on Google within seconds: with luck, it might even be useful.
Real-time search has come to Google. The company has been hinting at this day for several months, most recently when it announced a deal to access Twitter's "firehose" of data. But it presented its vision for real-time search before the media Monday at the Computer History Museum, claiming to have made a little history of its own.
Over the next few days, Google users will start to notice a box called "Latest results" on the main search results page for a topic that's guaranteed to produce results. Google used "Obama" as its example, and a search for that query now displays a new box that automatically scrolls through recent "real-time" results associated with that topic from sources like Twitter, FriendFeed, and Google News, as well as new Web pages--such as this story--as they are created.
The concept is hot in the search world: Microsoft's Bing also displays updates from Twitter and various blogs, although those results are not integrated with the main page. And Yahoo has also signed up with a company called OneRiot to throw its hat into the real-time search wars.
What's less clear, however, is how useful this technology will be unless Google and others working on the problem can bring the same degree of relevance and trust to real-time results that they bring to regular search results. Google News can already confuse the casual user who wonders how and why those particular headlines were singled out, so how will relevancy work when a stream of news can knock a particularly authoritative result off your screen in seconds?
"It's a very hard problem. Language understanding is still an unsolved problem," said Amit Singhal, a Google Fellow and one of the key players in developing this product. "Not only do we have to understand what someone is saying, but we have to get to the deeper semantics of what is indeed true. We have to work through many issues. Truth ends up being a rather vague notion."
In a way, this challenge is right up Google's alley. The company is obsessed with speed when it comes to presenting results, agonizing over whether design changes that add tenths of seconds to page-loading times are worth the effort.
And now that seemingly everyone has a blog, a microblog, a social-networking profile, and a commenting identity (or 29), new content on the Internet is being generated at an astounding pace. Google used to think it would be able to index all the world's information in about 300 years, but CEO Eric Schmidt told CNET in November that one of Google's greatest challenges in the decades ahead will be staying abreast of the explosion in content enabled by social media.
That's why it's a bit surprising that Google, the world's leading search engine by a wide margin, hasn't necessarily been a leader in this area. Marissa Mayer, vice president of search and user experience at Google, admitted Monday the company could have moved more quickly to organize the vast amount of data produced by services such as Twitter. Anyone who has tried to use Twitter Search knows that real-time search at the moment is like the regular Internet was 10 years ago: a blast of information that's impressive in its scope but underwhelming in its usefulness.
But what Google is trying to do is leapfrog the notion of Twitter as the vanguard of the real-time content explosion. Twitter is undeniably hot at the moment, but new Web pages are generated constantly, especially as traditional media companies move online. One need only think back to this summer, when news reports of Michael Jackson's death sent millions online looking for confirmation, staggering services such as Google and Twitter under the load.
Google said it plans to display all kinds of Internet content in its "Latest results" box. Google didn't pay Twitter an undisclosed amount of money for access to its feed for no reason, however; the speed at which real-time content is generated can be harnessed much more easily if search providers such as Google have that information pushed to them, rather than having to pull it out of the Web itself.
That raises the question of just how Google will index and rank real-time results. The company needs to develop the real-time equivalent of PageRank, which evaluates Web pages by the number and importance of the other pages that link to them. That's something Google "is beginning to experiment with," Mayer said in a question-and-answer session following Google's presentation.
There's definitely some way to do that, but it certainly is not a simple problem. Someone with 15,000 Twitter followers is not necessarily as authoritative in one area as they are in another, and Google will have to figure out some way to evaluate this information to make it truly useful.
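For readers curious about the link-counting idea behind PageRank mentioned above, here is a minimal sketch of the textbook version of the calculation. It is not Google's production system, and it says nothing about how a real-time equivalent would work. The four-page link graph, the function name `pagerank`, and the iteration count are all hypothetical choices made for illustration; the damping value of 0.85 is the commonly cited default, assumed here rather than taken from Google's presentation.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Textbook PageRank by repeated redistribution of scores.

    `links` maps each page to the list of pages it links to.
    Returns a dict mapping each page to a score; scores sum to 1.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of the total score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: pages "a", "b", and "d" all link to "c",
# and "c" links back to "a" only.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)

for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

In this toy graph, "c" scores highest because three pages link to it, and "a" outranks "b" and "d" because its single inbound link comes from the highest-scoring page. That second effect is the part a raw follower count does not capture.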
Until then, however, news junkies can entertain themselves watching the Latest results section spin with updates on Tiger Woods' latest paramour or the glacial progress of Congress' attempt to pass health-care reform legislation.
In a roughly 10-second period Monday afternoon on Google's Trends page, where it is testing out the real-time service, the feed for "Pearl Harbor Day"--the second most popular trend on the Internet Monday behind the aforementioned Tiger Woods--produced a tweet about a Pearl Harbor Day poem, a news story on people who were in Pearl Harbor on December 7, 1941, and a gentleman celebrating Ruby Diner's 27th anniversary with a $2.70 Rubyburger. (He also happened to note in his tweet that it was Pearl Harbor Day.)