August 15, 2005 4:00 AM PDT
Academia's quest for the ultimate search tool
population of Los Angeles?" But for complex queries like "What is the cheapest flight from San Francisco to London?" or "Which university has the largest computer science department?" finding answers is still like doing long division.
"This is dynamic information," Carbonell said. "You must parse the question, look for answers in multiple places and do a comparison. There are multiple steps, and we're looking at how to do it in one step and provide a trace for the user."
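The steps Carbonell lists (parse the question, look in multiple places, compare, and keep a trace for the user) can be sketched in a few lines of Python. This is only an illustration under loud assumptions, not the researchers' system: the three fare sources, their names and their prices are invented, and the "parser" is a single regular expression that handles one question form.

```python
import re

# Hypothetical stand-ins for live fare sources. The names and prices are
# invented for illustration; a real system would query remote services.
FARES = {
    "source_a": {("San Francisco", "London"): 640.0},
    "source_b": {("San Francisco", "London"): 598.0},
    "source_c": {},  # this source has no fare for the route
}


def cheapest_flight(question, sources=FARES):
    """Answer one narrow question form and return (answer, trace)."""
    trace = []

    # Step 1: parse the question into a structured route.
    match = re.search(r"cheapest flight from (.+?) to (.+?)\?", question)
    if match is None:
        raise ValueError("unsupported question form")
    route = (match.group(1), match.group(2))
    trace.append(f"parsed route: {route[0]} -> {route[1]}")

    # Step 2: look for answers in multiple places.
    candidates = []
    for name, table in sources.items():
        price = table.get(route)
        if price is None:
            trace.append(f"{name}: no fare found")
        else:
            trace.append(f"{name}: {price:.2f}")
            candidates.append((price, name))

    # Step 3: compare the candidates and record the decision.
    if not candidates:
        trace.append("no source returned a fare")
        return None, trace
    price, name = min(candidates)
    trace.append(f"cheapest: {name} at {price:.2f}")
    return (name, price), trace


if __name__ == "__main__":
    answer, trace = cheapest_flight(
        "What is the cheapest flight from San Francisco to London?"
    )
    print(answer)
    for step in trace:
        print(" ", step)
```

The trace is just an ordered list of what each step did, which is the piece that would let a user check how the single answer was reached.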
He said it will likely take another four or five years to build functionality that can scale computationally for wide consumer usage and deliver the kind of efficiencies the government and Internet users expect. The universities of Texas and Pennsylvania are also exploring different approaches to the same problem.
Stanford continues in its role as a breeding ground for search projects. Since 2003, Google has purchased at least two projects hatched at Stanford--personalization search tool Kaltix and a project from Anna Patterson, a Stanford computer science research associate. Stanford associate professor Andrew Ng, among others, is working on artificial-intelligence techniques for extracting knowledge from text in a search index.
Other projects have turned into young businesses. SearchFox is a Web upstart co-founded in December by James Gibbons, a longtime Stanford professor and former dean of its School of Engineering. The privately held company has created a collaborative search engine that lets people share favorite links and create personalized search indices.
Stanford, the Massachusetts Institute of Technology and many other universities are working to solve problems presented by the library of tomorrow, which will be largely digitized. Sifting through and organizing billions of digital documents will require new search technology.
Under that umbrella, an MIT graduate student has developed a tool called Piggybank, software that plugs into the Mozilla Foundation's Firefox Web browser. Piggybank lets people surf the Web, tag visited sites with keywords and build a local, annotated collection that can then be published to a site called the bank. In effect, it turns Firefox into a "Semantic Web browser" that lets users expand the scope of understanding around existing information on the Web.
"A generalized data archive lets you make data work together in ways you couldn't before," said MacKenzie Smith, associate director for technology in the MIT libraries.
In a demonstration of what the tool could do, Piggybank integrated data from Boston.com, a movie site and Google maps to show where coffee shops are located relative to restaurants and movie theaters. The tool also lets users save such information to a "database" record (rather than a bookmark) so that it can later be searched by its attributes or designated keywords.
MIT hopes to deploy the technology and other advances from Simile for use by faculty and students.
At Berkeley's center, Wilensky has ambitious plans to solve problems within a broader definition of search. That means analyzing and organizing diverse forms of information--anything from images and video to e-commerce--and helping people synthesize it and extract knowledge.
One major area of development will be in trust and privacy. For example, how believable is the content dug up on Google, and how do you know an eBay seller is truly trustworthy?
Wilensky said his group has proved that, on average, eBay seller ratings are skewed by so-called retaliatory ratings, in which people slam those who slam them. Other sellers with black marks disappear, only to re-emerge later with a clean slate. As a result, Wilensky said, his team has built an algorithm called "EM trust" (for expectation maximization) that uses a statistical model to rate how honest an online seller is likely to be. That development might be applied to Web sites as well.
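The article gives no details of "EM trust," so the following is not Wilensky's algorithm. It is a minimal sketch of expectation maximization in general, applied to an assumed toy model: every seller belongs to one of two hidden groups, "honest" or "dishonest," and each group has its own unknown rate of positive ratings. The function name, the two-group model, the starting guesses and the sample data are all assumptions made for illustration.

```python
import math

EPS = 1e-6  # keeps probabilities away from 0 and 1 so logs stay finite


def clamp(p):
    return min(max(p, EPS), 1.0 - EPS)


def em_two_group_trust(ratings, iters=50):
    """Fit a two-group binomial mixture with EM.

    ratings: list of (positive_count, total_count) per seller.
    Returns (honest_probs, (pi, p_hi, p_lo)), where honest_probs[i] is the
    posterior probability that seller i is in the high-rate group.
    """
    # Assumed starting guesses: share of honest sellers, and the
    # positive-rating rate of each group.
    pi, p_hi, p_lo = 0.5, 0.9, 0.5

    for _ in range(iters):
        # E-step: posterior probability of the high-rate group per seller.
        # The binomial coefficient is the same for both groups and cancels.
        resp = []
        for k, n in ratings:
            log_a = math.log(pi) + k * math.log(p_hi) + (n - k) * math.log(1 - p_hi)
            log_b = math.log(1 - pi) + k * math.log(p_lo) + (n - k) * math.log(1 - p_lo)
            m = max(log_a, log_b)  # subtract the max for numerical stability
            a, b = math.exp(log_a - m), math.exp(log_b - m)
            resp.append(a / (a + b))

        # M-step: re-estimate parameters from the weighted counts.
        pi = clamp(sum(resp) / len(ratings))
        hi_pos = sum(r * k for r, (k, n) in zip(resp, ratings))
        hi_all = sum(r * n for r, (k, n) in zip(resp, ratings))
        lo_pos = sum((1 - r) * k for r, (k, n) in zip(resp, ratings))
        lo_all = sum((1 - r) * n for r, (k, n) in zip(resp, ratings))
        p_hi = clamp(hi_pos / max(hi_all, EPS))
        p_lo = clamp(lo_pos / max(lo_all, EPS))

    return resp, (pi, p_hi, p_lo)


if __name__ == "__main__":
    # Invented sample data: (positive ratings, total ratings) per seller.
    sellers = [(98, 100), (95, 100), (99, 100), (60, 100), (55, 100), (97, 100)]
    probs, params = em_two_group_trust(sellers)
    for seller, p in zip(sellers, probs):
        print(seller, round(p, 3))
    print(params)
```

A model this simple ignores the retaliatory ratings and clean-slate accounts described above; handling those would need a richer model than the two groups assumed here.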
The center will be modeled after Berkeley's Wireless Research Center in downtown Berkeley, which enjoys the backing of big mobile companies. It will include such faculty as Jitendra Malik, professor and chair of U.C. Berkeley's Department of Electrical Engineering, and David Forsyth, professor of computer sciences, who are both working on computer-vision research.