August 18, 2006 4:00 AM PDT

Spying an intelligent search engine

While most would agree that Google has set the current standard for Web search, some technologists say even better tools are on the horizon thanks to advances in artificial intelligence.

Search is like oxygen for many people now, and considering Google's breakthroughs in Web document analysis, supercomputing and Internet advertising, it can be easy to think this is as good as it gets. But some entrepreneurs in artificial intelligence (AI) say that Google is not the end of history. Rather, its techniques are a baseline of where we're headed next.

Proponents of AI techniques say that one day people will be able to search for the plot of a novel, or list all the politicians who said something negative about the environment in the last five years, or find out where to buy an umbrella just spotted on the street. Techniques in AI such as natural language, object recognition and statistical machine learning will begin to stoke the imagination of Web searchers once again.

"This is the beginning for the Web being at work for you in a smart way, and taking on the tedious tasks for you," said Alain Rappaport, CEO and founder of Medstory, a search engine for medical information that went into public beta in July.

"The Web and the amount of information is growing at such a pace that it's an imperative to build an intelligent system that leverages knowledge and exploits it efficiently for people," he added.

Medstory is not alone. Other young companies such as stealth start-up Powerset and image specialist Riya are also looking to turn arcane computing techniques into business success stories.

Unlearning "keywordese"
In the eyes of a search engine, the Web is essentially a body of words on billions of pages, along with the hyperlinks that connect the words. One of Google's big breakthroughs was to link those words efficiently, measuring relevance by the appearance of words on a page, and the number of hyperlinks pointing to that page, or its popularity.

As a rule, search engines don't understand the words--they're merely programmed to match keywords that are more significant on a page, closer together or linked more often from other pages. So in essence, when someone types in "woolly mammoth," he or she sends the search engine on a wild goose chase for those words, not the animal.
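To make the keyword-and-link idea concrete, here is a minimal sketch of that style of ranking. It is not Google's actual algorithm: the three-page "Web," the page names, and the additive score (keyword occurrences plus inbound links) are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical three-page "Web": each page has text and outgoing links.
PAGES = {
    "mammoth-facts": {
        "text": "the woolly mammoth was a woolly ice age animal",
        "links": [],
    },
    "knitting": {
        "text": "a woolly sweater is a mammoth knitting project",
        "links": ["mammoth-facts"],
    },
    "museum": {
        "text": "museum exhibits on ice age animals",
        "links": ["mammoth-facts"],
    },
}

def inbound_links(pages):
    """Count how many pages link to each page (a crude popularity measure)."""
    counts = Counter()
    for page in pages.values():
        for target in page["links"]:
            counts[target] += 1
    return counts

def score(query, pages):
    """Rank pages by keyword occurrences plus inbound links (assumed formula)."""
    terms = query.lower().split()
    links = inbound_links(pages)
    results = {}
    for name, page in pages.items():
        words = Counter(page["text"].split())
        matches = sum(words[t] for t in terms)
        if matches:  # pages with no keyword match are never returned
            results[name] = matches + links[name]
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)

ranking = score("woolly mammoth", PAGES)
print(ranking)
```

Note that the knitting page is returned for "woolly mammoth" even though it has nothing to do with the animal: the scorer matches words, not meaning.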

As a result, search engines miss the nuance of the human language. For example, Google might render a simple search for "books by children" by scouting for the pages that include the words "books" and "children," but it would eliminate the so-called stop word--in this case, "by"--because stop words are those that occur on almost every page. Yet those stop words occur so often because they are important to the meaning of a phrase. "Books by children" is different from "books about children," and different still from "children's books."
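The stop-word problem can be shown in a few lines. This is only a sketch: the stop-word list and the `keywords` helper are hypothetical, and real engines handle stop words in more varied ways.

```python
# Hypothetical stop-word list; real engines use their own lists and rules.
STOP_WORDS = {"by", "about", "the", "a", "of"}

def keywords(query):
    """Lowercase the query, split on whitespace, and drop stop words."""
    return [w for w in query.lower().split() if w not in STOP_WORDS]

by_children = keywords("books by children")
about_children = keywords("books about children")

# Both queries reduce to the same keyword list, so the distinction
# between the two phrases is lost before any matching happens.
print(by_children, about_children, by_children == about_children)
```

Once "by" and "about" are dropped, the two queries become indistinguishable to a purely keyword-based matcher.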


Barney Pell, founder of a yet-to-be-launched AI search engine, calls the restrictive language of search engines "keywordese."

"Search engines try to train us to become good keyword searchers. We dumb down our intelligence so it will be natural for the computer," said Pell, whose company, Powerset, is based in Palo Alto, Calif.

"The big shift that will happen in society is that instead of moving human expressions and interactions into what's easy for the computer, we'll move computers' abilities to handle expressions that are natural for the human," he said.

Powerset, which hasn't divulged its launch date yet, is using AI to train computers not just to read words on the page, but to make connections between those words and make inferences in the language. That way a search engine could think through and redefine relevance beyond the most popular page or the site with the most occurrences of keywords entered in a search box.


8 comments

AI
Artificial Intelligence is a meaningless buzzword. A rule-based system would be accurate but nowhere near the emotional context of artificial intelligence. Inference Engine would be just as buzzy but much more accurate. I am sure others could come up with other terms. Maybe you should hold a contest to see who comes up with the best descriptive term. I submit Inference Engine. I hear AI and I instantly think Vapor.
Posted by Drewky (2 comments )
Face recognition has great potential
Forget finding a girl that looks like my last girlfriend and will remind me of that pain. Law enforcement searches will benefit from facial recognition, from matching a missing kid to a kiddie porn image to finding a felon that changed his/her identity. For law enforcement it will be a new and invaluable tool, trolling the mug shots and internet in search of possible matches to all kinds of case files.
Posted by WesFlash (19 comments )
AI is just code word for vapor/hoax ware.
I agree with the other poster, AI is just code word for vapor/hoax ware.

If I had gotten a dollar for each time I heard of AI and then nothing of the sort was ever delivered, I would be a rich man.
There is no such thing as AI (Artificial Intelligence) software and there won't be any, because we have no idea how to emulate complex intelligent thinking in software; there is only good software engineering.

So it is not AI that is going to make a better search engine than Google & Yahoo; it is innovative new search engine ideas, implemented with good software engineering work, that are going to do it. And I will tell you about a search engine that is better than Google or Yahoo,
it is called Anoox, and it is better based on these points:
1- It is powered by the Knowledge of the people
2- It is operated in an Open fashion, so NO one company owns/controls it but hundreds of different companies from around the world will
It is here in case:
www.anoox.com
Ah, another reason it is better, it is also not-for-profit.
Posted by Sea of Cortez (67 comments )
Google has pushed the limit of HTML search
Google has pushed the limit of HTML search. As long as pages and sites continue to be created in a way and format that does not allow for descriptive classification, any real AI will be impossible.
Posted by darmik (3 comments )
True Natural Relevance not AI
I think AI as a search descriptor is a little oxymoronic as well. The comments here demonstrate how each person has their own contextual relationship to the keyword "AI".

I do however agree that search as a science is in its infancy and that current cpu/storage efficiencies now make it possible to deliver an events-based (usage) search architecture connected to individual users, instead of the current link-topology-connected-to-no-one system we've all learned to love and hate.

We're also not ready to throw the keywordese out the door. We believe there is a lot of natural intelligence in keyword associations (i.e., John Battelle's database of intentions) that can provide an order-of-magnitude better relevance if they can be properly distilled, whether it's applied to search, feeds, or media. We think the key is making it implicit (no explicit tagging or rating) and to make sure you have 100% participation: every user is both an information consumer and an information provider for every other user in the system.
Posted by Rob at Collarity (1 comment )
It's time to roll on the next generation Search Engine
The popularity of the internet is that it offers a variety of content not available in any other medium. A search engine assists users to locate the information.

The Internet is supposed to be for fun not serious business.

And that is why most people use research tools and offline content to achieve their results. Internet search is the last place a true researcher would go to find top rated content. Only after all other options are closed.

What if the internet offered this as the very first option and people could be sure that all the information that the internet and internet search throws up is Top rated quality stuff and it is available free/subscribed/paid. I am myself a researcher and would love to have such a tool in my hands instead of having to spend a huge amount on buying proprietary research content.

Once I finished using the content, I need not pay for it. There are many websites such as Janes.com which provide this sort of quality info.

Present Internet Search is neither intelligent nor smart. Cluster search engines such as Vivisimo and clusty.com help to some extent but still trash out the same stuff.

In this respect, I admire the efforts of NetAlter which is bringing a radical new search engine that would offer quality content and meta information. According to NetAlter, their search engine would offer a variety of pre-search and post search tools that enable sorting, comparing and analyzing of search results and also offer a single click ecommerce connection.

Check out the NetAlter search whitepaper and presentation.

http://www.netalter.com/Solutions/search.pdf

http://www.netalter.com/Solutions/Search.pps
Posted by guyfrom2006 (33 comments )
new search required
The 3D search idea by Microsoft is the one I find quite useful, and it involves user interaction as well... currently many search engines do give anonymous links to increase the number of pages, but only some are useful.
thanx
Himanshu joshi
http://montoojoshi.googlepages.com/onlinecompiler
Posted by Himanshu_Joshi (11 comments )
AI is already being added to search engines
People need to start following the AI bouncing ball at Google and other search innovators. The integration of the CYC taxonomy, for example, appears to be well under way with available frameworks that allow natural language and human logic to prevail over "keywordism" in the very near future.

http://www.cyc.com/cyc/cycrandd/areasofrandd_dir/distributedai
Posted by readyforthefuture (1 comment )
 
