January 20, 2003 4:00 AM PST

IBM aims to get smart about AI

In the coming months, IBM will unveil technology that it believes will vastly improve the way computers access and use data by unifying the different schools of thought surrounding artificial intelligence.

The Unstructured Information Management Architecture (UIMA) is an XML-based data retrieval architecture under development at IBM. UIMA will greatly expand and enhance the retrieval techniques underlying databases, said Alfred Spector, vice president of services and software at IBM's Research division.

UIMA "is something that becomes part of a database, or, more likely, something that databases access," he said. "You can sense things almost all the time. You can effect change in automated or human systems much more."

Once incorporated into systems, UIMA could allow cars to obtain and display real-time data on traffic conditions and on average auto speeds on freeways, or it could let factories regulate their own fuel consumption and optimally schedule activities. Automated language translation and natural language processing also would become feasible.

The theory underlying UIMA is the Combination Hypothesis, which states that statistical machine learning--the sort of data-ranking intelligence behind search site Google--syntactical artificial intelligence, and other techniques can be married in the relatively near future.

"If we apply in parallel the techniques that different artificial intelligence schools have been proponents of, we will achieve a multiplicative reduction in error rates," Spector said. "We're beginning to apply the Combination Hypothesis, and that is going to happen a lot this year. I think you will begin to see this rolling out in technologies that people use over the next few years. It isn't that far away.

"There is more progress in this happening than has happened, despite the fact that the Nasdaq is off its peak," he added.
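Spector's "multiplicative reduction in error rates" can be sketched with a toy calculation. Assuming, hypothetically, that each technique makes errors independently of the others, a combination that fails only when every technique fails has an error rate equal to the product of the individual rates:

```python
# Toy illustration of the Combination Hypothesis's "multiplicative
# reduction in error rates." The rates below are invented, and the
# calculation assumes the techniques err independently.
from functools import reduce

def combined_error(error_rates):
    """Probability that every independent technique errs at once."""
    return reduce(lambda acc, p: acc * p, error_rates, 1.0)

# Three hypothetical AI techniques, each wrong 10% of the time:
rates = [0.10, 0.10, 0.10]
print(combined_error(rates))  # 0.1 * 0.1 * 0.1 = 0.001, i.e. a 0.1% error rate
```

The independence assumption is the optimistic part; techniques that tend to fail on the same inputs would see a much smaller gain.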

The results of current, major UIMA experiments will be disclosed to analysts around March, with public disclosures to follow, sources at IBM said.

Although it's been alternately touted and debunked, the era of functional artificial intelligence may be dawning. For one thing, the processing power and data-storage capabilities required for thinking machines are now coming into existence.

Researchers have also refined the algorithms and concepts behind artificially intelligent software.

Additionally, the explosive growth of the Internet has created a need for machines that can function relatively autonomously. In the future, both businesses and individuals simply will own far more computers than they can manage--spitting out more data than people will be able to mentally absorb on their own. The types of data on the Net--audio, text, visual--will also continue to grow.

XML, meanwhile, provides an easy way to share and classify data, which makes it easier to incorporate intelligence technology into the computing environment. "The database industry will undergo more change in the next three years than it has in the last 20 due to the emergence of XML," Spector said.
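As a rough sketch of the idea, unstructured text can be wrapped in XML annotations that a database could then query. The element and attribute names here are invented for illustration; they are not UIMA's actual schema:

```python
# Sketch: annotating a snippet of unstructured text with XML so that
# structured queries become possible. Element and attribute names are
# invented, not part of any real UIMA schema.
import xml.etree.ElementTree as ET

doc = ET.Element("document")
sentence = ET.SubElement(doc, "sentence")
sentence.text = "IBM will unveil UIMA in the coming months."
entity = ET.SubElement(sentence, "entity")
entity.set("type", "organization")
entity.text = "IBM"

# A database could now retrieve every organization mentioned in the text:
orgs = [e.text for e in doc.iter("entity") if e.get("type") == "organization"]
print(orgs)  # ['IBM']
```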

A new order
Artificial intelligence in a sense will function like a filter. Sensors will gather data from the outside world and send it to a computer, which in turn will issue the appropriate actions, alerting its human owners only when necessary.
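The filtering idea above can be sketched as a simple loop: handle routine sensor readings automatically and escalate only the exceptions. The threshold and readings below are made up for illustration:

```python
# Sketch of AI-as-filter: readings within the automated range are
# handled by the computer; only out-of-range readings reach a human.
# The threshold value is invented for illustration.

def filter_readings(readings, threshold=100):
    """Split a stream of readings into (handled, human_alerts)."""
    handled = [v for v in readings if v <= threshold]  # acted on automatically
    alerts = [v for v in readings if v > threshold]    # escalated to the owner
    return handled, alerts

handled, alerts = filter_readings([42, 88, 95, 130])
print(alerts)  # only 130 exceeds the threshold and reaches the human
```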

When it comes to Web searching, humans will make a query, and computers will help them refine it so that only the relevant data, rather than 14 pages of potential Web sites, match.

IBM's approach to artificial intelligence has been decidedly agnostic. There are roughly two basic schools of thought in artificial intelligence. Statistical learning advocates believe that the best guide for thinking machines is memory.

Based in part on the mathematical theories of the 18th century clergyman Thomas Bayes, statistical theory essentially states that current or future events can be predicted from what occurred in the past. Google search results, for example, are ranked lists of sites that other individuals examined after posing similar queries. Voice-recognition applications work on the same principle.
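Bayes' rule, the foundation of this statistical school, can be illustrated with a toy voice-recognition calculation. The words and probabilities below are invented; the point is only that past frequency breaks the tie between equally plausible interpretations:

```python
# Toy Bayes' rule calculation in the spirit of voice recognition:
# which word was spoken, given an ambiguous sound? All probabilities
# are invented for illustration.

def posterior(prior, likelihood):
    """P(word | sound) for each candidate word, via Bayes' rule."""
    evidence = sum(prior[w] * likelihood[w] for w in prior)
    return {w: prior[w] * likelihood[w] / evidence for w in prior}

prior = {"two": 0.6, "too": 0.4}       # how often each word occurred in the past
likelihood = {"two": 0.5, "too": 0.5}  # both words fit the sound equally well

print(posterior(prior, likelihood))    # the historically more common word wins
```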

By contrast, rules-based intelligence advocates, broken down into syntactical and grammatical schools of thought, believe that machines work better when more aware of context.

A search for "Italian Pet Rock" on a statistically intelligent search engine, for example, might return sites about the 1970s novelty. A rules-based application, by contrast, might realize you mistyped the name of the Italian poet Petrarch. A Google search on UIMA turned up the Ukrainian Institute of Modern Art as the first selection.

"The combination of grammatical, statistical, advanced statistical (and) semantics will probably be needed to do this, but you can't do it without a common architecture," Spector said. Thinking in humans, after all, isn't completely understood.

"It's not exactly clear how children learn. I'm convinced it's statistically initially, but then at a certain point you will see...it is not just statistical," he said. "They are reasoning. It's remarkable."

2 comments
AI a little oversimplified
Saying that AI is divided into only two schools of thought is a little too limited. Logic vs. probability is just one of the debates. This article forgets to mention the AI sect that focuses on adaptive (evolutionary) techniques, which model thinking as a process that evolves (learning is modeled as a form of adaptation). The article also doesn't recognize the less publicized but growing application of systems mathematics to model learning. AI is a huge field right now (merging with neurology in many places, like USC), and it is going in many directions as more and more researchers enter the field (thanks to things like Google and the success of AI algorithms, like fuzzy logic).
Posted by w1234cj (3 comments )
Nice thoughts, but already implemented in InfoCodex.
Thanks for the interesting article. Once again IBM is giving us a great vision about the future and how unstructured information can be searched.

InfoCodex already does all this today with the help of a linguistic database and synonym and/or similarity search across five languages (German, French, Italian, English and Spanish). With InfoCodex you can search for a block of text in one language, and it will find all the similar documents in the other languages as well. All of this is done without a single minute of training, because the linguistic database contains 2.9 million words and terms (e.g., "European Court of Justice" or "The President of the United States" are terms and recognized as such).

See the following links:

<a class="jive-link-external" href="http://www.ywesee.com/pmwiki.php/Ywesee/InfoCodexProcedure" target="_newWindow">http://www.ywesee.com/pmwiki.php/Ywesee/InfoCodexProcedure</a>

<a class="jive-link-external" href="http://www.ywesee.com/uploads/Ywesee/archimag-e.pdf" target="_newWindow">http://www.ywesee.com/uploads/Ywesee/archimag-e.pdf</a>

<a class="jive-link-external" href="http://www.ywesee.com/uploads/Ywesee/Evaluationsentscheid-e.pdf" target="_newWindow">http://www.ywesee.com/uploads/Ywesee/Evaluationsentscheid-e.pdf</a>

<a class="jive-link-external" href="http://www.ywesee.com/uploads/Main/USP_e.pdf" target="_newWindow">http://www.ywesee.com/uploads/Main/USP_e.pdf</a>
Posted by zdavatz (3 comments )
