April 23, 2004 4:00 AM PDT
IBM expands search push with Masala
The computing giant, based in Armonk, N.Y., is gearing up to release Masala, a new version of its DB2 Information Integrator software that will let corporate employees retrieve information from databases, applications and the Web at the same time. Subsequent improvements will include a data-mining component code-named Criollo.
Although the market for this type of middleware is small, interest is strong. About 1,300 IBM customers use the first version of Information Integrator, which came out almost a year ago, said Nelson Mattos, director of the Information Integration Software Group at IBM. Only 120 were involved in the beta program, which ran from July 2002 to June 2003.
Several large and small companies are devising technology to help people retrieve foreign-language documents, 3D and 2D drawings, old e-mails and other hard-to-find material from the nether regions of their hard drives.
"Tools like this in general are not going to blow away relational databases and their ilk anytime soon, but IBM has done a good job of tapping into other products," said Steve O'Grady, an analyst at RedMonk. "When you look at the Microsofts and Oracles of the world, they are generally still talking a centralized message."
Microsoft, though, isn't standing still. It is working on its own distributed search plan with Longhorn and a new release of its SQL Server database, code-named Yukon, and plans to build its own Internet search service. BEA Systems and others are working on similar technology.
IBM has released a beta version of Masala's natural-language search component to its customers and will release a full beta in the first part of May. So far, the full beta has gone to only a few select customers. Commercial availability will begin in the second half of the year.
Keeping track of necessary data "is a major pain point. People have information all over the place," Mattos said.
The rapid adoption of Information Integrator prompted IBM to increase the budget by 50 percent, Mattos said. Further, in February, IBM created the Information Integration Leadership Council. The group of 12 customers, including Kawasaki and Merrill Lynch, is testing the full Masala beta and advising IBM.
Information Integrator is a software layer that can pull data from different software--Oracle databases, Microsoft Excel, IBM's own DB2 and Lotus databases--with a single query. IBM and other companies are touting this "federated" database approach, in which searches tap into spread-out data sources, as a potentially cheaper alternative to shipping and storing large amounts of information in a single database.
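To make the "federated" idea concrete, here is a minimal sketch in Python. It is not IBM's software or API: the source names, the `parts` table and the `federated_lookup` helper are all hypothetical, and two in-memory SQLite databases plus a CSV string merely stand in for separate databases and a spreadsheet. The point is only that one lookup is sent to every source where the data already lives, and the answers are merged, rather than copying everything into one central database first.

```python
import csv
import io
import sqlite3


def make_sqlite_source(rows):
    """Build an in-memory SQLite database standing in for one relational source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parts (part_no TEXT, qty INTEGER)")
    conn.executemany("INSERT INTO parts VALUES (?, ?)", rows)
    return conn


def query_sqlite(conn, part_no):
    """Look up one part number in a SQLite-backed source."""
    cur = conn.execute(
        "SELECT part_no, qty FROM parts WHERE part_no = ?", (part_no,)
    )
    return cur.fetchall()


def query_csv(text, part_no):
    """Look up one part number in CSV text (a stand-in for a spreadsheet)."""
    reader = csv.DictReader(io.StringIO(text))
    return [(r["part_no"], int(r["qty"])) for r in reader if r["part_no"] == part_no]


def federated_lookup(sources, part_no):
    """Send the same lookup to every source and merge the rows,
    tagging each row with the name of the source it came from."""
    results = []
    for name, fetch in sources.items():
        for found_part, qty in fetch(part_no):
            results.append((name, found_part, qty))
    return results


# Hypothetical data: two databases and one spreadsheet, each kept in place.
warehouse_db = make_sqlite_source([("A-100", 4), ("B-200", 0)])
dealer_db = make_sqlite_source([("A-100", 1)])
spreadsheet = "part_no,qty\nA-100,7\nC-300,2\n"

SOURCES = {
    "warehouse_db": lambda p: query_sqlite(warehouse_db, p),
    "dealer_db": lambda p: query_sqlite(dealer_db, p),
    "spreadsheet": lambda p: query_csv(spreadsheet, p),
}

if __name__ == "__main__":
    for row in federated_lookup(SOURCES, "A-100"):
        print(row)
```

A production federation layer would also have to translate the query into each source's own dialect and cope with sources that are slow or unavailable; the sketch leaves all of that out.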
The company's massive research arm is delving deeper into the science of information retrieval with projects like WebFountain and the Institute for Search and Text Analytics, an internal think tank dedicated to studying data warehousing, query structure, and the inevitable crossover between database technology and search.
Masala improves on the first version by expanding the range of data sources it can reach. Natural-language queries will also be possible.
IBM is looking at adding further elements. The company is tinkering with Criollo to perform data mining-type experiments. Executives who want to compare manufacturing activity with sales output will be able to do so with more current results, Mattos said.
Future versions of Information Integrator will function more intelligently. The system will monitor how corporate users interact with their data and store it, Mattos said, and then provide recommendations about consolidating databases, caching or synchronizing based on observed behavior.
"There is so much redundant data that sometimes there is data so old that you shouldn't use it," Mattos said.
So far, Information Integrator has paid dividends to some customers. Kawasaki was able to create a system with which dealers could search the entire dealer network for spare parts, cutting repair times. A Chinese insurer used it to merge customer databases collected through acquisitions.
Another company used the system to study how it was exploiting its software and hardware base. The company had two identical machines to handle particularly large workloads. It found that one machine was used 90 percent of the time.
"They were investing in a bunch of hardware and software they weren't really using," Mattos said.
These initial payoffs encourage IBM. With the full beta being released next month, the company expects customers to uncover more.
CNET News.com's Martin LaMonica and Mike Ricciuti contributed to this report.