August 22, 2005 6:00 PM PDT
Google dominates in machine translation tests
Google scored highest in Arabic-to-English and Chinese-to-English translation tests conducted by the National Institute of Standards and Technology. Each test consisted of translating 100 articles from Agence France-Presse and the Xinhua News Agency dated from Dec. 1, 2004, to Jan. 24, 2005. The results were posted earlier this month.
Although computerized translations historically have read more like broken English, increased processing power and larger data samples have allowed researchers to improve the accuracy of these systems.
Start-up Language Weaver, for instance, has created software that can translate Al Jazeera broadcasts. Researchers at Carnegie Mellon's Language Technologies Institute and other universities are also working on the problem. (Neither Language Weaver nor Carnegie Mellon took part in the recent test.)
Google's machine translation wasn't perfect, but it was well ahead of the competition. On a scale from zero to one, the company's software scored 0.5137 on the Arabic tests and 0.3531 on the Chinese tests. The University of Southern California's Information Sciences Institute came in second with a 0.4657 on Arabic tests and 0.3073 on Chinese. IBM scored 0.4646 on Arabic and 0.2571 on Chinese.
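To put those margins in concrete terms, the reported scores can be compared directly. The snippet below is only an illustrative sketch: the figures are the three results quoted above, while the names `SCORES` and `lead_over_runner_up` are made up for this example, and the other participants are left out because their scores are not given here.

```python
# Scores as reported above (scale of zero to one); other participants omitted.
SCORES = {
    "Google": {"Arabic": 0.5137, "Chinese": 0.3531},
    "USC ISI": {"Arabic": 0.4657, "Chinese": 0.3073},
    "IBM": {"Arabic": 0.4646, "Chinese": 0.2571},
}


def lead_over_runner_up(scores, test):
    """Return (leader, runner_up, margin) for one test, e.g. "Arabic"."""
    # Rank participants from highest to lowest score on the chosen test.
    ranked = sorted(scores.items(), key=lambda item: item[1][test], reverse=True)
    (leader, leader_scores), (runner_up, runner_scores) = ranked[0], ranked[1]
    margin = round(leader_scores[test] - runner_scores[test], 4)
    return leader, runner_up, margin


if __name__ == "__main__":
    for test in ("Arabic", "Chinese"):
        leader, runner_up, margin = lead_over_runner_up(SCORES, test)
        print(f"{test}: {leader} leads {runner_up} by {margin:.4f}")
```

By this arithmetic, Google's lead over the second-place finisher was a little under five hundredths of a point on each test.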
Other participants included the University of Edinburgh and the Harbin Institute of Technology. Most of the software tested came from research labs, according to the National Institute of Standards and Technology.
Google likely benefited from its huge store of source material. Generally speaking, translation software improves as more data gets fed to it. Through its search operations, Google has amassed billions of translated Web pages.
Like Yahoo and others, Google is looking toward the developing world for new customers. The company includes some machine translation tools on its site, as well as several international editions.
Google declined to comment. (Google representatives have instituted a policy of not talking with CNET News.com reporters until July 2006 in response to privacy issues raised by a previous story.)