April 11, 2007 7:31 AM PDT

Google backs character-recognition research

Google is sponsoring an artificial-intelligence research group's work to develop advanced technologies for character recognition.

The open-source project, called Ocropus, has several goals, including developing a high-level, easy-to-use handwriting recognition system that can convert handwritten documents to computer text, assisting in the creation of electronic libraries, analyzing historical documents and helping vision-impaired people access information. The "ocr" in Ocropus stands for optical character recognition.

The project is headquartered at the Image Understanding and Pattern Recognition (IUPR) research group at the German Research Center for Artificial Intelligence (DFKI) in Kaiserslautern, Germany. DFKI Professor Thomas Breuel is leading the project.

Breuel made the announcement on Monday through a post on the Google Code blog. In addition to Google's sponsorship, Ocropus is getting funds from several German government agencies and other public and private entities.

The Ocropus team expects the project to last three years, and it will support three Ph.D. students or postdoctoral students. IUPR is basing the software primarily on two research projects: one, a handwriting recognition system developed in the mid-1990s for use by the U.S. Census Bureau; and two, newer layout analysis methods for character recognition.

Other resources include Tesseract, a decades-old engine for optical character recognition originally developed by Hewlett-Packard Labs and re-released by Google last year as an open-source system.

A preview of the Ocropus system is available on the project's Web site under an Apache license, and the IUPR is soliciting open-source contributions in order to complete a number of goals. These include creating a desktop application for the system, adding third-party tools and adapting Ocropus to a variety of languages. It's currently English-only.


Join the conversation!
Optimal or Optical?
I have never heard of "Optimal Character Recognition". Maybe that's a new term for this project, but I believe all the old ones are "Optical..."
Posted by mrjam32 (8 comments )
Should be Optical
See <http://en.wikipedia.org/wiki/Optical_character_recognition>.
Posted by _Seffer_ (15 comments )