January 7, 2008 8:09 AM PST
Google applies for image text patent
The patent application, filed in June 2007, was published on Thursday and covers "methods, systems and apparatus including computer program products for using extracted image text," according to Google.
"In one implementation, a computer-implemented method is provided," reads the abstract for the application. "The method includes receiving an input of one or more image-search terms and identifying keywords from the received one or more image search terms. The method also includes searching a collection of keywords including keywords extracted from image text, retrieving an image associated with extracted image text corresponding to one or more of the image-search terms, and presenting the image."
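The steps the abstract lists (identify keywords in the query, search a collection of keywords extracted from image text, retrieve the matching images) can be illustrated with a small inverted index. The sketch below is only an illustration under stated assumptions, not Google's method: the function names, the stopword list, the ranking rule and the sample image text are all hypothetical, and the text is assumed to have already been extracted from each image by some OCR step.

```python
import re
from collections import defaultdict

# Hypothetical stopword list; the application does not specify one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "or", "for", "to"}


def identify_keywords(text):
    """Lowercase the text, split it into alphanumeric tokens, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_keyword_index(extracted_text_by_image):
    """Map each keyword found in extracted image text to the images containing it."""
    index = defaultdict(set)
    for image_id, text in extracted_text_by_image.items():
        for keyword in identify_keywords(text):
            index[keyword].add(image_id)
    return index


def search_images(query, index):
    """Return image ids ordered by how many query keywords their text matches."""
    scores = defaultdict(int)
    for keyword in set(identify_keywords(query)):
        for image_id in index.get(keyword, ()):
            scores[image_id] += 1
    # Most matches first; ties broken alphabetically by image id.
    return sorted(scores, key=lambda image_id: (-scores[image_id], image_id))


# Hypothetical sample: text assumed to have been read from signs in three photos.
EXTRACTED_TEXT = {
    "img_001.jpg": "Welcome to Baker Street Station",
    "img_002.jpg": "Baker's Dozen Cafe - Open Daily",
    "img_003.jpg": "No parking on this street",
}

INDEX = build_keyword_index(EXTRACTED_TEXT)
print(search_images("baker street", INDEX))
```

In this toy data set, a query for "baker street" ranks the station photo first because its sign matches both keywords, while the other two photos match one keyword each. A production system would presumably combine such matches with other ranking signals, which the abstract does not describe.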
Google, which already owns both the most widely used image search facility on the Internet and the leading video site, YouTube, has much to gain from being able to correctly interpret text held within images and video. Such a capability could, for example, be used to create more accurate keywords, to tag files automatically, or to identify where a picture was taken based on signage in the background.
However, on Monday a company spokesperson offered Google's standard reply to questions regarding patent applications. "We file patent applications on a variety of ideas that our employees come up with," said the spokesperson. "Some of those ideas later mature into real products or services, some don't. Prospective product announcements should not necessarily be inferred from our patent applications."
The patent application is not Google's first foray into optical character recognition (OCR), a technology currently used mostly for scanning documents into word-processor-friendly formats.
In September 2006 the company helped debug an old OCR engine called Tesseract--originally developed by Hewlett-Packard--and released it as open source. At the time, Google also quietly mentioned that it was eager to hire "top-notch OCR engineers."
David Meyer of ZDNet UK reported from London.