Webware

September 16, 2009 10:26 AM PDT

Google acquires ReCaptcha as book-scanning aid

by Tom Krazit

With the ReCaptcha acquisition, Google can improve security on its sites and make its book-scanning project smarter.

(Credit: Google)

Google has acquired ReCaptcha, one of the companies behind the distorted text boxes at the bottom of many Web site sign-in pages.

Terms of the deal were not disclosed, but Google plans to use ReCaptcha's technology both as a security measure within certain Google sites and to make its massive book-scanning project a little smarter, the company said in a blog post. ReCaptcha is an offshoot of Carnegie Mellon University's School of Computer Science, and puts a twist on the traditional captcha: a string of letters in squiggly text meant to confuse spam bots and other nonhuman Web pests.

The idea behind a captcha is to confuse a computer, but computers are also confused by some words written in fonts used long ago. ReCaptcha offers two words, one of which is a captcha it already knows, and one of which is a word it doesn't know. The thinking is that if you get the first word right, you're likely a human and you're also probably going to get the second one right.

It can then pool all the answers for the second word and declare with a reasonable amount of certainty that the second word is what most people think it is, thereby updating the vocabulary of participating book scanners. This is of obvious interest to Google, currently bent on scanning as many books as it can find.
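As a rough illustration of the two-word scheme described above, here is a minimal Python sketch. It is not ReCaptcha's actual code: the function names, the case-insensitive matching, and the `min_votes` and `min_share` thresholds are all assumptions made for the example. A guess for the unknown word is counted only when the known word is answered correctly, and the unknown word is declared solved once enough people agree.

```python
from collections import Counter


def normalize(text):
    """Compare answers case-insensitively and ignore stray whitespace (an assumption)."""
    return text.strip().lower()


def record_guess(votes, control_answer, control_guess, unknown_guess):
    """Count the guess for the unknown word only if the known (control) word is right.

    Returns True if the user passed the captcha, False otherwise.
    """
    if normalize(control_guess) != normalize(control_answer):
        return False  # likely a bot or a typo: discard the whole response
    votes[normalize(unknown_guess)] += 1
    return True


def consensus(votes, min_votes=3, min_share=0.6):
    """Return the agreed reading of the unknown word, or None if there is no consensus yet.

    min_votes and min_share are hypothetical thresholds chosen for illustration.
    """
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= min_share else None


# Example: the control word is "morning"; the unknown word is being voted on.
votes = Counter()
record_guess(votes, "morning", "morning", "upon")
record_guess(votes, "morning", "Morning", "upon")
record_guess(votes, "morning", "mourning", "spam")  # control wrong, vote ignored
record_guess(votes, "morning", "morning", "upun")
record_guess(votes, "morning", "morning", "Upon")
result = consensus(votes)
```

In this run, three of the four accepted responses agree on "upon," so that reading clears the assumed 60 percent threshold.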

Originally posted at Relevant Results
May 24, 2007 12:02 PM PDT

ReCaptcha: The smartest way to deal with something annoying

by Josh Lowensohn

Spam, zombie robots, and the rest of the dark underbelly of the Internet have led to one of the Web's big annoyances: the captcha. That's the barely readable block of random letters you must translate in order to prove your humanness, and it's supposedly the one thing that separates us from the machines. It's also used in nearly every site registration process--and more recently at site logins. The bottom line is that it's annoying but also utterly necessary to keep evil at bay.

Enter reCAPTCHA, a project of the School of Computer Science at Carnegie Mellon University. A mix between the disease-curing Folding@Home and MyCroft, reCAPTCHA requires users to solve two jumbled words: one is the actual captcha, and the other is just a word that needs to be translated into text. These words come from various scanned books and documents residing on the Internet Archive. Many of those books were written before computers, and in their current state (PDFs and image files) they are just glorified photographs--a medium that is still hard to sort through. Once complete, they'll be digital text, and completely searchable.

Words for translation are not just chosen at random. Documents that have been scanned get checked by an Optical Character Recognition (OCR) engine, which is able to pick up many of the words. Those that are misspelled by OCR, or are impossible to read, are plucked and put into the ReCaptcha word pool. Sites can implement ReCaptcha in several ways. There are plug-ins for WordPress, MediaWiki, phpBB, and PHP.
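The word-selection step described above might look something like the following Python sketch. This is a guess at the general shape, not the project's real pipeline: the `(text, confidence)` pairs, the dictionary lookup used as a stand-in for "misspelled," and the `min_confidence` cutoff are all hypothetical.

```python
def select_for_pool(ocr_words, dictionary, min_confidence=0.8):
    """Pick the words a human should look at.

    ocr_words: list of (text, confidence) pairs, a hypothetical OCR output format.
    dictionary: set of known lowercase words, used here to flag likely misreads.
    min_confidence: assumed cutoff below which a word counts as unreadable.
    """
    pool = []
    for text, confidence in ocr_words:
        unreadable = confidence < min_confidence
        misspelled = text.lower() not in dictionary
        if unreadable or misspelled:
            pool.append(text)
    return pool


# Example: "qnick" is a confident misread, and "fox" was read with low confidence.
dictionary = {"the", "quick", "fox"}
ocr_words = [("The", 0.99), ("qnick", 0.95), ("fox", 0.40)]
pool = select_for_pool(ocr_words, dictionary)
```

Here "The" is left alone, while "qnick" and "fox" would go into the pool for people to solve.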

I've embedded a sample ReCaptcha below. You'll notice both words look similar, as ReCaptcha is using both words from the same source, so you can't tell which one has already been solved.

