Google Digital Books Progresses With reCAPTCHA Acquisition


Google Inc. (GOOG) has confirmed that it has purchased reCAPTCHA, the online fraud prevention service. reCAPTCHA’s technology is currently used by over 100,000 websites for tasks such as stopping automated spam on contact forms and new account registrations.

The basic premise behind a CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is to present an online user with a computer-generated challenge: a string of characters that are warped and usually set against a distorted background. The expectation is that a computer will fail to recognize the characters and so fail the test, whereas a real person will be able to read the text and pass the challenge.
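The challenge-and-check flow described above can be sketched in a few lines of Python. This is only an illustration, not reCAPTCHA's actual code: the names `ALPHABET`, `new_challenge`, and `check_response` are hypothetical, and the step that renders the string as a warped image is left out.

```python
import hmac
import secrets
import string

# Hypothetical alphabet: look-alike characters (0/O, 1/I/L) are left out
# so that a human is not penalized for an honest misreading.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "0O1IL"
)


def new_challenge(length: int = 6) -> str:
    """Return a random challenge string; the server keeps it secret
    and shows the user only a distorted image of it."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_response(expected: str, response: str) -> bool:
    """Return True if the typed response matches the challenge,
    ignoring case and surrounding whitespace."""
    normalized = response.strip().upper()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), normalized.encode())
```

A site would call `new_challenge()` when it serves the form and `check_response()` when the form is submitted; anything that cannot read the distorted image has only a negligible chance of guessing the string.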

Google expects the newly acquired technology to help it scan books and newspapers and make those resources available online.

“The words in many of the CAPTCHAs provided by reCAPTCHA come from scanned archival newspapers and old books,” Google Product Manager Will Cathcart stated. “Computers find it hard to recognize these words because the ink and paper have degraded over time, but by typing them in as a CAPTCHA, crowds teach computers to read the scanned text. In this way, reCAPTCHA’s unique technology improves the process that converts scanned images into plain text, known as Optical Character Recognition (OCR).”
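One way to picture how crowds could "teach" a computer a word it cannot read is a simple majority vote over what different people typed for the same scanned image. The sketch below assumes that approach; the function name `agreed_transcription` and the thresholds `min_votes` and `min_share` are hypothetical, and the article does not say how reCAPTCHA actually combines answers.

```python
from collections import Counter
from typing import Optional


def agreed_transcription(
    answers: list[str], min_votes: int = 3, min_share: float = 0.6
) -> Optional[str]:
    """Return the word most people typed for one scanned image, or None
    if there is not yet enough agreement (thresholds are assumptions)."""
    # Normalize case and whitespace, and drop empty submissions.
    votes = Counter(a.strip().lower() for a in answers if a.strip())
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    total = sum(votes.values())
    # Accept only when the top answer has enough votes and a clear majority.
    if count >= min_votes and count / total >= min_share:
        return word
    return None
```

With the default thresholds, the answers `["morning", "Morning", "morning ", "mourning"]` would settle on `"morning"`, while two conflicting answers would leave the word unresolved until more people have typed it.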

reCAPTCHA’s technology is expected to improve the accuracy of Google’s book scanning process.

[Source]
