re-CAPTCHA-ing the word
Google is digitizing the world's books with the help of website users, who decode smudged or blurry text through a technology called reCAPTCHA.
This story was originally covered by PRI's Here and Now.
In 2010, Google announced that by 2020 it would scan all of the world's books, a number that Google estimates to be about 130 million. Scanning these books does not just mean taking a picture of every page; the books will also be converted to a format that is readable by computers. "Optical character recognition" software turns the page images into text, which will allow computers to search for a word throughout a book. This is no easy task, since many older books have smudges on the page or faded ink that can stump computers. Much of that text can be figured out only by a person.
For books written more than 50 years ago, 30 percent of the text is indecipherable by a computer, according to Dr. Luis von Ahn of Carnegie Mellon University. The solution to decoding so many books is reCAPTCHA, a technology designed by von Ahn. Using this technology, 100 million words are deciphered every day with the help of everyday people.
Here's how:
When someone buys tickets to the next live taping of her favorite radio program, she is asked, before paying, to retype the garbled word she sees in the box above. This word is a CAPTCHA, or a "Completely Automated Public Turing test to tell Computers and Humans Apart." It's a security feature to make sure that the ticket buyer is a person instead of a robo-computer buying up all the tickets for a scalper. Many sites use this tool, but what the buyer doesn't know is that the word she's decoding may be from a scanned book. The word looks smudged, not because the computer generated a smudged word, but because the word was smudged on the original page.
If the CAPTCHA word is from a book, it's part of von Ahn's reCAPTCHA program, and everyday people are truly reviving the books of yesterday.
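The story does not say how the system can tell whether a typed answer is right when the computer itself cannot read the word. One way to picture it, as a minimal Python sketch: show the scanned word alongside a control word whose answer is already known, and count the scanned-word answer as a vote only when the control word is typed correctly. The class and function names, the control-word pairing, and the three-vote threshold are all illustrative assumptions, not details from the story or from reCAPTCHA's actual code.

```python
from collections import Counter


def normalize(text):
    """Lowercase and trim so trivial typing differences do not split votes."""
    return text.strip().lower()


class UnknownWord:
    """A scanned word the computer could not read, plus human guesses so far.

    Hypothetical helper for illustration only.
    """

    def __init__(self, word_id):
        self.word_id = word_id
        self.votes = Counter()

    def add_vote(self, guess):
        self.votes[normalize(guess)] += 1

    def accepted_reading(self, min_votes=3):
        """Return the most common guess once it has min_votes, else None.

        The threshold of 3 is an assumed value, not a documented one.
        """
        if not self.votes:
            return None
        guess, count = self.votes.most_common(1)[0]
        return guess if count >= min_votes else None


def check_challenge(control_answer, typed_control, unknown, typed_unknown):
    """Pass the user if the control word matches; only then record the vote."""
    passed = normalize(typed_control) == normalize(control_answer)
    if passed:
        unknown.add_vote(typed_unknown)
    return passed
```

In this sketch, a ticket buyer who types the control word correctly is treated as human, and her reading of the smudged word is stored. Once enough people agree on the same reading, `accepted_reading()` returns it as the deciphered text.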
---------------------------------------------------------------------------------
"Here and Now" is an essential midday news magazine for those who want the latest news and expanded conversation on today's hot-button topics: public affairs, foreign policy, science and technology, the arts and more.