CAPTCHA and Microsoft Asirra

A CAPTCHA is a quick test a web server uses to tell whether whoever is accessing it is a human or a computer script. It might display a few slightly scrambled characters and ask you to type them back in. Since a simple script isn't doing image processing, the odds of a script getting past the page (or signing up for an account!) are pretty slim.

The problem is usability. I'd consider myself pretty good with computers, and occasionally I'll hit a site that scrambles the CAPTCHA enough that I need two or three tries. Less computer-savvy users are going to have significant problems signing up for an account. You can make the scrambling less aggressive... but that ups the number of scripts that can slip by.

Microsoft Research seems to have come up with a hugely usable solution, building on the work of several other groups. Take an image, base a "yes or no" question on it, and have the user cycle through ten pictures quickly. A script guessing at random has a 1 in 2^10 (1 in 1,024) chance of getting them all right, but if the pictures are simple enough, the average user is almost certain to get them all right. Other groups had done this with relatively small pools of pictures, which a script could eventually memorize, learning which images should get which answer. Microsoft brought in a business partner and used three million known images, asking the easy-to-answer question:

"Is this a dog or a cat?"

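To put numbers on the guessing odds, here is a quick Python sketch. The function names and the 99% per-picture accuracy for a human are assumptions made up for illustration; they are not figures from Microsoft.

```python
from fractions import Fraction


def guess_all_correct(num_images: int, choices: int = 2) -> Fraction:
    """Chance that uniform random guessing gets every picture right."""
    return Fraction(1, choices) ** num_images


def human_pass_rate(per_image_accuracy: float, num_images: int) -> float:
    """Chance a person gets every picture right, assuming each answer
    is independent and equally likely to be correct (an assumption)."""
    return per_image_accuracy ** num_images


script_odds = guess_all_correct(10)
print(script_odds)          # 1/1024
print(float(script_odds))   # 0.0009765625

# Hypothetical human who identifies each picture correctly 99% of the time.
print(human_pass_rate(0.99, 10))
```

Under that assumed 99% figure, a person still passes all ten pictures roughly nine times out of ten, while a blind guesser needs about a thousand tries on average to get through once.
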
(And for the record, CAPTCHA is a Completely Automated Public Turing-test to tell Computers and Humans Apart. The acronym feels like a stretch, so it goes down here at the bottom.)

1 comment:

Anonymous said...

FWIW, captcha has also been broken (other references exist as well), making it no longer a viable option. We've seen adversaries take advantage of this over the past few months.

Ironic that I now must type in a captcha match to submit this blog posting :-)