Thursday, January 18, 2007

ACAPTCHA - Almost Completely Automated Public Turing test to tell Computers and Humans Apart

(Welcome reddit users)

Captchas generally (but not always) solve the problem of comment spam and other spam. But this comes at a price. Users with low vision and other disabilities find solving captchas hard. And blind users can't solve them unless you provide an alternative audio captcha. Why, even Seth hates it!
Negative captcha, where you hide form fields via CSS so that users can't see them and hence don't fill them in, while bots will, is an interesting possibility. But let me introduce ACAPTCHA, the "Almost Completely Automated Public Turing test to tell Computers and Humans Apart", to you. This is what you do.

1. There are some questions which are very easy for humans to answer but very difficult for bots to understand. Take "What color is a blue towel?" or "Is a green towel red?". Any (well, most) humans can answer that question in a snap, but probably no bot can.
2. Create a centralized AND rapidly changing repository of such questions. Maybe allow users to submit new questions and answers there. Maybe peer review questions before accepting them. Whatever you do, get a large and fast-changing repository.
3. Create a plugin/architecture where you get a random question from the repository (à la Akismet, which is a distributed anti-spam engine) and ask users to solve it.
There are already some sites which try to do something similar. They ask a simple question, something like "What is 2 + 2". The problem is that this is probably very easy to break. As soon as it becomes mainstream, you can be sure that the bots will break through and abuse it. To beat completely automated systems, you need to bring in human intelligence.
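The three numbered steps above can be sketched roughly like this. It is only a minimal illustration, not a real plugin: the in-memory `QUESTIONS` list stands in for the central repository, and the names `pick_question`, `normalize` and `is_correct` are made up for this example.

```python
import random

# Hypothetical stand-in for the central repository: each entry is a
# question plus the set of answers we are willing to accept for it.
QUESTIONS = [
    ("What color is a blue towel?", {"blue"}),
    ("Is a green towel red?", {"no"}),
]


def normalize(answer: str) -> str:
    """Lowercase, collapse whitespace and drop surrounding punctuation."""
    return " ".join(answer.lower().split()).strip(" .!?")


def pick_question(rng=random):
    """Step 3: hand the site one random question-answer pair."""
    return rng.choice(QUESTIONS)


def is_correct(accepted: set, submitted: str) -> bool:
    """Compare what the visitor typed against the accepted answers."""
    return normalize(submitted) in accepted


if __name__ == "__main__":
    question, accepted = pick_question()
    reply = input(question + " ")
    print("Looks human" if is_correct(accepted, reply) else "Try again")
```

Normalizing the reply matters more than it looks: humans will type "Blue", " blue " or "No." and expect all of them to pass.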

Updates -
Foo asked: "The repo would have to include the *answers* and be as easily downloadable, right? Right. So Mr. Spammer wins again."
And I say: Well, no. The idea is that the central repository has, say, a million questions and answers. Whenever any site wants to check a user with an ACAPTCHA, it asks for a question-answer pair (using an API). Now no one except the repository has all the questions, and each time the spammers get a new question. This is why you need the repository to get new questions quickly, so that spammers cannot build up a bank of questions over time and know their answers.
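As a rough sketch of that reply, here is one way the repository side might behave. Everything in it is an assumption made for illustration: the `QuestionRepository` class, the `fetch_pair` call and the `max_served` limit are hypothetical, and retiring a question after a few uses is just one possible way to keep the pool changing quickly.

```python
import secrets
from dataclasses import dataclass


@dataclass
class Entry:
    question: str
    answer: str
    served: int = 0  # how many times this pair has been handed out


class QuestionRepository:
    """Hypothetical central repository; sites only ever see one pair at a time."""

    def __init__(self, max_served: int = 3):
        self._pool = []
        self._max_served = max_served  # assumed retirement threshold

    def submit(self, question: str, answer: str) -> None:
        """New questions keep flowing in (user submissions, peer review, ...)."""
        self._pool.append(Entry(question, answer))

    def fetch_pair(self):
        """What the API would return to a site: a single random pair."""
        if not self._pool:
            raise LookupError("repository is empty; new questions are needed")
        entry = secrets.choice(self._pool)
        entry.served += 1
        if entry.served >= self._max_served:
            # Retire well-worn questions so a harvested bank goes stale.
            self._pool.remove(entry)
        return entry.question, entry.answer

    def __len__(self) -> int:
        return len(self._pool)
```

The point of the design is that there is no "download everything" call at all, and any pair a spammer does collect stops being asked soon afterwards.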

2 comments:

bueno said...

The idea of a Turing Test captcha is good. The idea of a central repository is bad. The repo would have to include the *answers* and be as easily downloadable, right? Right. So Mr. Spammer wins again.

shabda said...

Well, no. The idea is that the central repository has, say, a million questions and answers. Whenever any site wants to check a user with an ACAPTCHA, it asks for a question-answer pair (using an API). Now no one except the repository has all the questions, and each time the spammers get a new question. This is why you need the repository to get new questions quickly, so that spammers cannot build up a bank of questions over time and know their answers.