CAPTCHA is Dead, Long Live CAPTCHA!

Has anyone seen efforts at captchas that reveal the letters in an animation? Something that is easy for the human eye to solve by watching the letters reveal or morph, but really hard for OCR techniques to solve, since there are too many frames to tie together to make sense of the word.

The “Find the doggie” and “select the word” approaches are instant failures: when there are 10 alternatives to choose from, even the dumbest (meaning: random) bot gets a 10% success rate.

So, all list-based alternatives are useless. (Of course, you can link 2 to n tasks with Y alternatives each, making the probability of a lucky guess 1/Y^n, which is still pretty bad.)
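To put numbers on the 1/Y^n argument, here is a minimal Python sketch. The function names are made up for illustration, and it assumes the bot guesses uniformly at random and that the tasks are independent.

```python
def blind_guess_success(alternatives: int, tasks: int = 1) -> float:
    """Chance that uniform random guessing passes every task in a chain.

    alternatives is Y (choices per task), tasks is n (tasks linked together).
    """
    if alternatives < 1 or tasks < 1:
        raise ValueError("alternatives and tasks must both be at least 1")
    return (1 / alternatives) ** tasks


def expected_passes(alternatives: int, tasks: int, attempts: int) -> float:
    """Expected number of successful registrations out of `attempts` blind tries."""
    return attempts * blind_guess_success(alternatives, tasks)


if __name__ == "__main__":
    # One task with 10 choices versus three chained tasks with 10 choices each.
    for n in (1, 3):
        rate = blind_guess_success(10, n)
        passes = expected_passes(10, n, 1_000_000)
        print(f"n={n}: success rate {rate:.4%}, "
              f"about {passes:.0f} passes per million attempts")
```

Even with three chained tasks, a bot that can afford a million attempts still gets through on the order of a thousand times, which is the sense in which 1/Y^n is "still pretty bad".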

ASCII art is just another failure. Convert the code area to PNG, and it is your regular OCR again.

And the words that regular OCR failed on… well, they were too hard for me too.

My answer?
Registration to any web service costs $1, via one or another single sign-on service. Half of the money goes to the web site that got the registration, the other half goes to the sign-on service provider. (And if you manage to make your first million with this, please remember me. :slight_smile:)

Hehe:

http://www.ubersite.com/m/113411

Is this the future? (Possibly NSFW due to two swear words.)

There are already human farms, alright. Some involve unwitting users solving CAPTCHAs for access to porn, and others involve low-paid workers overseas solving CAPTCHAs for money a la the “gold farming” model.

Here are observed cases of “CAPTCHAs for porn”:

http://www.linuxworld.com/community/?q=node/2400
http://www.theregister.co.uk/2007/10/31/captcha-busting_trojan/

And there are some cases of CAPTCHA farming:

http://ha.ckers.org/blog/20070427/solving-captchas-for-cash/ (be sure to read the comments for several farmers offering their services)

By the way, here is the source for all these recent “Google CAPTCHA broken” stories – one Websense blog post:

http://www.websense.com/securitylabs/blog/blog.php?BlogID=174

To be honest I suspect this is blown out of proportion. It looks a lot like another CAPTCHA-solving farm behind a web service API. (Observe the timestamps in the logs – 30 seconds to decode a CAPTCHA sounds like a human, not an algorithm, if you ask me.)

Another solution: stop using these damn registration pages and use OpenID.

Hint: this problem is not a nail. Your hammer is of no use.

Make the captcha too hard, and you’ll lock out many of your human readers. Life’s too short to spend it straining my eyes at distorted text.

Captcha: apple? banana? grape? peach? I KNOW! it’s cherry, right? No?
kiwi? guava? strawberry? secret? password? Oh please, just post my comment already.

Watermelon? Pineapple? Persimmon? Sweet potato? Lime? Lemon? Tangerine? Pomegranate? Olive? Nectarine? Pumpkin? Cantaloupe?

What about developing a system that uses VOIP to call a number and give the code, so as not to alienate users without cell phones? You could also sell ad space on the calls to make it generate some money.

It is probably like trying to kill a house fly with a bazooka, and not totally foolproof, but at least it makes some money too.

The next step in defense against spammers is probably using external ID authentication (Google, Passport, or OpenID).
The spammers' next step is therefore ID theft.

ISPs are very eager to fight a grandma who downloads an illegal song, but they seem not very interested in fighting spammers.

The only solution would be to apply ARIN/RIPE policy strictly, but it would kill business since most firms are not very careful about where their business mail comes from…

I run a website/forum for a World of Warcraft guild that I’m in. We used to get a lot of forum spam. What worked for us was to add a question to the registration form: a trivia question. In our case I asked a question that anyone who has leveled a character to 70 would know the answer to, but no one in a captcha-breaking sweatshop would be able to answer. A lot of topic-specific websites could use similar techniques to filter out spambots. You just need to tailor the questions for your audience.

After making this change we haven’t seen any spam posts.

In contexts where people come together around a specific interest, you have a better point of cleavage – not between people and machines, but between members of the in-group and everyone else, including people who wouldn’t be interested in what you are about as well as computers. As an example, an associate of mine has left a phpBB installation with just such a captcha replacement out on the 'net for the better part of a year now, and despite it being at the default location in the domain, no spam sign-ups have been recorded.

If you’re one of “us” for the purposes for which this was written, then signing up here

http://www.obsessivemathsfreak.org/phpbb/

should be trivial. Without some significant AI this isn’t going to admit a bot, and if you just play the captcha out of context in an unwitting mechanical turk attack (e.g. as part of a porn site login), you’re not going to get very many false positives.

Forums / Blogs,

Seriously, any form of validation that requires a user to enter anything but their blog comment or forum message is useless. It may be partially effective against automated means, but a human farm can break any of these ideas EXCEPT Bayesian filtering.

Once you train your Bayesian filter by marking actual spam as spam and good posts as good, only a very small percentage of spam makes it through. The ones that do make it through you mark as spam manually, which further ‘trains’ your filter. Simple.
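As a rough illustration of that train-and-mark loop, here is a small Python sketch of a naive Bayes text classifier with add-one smoothing. This is one common way "Bayesian filtering" is done, not necessarily what any particular blog or forum plugin uses; the class name, the tokenizer, and the example posts are all made up for the sketch.

```python
import math
import re
from collections import Counter


class BayesianSpamFilter:
    """Toy multinomial naive Bayes filter trained on posts marked spam or ham."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase word-ish tokens; a real filter would also look at URLs, headers, etc.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Record one post that a moderator marked as 'spam' or 'ham'."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(self.tokenize(text))

    def spam_probability(self, text):
        """Estimate P(spam | text); returns 0.5 if nothing has been trained yet."""
        total_docs = sum(self.doc_counts.values())
        if total_docs == 0:
            return 0.5
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        tokens = self.tokenize(text)
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed class prior, then one smoothed likelihood term per token.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokens:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_scores[label] = score
        # Turn the log-odds into a probability, clamped to avoid overflow in exp().
        diff = max(min(log_scores["ham"] - log_scores["spam"], 700.0), -700.0)
        return 1 / (1 + math.exp(diff))


if __name__ == "__main__":
    spam_filter = BayesianSpamFilter()
    spam_filter.train("cheap pills buy now http://pills.example", "spam")
    spam_filter.train("buy cheap watches now", "spam")
    spam_filter.train("the raid starts at eight tonight", "ham")
    spam_filter.train("great post, thanks for the guide", "ham")
    print(spam_filter.spam_probability("buy cheap pills"))
    print(spam_filter.spam_probability("thanks for the raid guide"))
```

Marking a spam post that slipped through is just another call to `train(text, "spam")`, which is the manual "further training" step described above.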

Email providers,
Bayesian filtering won’t help you prevent people from creating spam-sending accounts.

Quote
http://www.obsessivemathsfreak.org/phpbb/

should be trivial. Without some significant AI this isn’t going to admit a bot, and if you just play the captcha out of context in an unwitting mechanical turk attack (e.g. as part of a porn site login), you’re not going to get very many false positives.
/Quote

2 Problems:

  1. Too “good”: it beat me too. (I know the movies; I don’t remember the names.)
  2. There seems to be a list of maybe 7 or 8 answers? If this were an interesting server (i.e. one that promised, for example, the ability to send spam mail), a dumb bot that always gives just one of the correct answers would get around a 12% success rate.

Pre-defined lists are not an answer to the issue.

Quote
If you’re one of “us” for the purposes for which this was written, then signing up here

http://www.obsessivemathsfreak.org/phpbb/

should be trivial. Without some significant AI this isn’t going to admit a bot, and if you just play the captcha out of context in an unwitting mechanical turk attack (e.g. as part of a porn site login), you’re not going to get very many false positives.
/Quote

Yeah, right. I wasn’t able to answer a single one out of ten. :smiley: Obviously, I don’t belong to the targeted audience.

@KG

No, desktop SMTP is slowly being replaced by web based SMTP clients. SMTP is still there.

How about, instead of testing whether a human is filling out the form, just making sure to map an account to something pretty unique

SUCH AS A CELLPHONE.

Problem solved :slight_smile:

Thanks for the heads up regarding Asirra (the “click on all of the cat pictures” CAPTCHA). It’s definitely way less tedious (and more fun) than the standard text CAPTCHAs…

I’ve had good luck using form-morphing techniques to prevent spam: http://nedbatchelder.com/text/stopbots.html. It won’t stop a human, but what will?

Considering most internet sites are niche sites (company sites, sites catering to some kind of group, etc.), you should always write your auth according to that group. If you have a webmaster forum, just ask questions webmasters should be able to answer; if you have a tattoo forum, ask them about that kind of thing. This can be made more difficult by warping the text of the question differently every time, or by using pictures, so that a bot has maybe a 1 in 5 success rate with OCR and still has to know the answer.

As a previous poster already said, if I run a forum/site about X, I want people with a brain to comment on things, not a moron, so I don’t care about people who ‘fail’ the test. They cannot join. Mala Suerte.

Another method is to hold back the comments/forum posts by new members and auto-check them against a bunch of heuristics; I have quite a bit of success with that. I simply grep out all http addresses in posts (using heuristics to ‘fix’ URLs that are broken up, etc.) and submit them to Google. If I find too many of them on unrelated forums (you can use Google queries to do that), I auto-flag the post and send myself a message. Spammers have a goal with their spamming and they don’t, currently, have infinite resources to prevent me from finding it and blocking it. I have a very high success rate using this technique.
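A minimal Python sketch of that kind of check might look like the following. The repair rules and the threshold are guesses at what such heuristics could be, not the poster's actual ones, and the search step is left as a hypothetical callable (`count_unrelated_mentions`) that you would have to back with whatever search service you use.

```python
import re
from typing import Callable, List

URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+", re.IGNORECASE)


def repair_broken_urls(text: str) -> str:
    """Undo a few common obfuscations: 'hxxp', 'http : //', and '[dot]'."""
    text = re.sub(r"\bhxxp(s?)\b", r"http\1", text, flags=re.IGNORECASE)
    text = re.sub(r"\b(https?)\s*:\s*/\s*/\s*", r"\1://", text, flags=re.IGNORECASE)
    text = re.sub(r"\s*[\[(]dot[\])]\s*", ".", text, flags=re.IGNORECASE)
    return text


def extract_urls(post_text: str) -> List[str]:
    """Return the http(s) addresses found in a post after repairing them."""
    urls = URL_PATTERN.findall(repair_broken_urls(post_text))
    # Drop sentence punctuation that got glued onto the end of a URL.
    return [url.rstrip(".,;:!?") for url in urls]


def should_flag(post_text: str,
                count_unrelated_mentions: Callable[[str], int],
                threshold: int = 5) -> bool:
    """Flag a post if any of its URLs shows up on too many unrelated sites.

    count_unrelated_mentions is a placeholder for the search lookup.
    """
    return any(count_unrelated_mentions(url) > threshold
               for url in extract_urls(post_text))


if __name__ == "__main__":
    # Stand-in for a real search query: a fixed table of made-up counts.
    fake_counts = {"http://pills.example/buy": 40}
    post = "Great thread! Buy at hxxp://pills[dot]example/buy today."
    print(extract_urls(post))
    print(should_flag(post, lambda url: fake_counts.get(url, 0)))
```

A flagged post would then go into a moderation queue rather than being deleted outright, since a popular legitimate link can also appear on many forums.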

Basically my (long-winded) point is that you should tailor your protection to the site you are protecting, and you won’t have many spam problems.

I believe CAPTCHA has been broken, and if you believe this post:

http://www.mperfect.net/aiCaptcha/

It has been broken for a very long time. With a little time and effort you could recognize any letter that has been distorted, especially if you analyze the average pattern of the letter.

Even if it hasn’t been broken, services like the Mechanical Turk make it much harder to tell a human who has good intentions from one who has bad intentions.

http://www.mturk.com/mturk/welcome

And the CAPTCHA definitely isn’t going away. I was just asked to create one for the ASP.NET MVC Framework for a project that I am working on.

http://www.coderjournal.com/2008/03/aspnet-mvc-captcha/

Solutions are only going to get harder and harder. One growing method that I have seen to prevent bots is to exploit their weakness when it comes to JavaScript. Basically you add an authentication string to the POST, and the string is only fetched from the server via AJAX moments before the form is submitted. But that doesn’t really solve the problem, because if AJAX can get it, so can a bot.
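For the server side of that idea, one possible sketch in Python is a short-lived signed token: the page's JavaScript would request it just before submitting, and the POST handler would reject anything missing, forged, or stale. The function names, the 120-second lifetime, and the token layout are all assumptions made for this example, not a description of any particular framework.

```python
import hashlib
import hmac
import secrets
import time

SECRET_KEY = secrets.token_bytes(32)  # assumed per-deployment secret, kept server-side
MAX_TOKEN_AGE_SECONDS = 120           # assumed lifetime; tune for your form


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(form_id: str, now: float = None) -> str:
    """Build the string the AJAX call would hand back just before submit."""
    issued_at = int(time.time() if now is None else now)
    payload = f"{form_id}:{issued_at}"
    return f"{payload}:{_sign(payload)}"


def verify_token(token: str, form_id: str, now: float = None) -> bool:
    """Accept the POST only if the token is authentic, for this form, and fresh."""
    try:
        token_form_id, issued_at_text, signature = token.rsplit(":", 2)
        issued_at = int(issued_at_text)
    except ValueError:
        return False
    expected = _sign(f"{token_form_id}:{issued_at_text}")
    # Constant-time comparison on bytes, so odd input cannot raise or leak timing.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    if token_form_id != form_id:
        return False
    age = (time.time() if now is None else now) - issued_at
    return 0 <= age <= MAX_TOKEN_AGE_SECONDS


if __name__ == "__main__":
    token = issue_token("comment-form")
    print(verify_token(token, "comment-form"))        # fresh, untampered token
    print(verify_token(token + "x", "comment-form"))  # signature no longer matches
```

This only filters out bots that never run the page's JavaScript; as noted above, a bot that makes the same AJAX request gets a perfectly valid token.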

It is a no-win situation with the current stateless web.

Sorry about the double submit; there was a hiccup in the form that said permission was denied for copying the HTML file.