Attempting to take the upper hand in the battle against bots, researchers from Google have devised a new CAPTCHA system that uses a series of randomly rotated images to distinguish between human visitors and automated scripts. The technique, detailed in a paper titled What's Up CAPTCHA? (PDF), presents people signing up for site …
Seems like a spin on the Neopets captcha system, geared toward those with limited literacy, which has been around for quite some time. As I remember, it consisted of a sizable image containing a fuzzed-out Neopets character among many other items. The user is expected to click on a certain character.
doesn't look too hard
perform systematic rotations, check with a 2d minimum edit distance going hotter/colder towards the true orientation. Probably lots of optimisations possible by looking at edges + large colour areas + heuristics
Not my area actually, but seems doable. min. edit distance calculation might be a bit expensive I suppose.
should have read the article properly. Thought it was orientation matching not which way up should it be.
correct answer for B
could be either, assuming that's a Norwegian Blue pictured on the left-hand side...
I would actually prefer this to the current captchas, which are getting so bad now that I can't even tell what they are. I forget where, but I saw one in which the O's and 0's looked identical, and Z and 2, 5 and S, and I, 1 and l were hard to tell apart, since they were rotated in random directions with strikeouts going through them at random angles.
Services should not be so easy to sign-up anyway..
All this CAPTCHA thing is useless in the first place. Online services (of any kind) must be elitist, and admittance should be a gruelling task. It should involve questions, essays and at least an interview by phone. Three members of good standing should vouch for the entrant-to-be. There should be levels of access according to merit AND seniority.
Access should be similar to applying for a job. This "for the masses" fad should be held off, until the LCD is "literate" enough, can and does RTFM, can conduct a properly focused search on google AND able to work on from there, etc, etc. "Leveling up" -as it were- should be subject to a skill test.
Somebody should start a church of technological priesthood one of these days - no mysticism, but with all the fun nomenclature accumulated over the dark ages. I'd do it if only I weren't too busy these days.
At a web site that I help manage, image-based CAPTCHA was recently implemented to replace text-based CAPTCHA which had been cracked by spam bots for a long time.
Spammer registrations dropped immediately from 100+ per day to around 1 per day, and the remaining small number of spammers are easily recognisable as human by their behaviour (and they're usually from India).
Any reasonable image-based CAPTCHA is much more effective than text-based. It doesn't have to be the world's best, as long as it's reasonable.
Personally, I blame Microsoft - all the spam bots are Windows boxes (the OS fingerprint is detected by the site).
statistics seem to suggest this is not sensible.
Two images of which one has the correct orientation: 50% chance a spambot will guess correctly every time. A captcha with randomly pitched characters of text, 100/n% chance a spambot will guess correctly.
How is relying on a task that humans can do going to improve things when the odds of getting round it are statistically fixed at 50% because of the task choice?
@Anonymous Coward - "Personally, I blame Microsoft - all the spam bots are Windows boxes (the OS fingerprint is detected by the site)."
What has that to do with the OS? It is only natural for you to have such an observation due to the market share of Windows.
For small sites (i.e. small targets), the biggest security benefit comes from doing something different to everyone else, as most attacks are automated based on target identification by search engine.
This isn't an option for Google.
Where are they getting their source images?
If it's Google image search we can expect random porn to be hitting kids computers any time now.
This does absolutely nothing to stop the armies of lowly paid minions, if anything it makes it easier. Surely if we are making the assumption that the people creating these accounts are, as the Daily Mail would have it, Johnny Foreigner, then surely we should take advantage of this fact when protecting our western imperialist web sites. Maybe have a big button next to the submit button that says "Free Swan Recipes", "Mail order British Daughters" or "Claim Asylum" in the hope that it will lure the scary foreigners into revealing themselves.
Last time I looked this wasn't there anymore, but it was IMVHO an excellent test...
@AC (Services should not be so easy...)
I love the idea of internet levelling.... can we get character classes as well?
"You need to be PC nerd - level 6 to view this site, you are nintendo fanboy - level 3"
answer for "C"
erm, which way up does this circle go?
Image v Question
> At a web site that I help manage, image-based CAPTCHA was recently implemented
> to replace text-based CAPTCHA which had been cracked by spam bots for a long time.
I used to get about 5 porn-spam-bots a day on my phpBB forums once they had figured out the built in image CAPTCHA system. I switched to a simple question based system which asks a random question like "What colour is the sky?" and I've not had one bot sign up for the last two years. Sure, it's not a multi-lingual system but if that's not a problem for you it really seems to work.
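For anyone wanting to try the question-based approach described above, here is a minimal Python sketch. The question pool and the function names are made up for illustration; this is not the phpBB mod the commenter used.

```python
import random

# Hypothetical question pool: each prompt maps to its accepted answers.
QUESTIONS = {
    "What colour is the sky?": {"blue"},
    "How many legs does a cat have?": {"4", "four"},
    "What is two plus three?": {"5", "five"},
}

def pick_question(rng=random):
    """Return one prompt chosen at random from the pool."""
    return rng.choice(list(QUESTIONS))

def check_answer(prompt, answer):
    """Accept the answer if it matches, ignoring case and surrounding spaces."""
    accepted = QUESTIONS.get(prompt, set())
    return answer.strip().lower() in accepted
```

The pool would need to be site-specific and changed now and then, since a fixed public list is easy to hard-code into a bot.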
Yet another system easy for AOLers to decipher. Is there no getting rid of them? ;-)
One major problem...
You're going to fall foul of disability legislation. How the hell is a blind person supposed to tell which image is the correct way up?
Also: What is picture C supposed to be? O_o
Unless I'm being incredibly thick.
..C is a bit tricky?
Need to be smarter than captcha
Given that Indian/Russian sweatshop workers will undertake human-powered captcha bypassing for a few dollars a day, what's necessary is a raft of tests assigning a score for each, like the better spam filters do, a sign-up failing to achieve the necessary score gets dropped.
I use a set of 12 tests (currently) like this on form-to-mail scripts with great effect.
Regularly changing the algorithms can help too. I switched a relatively unprotected page that was getting abused to use heavyweight validation. Those that marginally fail validation get a "not sent, looks like spam, try again" type of message - but those that fail badly get returned to home page - just like a poor form to mail implementation might do (those that get it right see a "message sent" confirmation). Because I use JS validation too it is very unlikely that genuine users will get black-holed.
The result is that the spammers don't know they failed so they stop trying to get around the validation, just keep pumping the messages through unaware that they are all getting black-holed. OK, my servers do have to deal with validating and dumping that traffic but at least the spam's not getting through and the spammers stop probing for weaknesses.
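The scoring idea above can be sketched roughly as follows. The three checks, their weights, the hidden "website" honeypot field and both thresholds are assumptions chosen for illustration; they are not the commenter's actual 12 tests.

```python
import re

# Each check returns penalty points; names and weights are illustrative only.
def too_many_links(form):
    links = re.findall(r"https?://", form.get("message", ""))
    return 3 if len(links) > 2 else 0

def honeypot_filled(form):
    # "website" is a hypothetical hidden field that real users never fill in.
    return 5 if form.get("website") else 0

def header_injection(form):
    # Line breaks in an address field suggest a mail header injection attempt.
    return 5 if re.search(r"[\r\n]", form.get("email", "")) else 0

CHECKS = [too_many_links, honeypot_filled, header_injection]
SOFT_LIMIT = 3  # at or above: show "looks like spam, try again"
HARD_LIMIT = 6  # at or above: silently bounce to the home page

def classify(form):
    """Sum the penalties and map the total to one of three outcomes."""
    score = sum(check(form) for check in CHECKS)
    if score >= HARD_LIMIT:
        return "redirect_home"
    if score >= SOFT_LIMIT:
        return "retry_message"
    return "sent"
```

A marginal failure gets a visible retry message, while a heavy failure gets the silent redirect, so a bot cannot easily tell that its message was dropped.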
Of course where the abusers are signing up to, say, a forum it's harder/impossible to fool them to that extent.
A small credit card payment to sign-up would pass a share of the validation problem to the Banks - and encourage the Banks to tighten their card validation systems! (and at least earn some cash!).
I'd happily pay a dollar for a gmail account - in fact rather more... best not tell them though!
I guess we'll never get to a foolproof system but we can increase the cost of bypass then at least those with leading-edge validation can expect the spammers to go for easier targets in preference.
Text based CSS captchas
We recently implemented Captchas for the first time on our site and went for the CSS route. Basically we had to have something very usable (too many people struggled with the weird images, as others here have said), but also secure. After searching we found a CSS captcha idea. Take a string, turn it into a picture stream and then render as inline CSS - using a real letter as a sort of pixel and building the captcha up that way. The only way to read it is to render it in a browser. No doubt it's not perfect - but it works well. When discussing with an interface audit company they went from the 'That's a pointless captcha' comment when looking at it for the first time to 'That's an amazing captcha' when it was explained.
Simple, fast and no hindrance to users - why bother with images?
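A rough Python sketch of the "letter as a pixel" idea, as far as it can be guessed from the description above. The tiny 3x5 font, the colours and the function name are all invented for illustration, and the commenter's real implementation may differ considerably.

```python
# Tiny 3x5 bitmap font for a few digits; '#' marks a lit pixel.
GLYPHS = {
    "1": [" # ", "## ", " # ", " # ", "###"],
    "4": ["# #", "# #", "###", "  #", "  #"],
    "7": ["###", "  #", " # ", " # ", " # "],
}

def render(code, lit="#000", dark="#fff"):
    """Draw `code` as a grid of identical letters coloured by inline CSS."""
    rows = []
    for y in range(5):
        cells = []
        for ch in code:
            # One extra blank column separates neighbouring glyphs.
            for pixel in GLYPHS[ch][y] + " ":
                colour = lit if pixel == "#" else dark
                cells.append(
                    f'<span style="color:{colour};background:{colour}">x</span>'
                )
        rows.append("".join(cells))
    return '<pre style="line-height:1">' + "<br>".join(rows) + "</pre>"
```

A scraper that reads only the text sees a run of identical letters. A bot that parses the inline styles could still rebuild the grid, so this probably raises the cost of an attack rather than preventing one.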
@Teoh Han Hui
Out of the countless thousands of spam bots detected by my site over a period of years, NOT ONE has been a Mac or Linux box. Obviously, this proportion is far, far less than market share of desktops/laptops by those OSs.
If you want to turn this into a Windows OS vs. the rest argument, go right ahead ... but be prepared to lose.
@ AC 09:59
That's a brilliant idea.
If you could make it not render properly in IE, I think I would like to have your babies.
I seem to remember a similar idea from a few years ago where you had to determine where the kitten was in a series of pictures of puppies.
Captcha needs to be replaced
The Captcha or other system images I have encountered on some sites are so distorted as to make them unreadable. The randomly jumbled letters are OK; it's when they run the letters together and warp the baseline that they become a problem.
Paris cos she needs glasses for Captchas too.
Bring back the lens-lok!
Stop wasting time on this CAPTCHA business, simply bring back the lens-lok system so beloved by 8bit owners!! That would stop 'em! You'd be lucky if it worked right 2 out of 10 times!
@ Random Noise
"I seem to remember a similar idea from a few years ago where you had to determine where the kitten was in a series of pictures of puppies."
But did the pictures have suitable captions about cheeseburgers though?
Edward Miles is right, there would have to be some innovative accessibility option to get round that. The current text captchas are problematic enough for people with language processing difficulties such as dyslexia, especially as by their very nature assistive software cannot 'read' them aloud. Some of the 'assistive' methods that read the captcha aloud are also 'distorted', and studies by the W3C have found that even then computer voice recognition can understand them far better than humans. Even logic tests can be problematic for people with cognitive or learning disabilities.
It is a big problem, especially as any companies based in the UK need to show they have made all reasonable adjustments to be accessible to people with disabilities.
OK it's not a spam botnet but still, it's the best I could find quickly. As a (presumably) IT pro, you should know better than to discount the possibility of being compromised based merely on the fact that it's less likely. Just because very few people bother compromising Linux/Macs (presumably it's because there's just not enough of them about) doesn't mean that sooner or later someone won't attempt to do it on a large scale, so you should always be vigilant regardless.
But then I don't need to tell you that do I?
@statistics seem to suggest this is not sensible.
but users get pretty fed up being presented with increasingly large-n character based captchas, which as another commenter has pointed out are sometimes so skewed that they're impossible to determine!
With a single image, however, it is far easier (almost instantaneous) for a person to determine the true orientation, and far harder (though not impossible) for a computer to do so with scripts.
I'm well up on probabilities of compromise, and likely methods of doing so, etc. I usually spend my time telling people to remain vigilant, even if they're not running Windows. We could spend all night discussing methods of compromise, cause-effect relationships, impact of market share and so on - the infection rates should speak for themselves, though.
Back on topic: we were talking about CAPTCHA, and it was only when shit for brains up there spewed out the Windows fan boi line that I bit. The fact remains that out of many thousands of spam bots that have tried to get into my site, there has yet to be a Mac/Linux one.
Oops - I hit "Post" too soon. I also meant to add that that's the first reported Mac bot infestation and the user had to manually download and install trojaned warez to get it - so that's social engineering, not as the result of a vulnerability.
So anyway, back to CAPTCHA .....
Amazing how few people read the PDF
Yet still feel the need to comment...
It's not a 50/50 chance. The CAPTCHA used gives you a circular image and a slider bar, and requires you to rotate the image right-side up. 3 images were discussed as perhaps being the best number to use.
The picture displayed is not an example of the CAPTCHA. It's an example of 3 types of images the CAPTCHA might use. A is a poor choice: machines can correctly orient it. B is a good choice: humans can, machines cannot. C is a poor choice: humans can't.
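To put a rough number on why a slider is not a 50/50 guess, here is a small Python calculation. The 15-degree tolerance is an assumed figure picked for illustration, not one taken from the paper, and the model assumes a bot that picks slider positions uniformly at random.

```python
def blind_guess_rate(tolerance_deg, images):
    """Chance that uniformly random slider positions land within
    +/- tolerance_deg of upright on every one of `images` pictures."""
    per_image = (2 * tolerance_deg) / 360
    return per_image ** images

# Compare one, two and three images at the assumed tolerance.
for k in (1, 2, 3):
    print(k, blind_guess_rate(15, k))
```

Under these assumptions one image gives a blind guesser a 1 in 12 chance, and three images give 1 in 1728, which is a long way from a coin flip.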
The one with the cute cats and rabbits is called Kittenauth
Right On, Commander!
So where will they get the images from?
I imagine there are two sources, both with weaknesses:
a) Manufacturer of CAPTCHA will provide images; this will mean a finite number of images, and thus easy for a computer to overcome once it starts recognising duplicates (the flaw in the kitten-CAPTCHA system)
b) Images taken at random from the internet, eg. Google; this will mean a computer can look up matching images on the internet to solve problem.
Wasted effort if the sites are poorly coded.
A major UK supermarket uses a captcha to protect their current series of lotteries from auto entry of guessed voucher codes. Shame that if you use the back button you get the same captcha image every time but a fresh chance to enter a code.... I assume someone could build a bot based on this. I hope that they've got other protection in place to prevent fraud.
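The back-button problem described above usually comes down to a challenge that can be answered more than once. A minimal Python sketch of a single-use challenge follows; the class and method names are hypothetical and the in-memory dictionary stands in for whatever session store a real site would use.

```python
import secrets

class ChallengeStore:
    """In-memory store of single-use captcha challenges (illustrative only)."""

    def __init__(self):
        self._pending = {}

    def issue(self, expected_answer):
        """Register a new challenge and return its opaque token."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = expected_answer
        return token

    def redeem(self, token, answer):
        """Check an answer; the challenge is consumed either way."""
        # pop() removes the entry whether or not the answer is right,
        # so each captcha buys exactly one attempt.
        expected = self._pending.pop(token, None)
        return expected is not None and answer == expected
```

With this shape, going back and resubmitting the same form fails because the token no longer exists, so every fresh attempt at a voucher code costs a fresh captcha.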
I'd prefer a dongle
And Paris would too....although she and I are thinking of different kinds.
re: need to be smarter than captcha
A few thoughts:
Rather than just checking for a valid email address format, verify that the domain exists. For example, 'email@example.com' shouldn't be considered an acceptable address.
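A hedged Python sketch of that domain check. One caveat: names reserved for documentation by RFC 2606, such as example.com, are registered and may well resolve, so the sketch rejects them from an explicit list instead of relying on DNS alone. The function names are hypothetical, and a fuller check would look up MX records, which the standard library does not do.

```python
import socket

# Names reserved for documentation and testing by RFC 2606.
RESERVED_DOMAINS = {"example.com", "example.net", "example.org"}
RESERVED_TLDS = {"test", "example", "invalid", "localhost"}

def dns_resolves(domain):
    """Default resolver: True if the name has any address record."""
    try:
        socket.getaddrinfo(domain, None)
        return True
    except socket.gaierror:
        return False

def plausible_address(address, resolves=dns_resolves):
    """Reject malformed, reserved or unresolvable email domains."""
    local, sep, domain = address.rpartition("@")
    domain = domain.lower().rstrip(".")
    if not sep or not local or "." not in domain:
        return False
    if domain in RESERVED_DOMAINS or domain.rsplit(".", 1)[-1] in RESERVED_TLDS:
        return False
    return resolves(domain)
```

The `resolves` parameter exists so the lookup can be swapped out, for example with a cached or stubbed resolver.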
Silent rejection is a nice mechanism. Similar trickery is possible for forums. Create special 'spammer' accounts that act like regular ones, except that anything they post is immediately discarded. Bots will never catch on. Making the post visible only to the spammer for an hour or so will fool humans as well.
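The 'spammer account' trick might look something like this in Python. The `Forum` class, its field names and the one-hour grace period are assumptions made to mirror the description above, not code from any real forum package.

```python
import time

GRACE_SECONDS = 3600  # "an hour or so", per the comment above

class Forum:
    def __init__(self):
        self.posts = []       # dicts with author, text and created time
        self.flagged = set()  # accounts marked as spammers

    def add_post(self, author, text, now=None):
        """Store the post; flagged accounts get no hint of rejection."""
        created = time.time() if now is None else now
        self.posts.append({"author": author, "text": text, "created": created})

    def visible_posts(self, viewer, now=None):
        """Return the posts that this particular viewer should see."""
        now = time.time() if now is None else now
        shown = []
        for post in self.posts:
            if post["author"] not in self.flagged:
                shown.append(post)
            elif post["author"] == viewer and now - post["created"] < GRACE_SECONDS:
                # Only the flagged author sees the post, and only for a while.
                shown.append(post)
        return shown
```

Everyone else never sees the flagged posts, while the flagged account sees its own post for the grace period and so has little reason to suspect anything.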
Semantic knowledge tests are probably the best kind of captcha. Even good AI programs can be stumped by questions such as "what is the name of this site?" or "what day of the month is it?". The sweatshop spammers aren't using zombie PCs, so they can be hit with IP bans. A shared database (like what Spamhaus does with email) of banned IPs could be useful.
Kittenauth is the way to go. You can replace the images with whatever you want and have matching text, so if they feed the captcha to a Russian sweat shop, they would all need to know phrases like "select all the busses", "which ones can fly?" and "find the ladyboys"