Spammers are using a sophisticated piece of software that can create thousands of Windows Live email addresses by cracking the protections designed to prevent the large-scale creation of fraudulent accounts. According to security firm Websense, the bot is surreptitiously installed on the PCs of end users. It then establishes a …
I have seen a whole SDK on the (Russian part of the) internet in which open source OCR software is coupled with random geometric deformation. It is beyond my level of programming, but from what I can tell there are not many implementations, and most of them are distributed with source code. Since it is known how the code generates the different characters and what pattern of deformation it applies, it was reasonably easy to develop ways to guess the characters with a reasonably good degree of accuracy.
Limit number of accounts from each IP??
Why don't they limit the number of accounts for each IP within a certain timeframe? This would occasionally block proxies, but I'm sure there isn't generally a legitimate reason for thousands of addresses to be created within a few minutes from the same IP.
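A per-IP limit of this kind could be sketched roughly as below. This is only an illustration: the class name, the default limit of 5 signups and the one-hour window are assumptions, not anything a real provider is known to use.

```python
import time
from collections import defaultdict, deque


class SignupRateLimiter:
    """Allow at most max_signups per IP inside a sliding time window."""

    def __init__(self, max_signups=5, window_seconds=3600):
        self.max_signups = max_signups          # hypothetical default
        self.window_seconds = window_seconds    # hypothetical default
        self._history = defaultdict(deque)      # ip -> times of accepted signups

    def allow(self, ip, now=None):
        now = time.time() if now is None else now
        stamps = self._history[ip]
        # Forget signups that have aged out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_signups:
            return False
        stamps.append(now)
        return True
```

A shared proxy would hit the limit sooner than a home user, which is the occasional false positive mentioned above.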
Another method would be to block the spam before it's sent, i.e. having a spam blocker on the sending side rather than just the receiving side.
not just AOL
..... we have to look out for now then.
Use the Internet Luke ..
"The answer is correct as much as 35 per cent of the time. [...] It's also possible the spammers have found a new type of Captcha-cracking software."
How familiar does that sound?
"A team of Russian hackers has found a way to decipher a Yahoo CAPTCHA, thought to be one of the most difficult, with 35% accuracy. The Russian group's notice, posted by one "John Wane," is dated January 16. This site hosts a rapidshare link to what looks to be demonstration software for Windows" [http://it.slashdot.org/article.pl?sid=08/01/30/0037254]
The author called it ("Its discovery comes a few weeks after the release of proof-of-concept code that defeats a similar Captcha used by Yahoo! Mail."), as did the first poster, so why didn't Websense?
Sure it is 'automated'?
One attack vector that has been used is to redirect the image to a user of their own servers.
The spammers host a website usually containing warez or pornographic images, and ask their users to type in a captcha text to get the content. The captcha image however is one they have just grabbed from whatever service's new account page. When the user types in the captcha, the web server simply passes that on to the spamming bot so it can create the account.
Limiting IPs will not help
"Why don't they limit the number of accounts for each IP?"
Because the spammers are much more sophisticated than that. They "own" so many machines, it is sick. I run a mail server at my job/life. I have seen dictionary attacks that randomly attempt to send to thousands of users, where no IP is used twice. Every single email comes from a unique IP, for that attack.
With their massive network, they can create crippling ddos attacks on those who dare to oppose them. See http://www.securityfocus.com/news/11392
The spammers go up against reverse dns entries, residential IP blacklists, known spammer blacklists (spamhaus, etc), dynamic blacklists (spamcop), known sender "volume spikes" blacklists, rate limiting on the sender's isp side, rate limiting on the recipient side, port 25 blocking on the sender's isp side, greylisting, etc., and they can still get the spam through.
They might send one or two spams per hour from each machine. But if they send from 100,000 machines, that is 1,200,000 to 2,400,000 spams per day. If they can use those 100,000 machines to open 10,000 live.com accounts per day, they might be able to send 100 spams per account, per day, adding another 1,000,000 spams per day to their tally, multiplied by how many days the account stays open.
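For anyone who wants to play with these numbers, here is a back-of-envelope sketch. The function names and the sending-hours parameter are my own assumptions; the quoted 1.2 to 2.4 million range corresponds to each machine sending for about 12 hours a day, and machines sending round the clock would double it.

```python
def botnet_daily_volume(machines, spams_per_hour, sending_hours_per_day):
    """Spams per day sent directly from the compromised machines."""
    return machines * spams_per_hour * sending_hours_per_day


def webmail_daily_volume(accounts_opened_per_day, spams_per_account, days_open):
    """Extra spams per day from webmail accounts, once every account
    opened during the last `days_open` days is still alive."""
    return accounts_opened_per_day * spams_per_account * days_open


# Figures from the comment, assuming 12 sending hours per machine per day.
low = botnet_daily_volume(100_000, 1, 12)
high = botnet_daily_volume(100_000, 2, 12)
extra = webmail_daily_volume(10_000, 100, 1)
print(low, high, extra)
```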
I have seen spammer programs that check the rbl lists to see if the IP is listed in the rbls, before sending mail from that "owned" ip. Their side has some very ingenious programs, and some very ingenious programmers. It seems to be modeled after the internet itself, with a distributed control system (like how dns works) and hundreds of thousands of nodes, to distribute the work over.
Come to think of it, I wish that some of the regular software developers would adopt some of their tactics. Wouldn't it be nice, to have a database that has its content and meta data redundantly spread over thousands or hundreds of thousands of systems? With no single point of failure.
Wait a minute... strike that last paragraph. Replace with: I have an idea for a database that has its content and meta data redundantly spread over thousands or hundreds of thousands of systems (patent pending, copyright 2008, pending trademark 2008, pending service mark 2008). And data encrypted too (another patent pending, another copyright 2008, another pending trademark 2008, another pending service mark 2008)
Spam only exists because it's profitable.
It's only profitable because people are stupid enough to open it. Some people are even stupider and believe it.
It's a lot like what Renton in Trainspotting says about the English
"Some people hate the English, but I don't. They're just wankers. We, on the other hand, are colonized by wankers. We can't even pick a decent culture to be colonized by. We are ruled by effete arseholes. It's a shite state of affairs and all the fresh air in the world will not make any fucking difference. "
Some people hate spammers, I don't. Spammers are just opportunists. We (the internet-using masses), on the other hand, are exploited by wankers.
I mean we all bitch and moan about it but at the end of the day the problem's only there because it's profitable.
I don't see why businesses would be letting in hotmail, yahoo and google anyway. If a business partner is using those things you should think about getting a new partner.
"Their side has some very ingenious programs, and some very ingenious programmers."
And as mentioned in the article they farm the coding out. I once visited one of the freelance sites where a piece of work is offered out to potential bidders. You might have seen them, their forums are full of Americans complaining that they can't make a living because they are underbid by Asians who offer to do the job for sixpence and suggest that low bids should be disallowed as said Asians couldn't possibly do as good a job as them anyway.
In my perusal of these sites there were a large number of requests for people to code this sort of captcha cracking software. Seems like the low cost Asians have done a pretty good job and proves that they are indeed very ingenious. With such a large potential programmer base it won't be long before other captchas are cracked as well.
Gibson was right in a way
Except instead of the Russian Military creating fancy code it is the spammers! :)
Mine is the one with the Ono Sendai in the pouch built into the back.
"Wouldn't it be nice, to have a database that has its content and meta data redundantly spread over thousands or hundreds of thousands of systems?"
That already exists. It's called the "Internet."
Every captcha in use can be cracked by some software. But a captcha isn't just a random alphanumeric code with fancy colors and everything. You can use captchas in different ways. For example, the captcha I programmed gets a question from the database and "asks" the user this question, e.g. "What color is the sky?"; the answer would be "blue". The answer string is converted to lower case, and if there is a space at the start or end, I cut it off.
The only thing is, there is a limit to the number of questions you can ask, and if they are recorded by the spammers, they can build a simple question/answer function into the bot.
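The answer check described above might look something like this minimal sketch. The question table and function names are hypothetical stand-ins for the database lookup, not the poster's actual code.

```python
# Hypothetical stand-in for the question/answer table in the database.
QUESTIONS = {
    "What color is the sky?": "blue",
    "How many legs does a cat have?": "4",
}


def normalize(answer):
    """Lower-case the answer and cut off spaces at the start and end."""
    return answer.strip().lower()


def check_answer(question, user_answer, questions=QUESTIONS):
    expected = questions.get(question)
    if expected is None:
        return False  # unknown question: never accept
    return normalize(user_answer) == normalize(expected)
```

The weakness mentioned above is visible here: a spammer who records the questions ends up with exactly the same lookup table.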
We have had the idea of using pictures, or creating pictures with GD or ImageMagick, and asking for the main color. To crack this, you have to teach your bot the colors and which hex codes they could be. On the other hand, you have to program that into your own software first.
The picture idea could be extremely hard to crack. If the picture shows an elephant, you can ask "What animal is in the picture?", and to prevent the bot from recognizing the picture by its hash, you can scatter random pixel errors over it; that would be the best thing. Humans can see what the picture shows without getting eye cancer from ugly warped text they have to decipher. In that case you only need a handful of pictures, questions and answers: easy for everyone, except the one who has to program it ;) But once it's programmed, it could be the best captcha on earth.
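To illustrate the "random pixel errors" point, here is a small sketch using plain lists of RGB tuples instead of a real image library (the helper names are made up for the example). It only shows that an exact hash such as SHA-256 changes once a few pixels are altered; a bot using a perceptual hash or an image classifier would not be stopped by this.

```python
import hashlib
import random


def image_hash(pixels):
    """Exact-match fingerprint of an image given as rows of (r, g, b) tuples."""
    flat = bytes(channel for row in pixels for px in row for channel in px)
    return hashlib.sha256(flat).hexdigest()


def add_pixel_noise(pixels, count, rng):
    """Return a copy in which `count` distinct pixels get a different color."""
    noisy = [list(row) for row in pixels]
    height, width = len(noisy), len(noisy[0])
    for pos in rng.sample(range(height * width), count):
        y, x = divmod(pos, width)
        # Shift every channel by 1..255 so the new color always differs.
        noisy[y][x] = tuple((c + rng.randrange(1, 256)) % 256 for c in noisy[y][x])
    return noisy


base = [[(120, 120, 120)] * 8 for _ in range(8)]   # stand-in for the elephant
served = add_pixel_noise(base, 5, random.Random())
print(image_hash(base) != image_hash(served))
```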
And no, I don't like the book captcha; it's stupid and could be cracked.
If you have ideas for other captchas, feel free to mail me @ spelter<dot>hof[@]freenet<dot>de (watch for the @ in brackets)
Further to the earlier comments, it's also (relatively) easy to spoof an IP address, a tactic used for many years by spammers.
It's fascinating that in 2008 there are sufficient imbeciles still opening email from folk they don't know with stupid email addresses and subject headers, reading the usually idiotic message, then following any links to dodgy websites, then getting their credit cards out to buy penis enlargement pills, viagra or other drugs or stocks/shares in unheard of companies.
Hotmail and Yahoo! considered legitimate?
On the boards I normally frequent, Hotmail and Yahoo! accounts are banned by default precisely because they are sources of spam, and have been for at least 3 years. I got grandfathered in, but good luck trying to create an account on them these days with a free account (even GMail).
Eh? (using that word lately, it is underused.)
Image recognition based on Fourier transforms was developed about 15 years back.
The process was analog (if I remember right) and used some kind of resonate fiber optic cable.
Fast forward 15 years and that analog thing can be emulated digitally.
(or for that matter, just aim the old analog version at your display and run the output back to the computer; why, exactly, did this take so long to figure out?)
The way to stop this...
Is behavior matching and trick questioning.
The first can be achieved by using session IDs and logging the time when certain pages are accessed.
Whenever a registration page is accessed, the browser must have been on a page with a link to the registration page, right?
And for certain pages, like the terms of service, how long should it take to get through them, even if a user just skims looking for the link to the correct page?
Here the user may be asked to 'select your favorite computer', and be given a list of 'Intel Pentium, Commodore PET, Zilog Z80, Atari ST, Sun ultraSparc 5, Asus eee, Lamborghini Diablo'.
Anyone picking 'Intel Pentium', 'Zilog Z80' or 'Lamborghini Diablo' has of course failed the test...
(The list of test subjects must contain related items to make an encyclopedic attack more difficult)
Making the first item a 'fail' is a good idea to stop 'dumb' robots which scour the net searching for forums, filling out a certain set of well-known fields and ignoring all the rest.
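Both checks could be combined along these lines. This is a sketch under assumptions: the page paths, the 5-second threshold and the function name are invented for illustration, and the per-session visit log is assumed to be collected elsewhere.

```python
MIN_SECONDS_ON_TERMS = 5.0  # assumed threshold, would need tuning on real traffic

# The 'fail' items from the list above.
DECOY_CHOICES = {"Intel Pentium", "Zilog Z80", "Lamborghini Diablo"}


def looks_human(visits, favorite_computer):
    """visits: (page, timestamp) pairs logged for one session ID."""
    first_seen = {}
    for page, ts in visits:
        first_seen.setdefault(page, ts)  # keep the first visit to each page
    # The registration page must have been reached via the terms page.
    if "/terms" not in first_seen or "/register" not in first_seen:
        return False
    # Even a skimming human needs a few seconds on the terms page.
    if first_seen["/register"] - first_seen["/terms"] < MIN_SECONDS_ON_TERMS:
        return False
    return favorite_computer not in DECOY_CHOICES
```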
"Free email services from Microsoft, Yahoo! and Google are rarely blocked by anti-spam products,"
Well there's your problem then.
It's all about cost. If I mis-use an email account provided by my ISP, I'll probably get rapped on the knuckles and since I've paid good money to my ISP to get the account I might be a bit miffed if they boot me off. If I mis-use a MicroYahoogle account, I may or may not get thrown off but it won't have cost me anything and I can get another one easily enough.
There's effectively no penalty for mis-use of a free facility, so free facilities tend to get mis-used. Ergo, free email providers should be at the top of anyone's spam filter.
sounds like you're in need of 'git' for your source control needs. secure, distributed, and you can treat branches like redundant copies.
take a look at the source code, its just as good at database work as it is source control.
... the black chopper cause if everyone knew how good it is, Oracle would send round some thugs to shut me up.
The Nintendo Wii has a good email system implementation.
When you add an email address to its address book it emails the address with a confirmation message. You can then approve or deny receiving email from the Wii.
Obviously this implementation is all in the client and to work for computers it would need to be part of the server implementation.
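On the server side, the approve/deny step might be modelled roughly like this. It is a sketch only; the class and method names are hypothetical and the actual sending of the confirmation message is left out.

```python
class ApprovalMailbox:
    """Accept mail only from senders the mailbox owner has approved."""

    def __init__(self):
        self.approved = set()
        self.pending = set()

    def request_contact(self, sender):
        """Sender added us to their address book; a confirmation
        message to the owner would be triggered here."""
        if sender not in self.approved:
            self.pending.add(sender)

    def respond(self, sender, accept):
        """Owner approves or denies a pending sender."""
        if sender in self.pending:
            self.pending.discard(sender)
            if accept:
                self.approved.add(sender)

    def accepts_mail_from(self, sender):
        return sender in self.approved
```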
Ironically MS are a victim of their own poor security models. With a bit of luck they'll realise that instead of wasting effort constantly trying to defeat a multi-headed, shifting set of attackers they could focus more on locking down their bloody OS.
I have a new idea
Forget images. Just send a flash-based animation that requires the user to click in a specific area at a specific moment in time. The area to click in will be a color box moving on a white background, and the box goes a different color for the half-second you have to click in it.
The click has to be made with the mouse, and the color should change at every reload.
Click at the wrong time, you're out. Click in the wrong place, you're out.
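The server-side check for such a challenge might be sketched as follows. All names are hypothetical, and for simplicity the box is assumed to sit still during the half-second window; a real version would need the box position at the time of the click.

```python
from dataclasses import dataclass


@dataclass
class ClickChallenge:
    box_x: float        # left edge of the box while its color is changed
    box_y: float        # top edge of the box
    box_size: float     # side length of the square box
    change_at: float    # seconds after load when the color changes
    window: float = 0.5 # the half-second in which a click counts

    def passed(self, click_x, click_y, click_time):
        in_time = self.change_at <= click_time <= self.change_at + self.window
        in_box = (self.box_x <= click_x <= self.box_x + self.box_size
                  and self.box_y <= click_y <= self.box_y + self.box_size)
        # Wrong time or wrong place: you're out.
        return in_time and in_box
```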
Neural Nets and AI
It would be very easy to train a Neural Network to recognise fonts in heavy noise situations, i.e. captchas. I imagine the unsupervised training set is thousands of captcha images from all the sources they can find, then supervised training can be a mixture of clear fonts and further captchas.
It would also explain why it's not always correct, as a human would probably get it right nearer 95% of the time. So my bet is that the 0day folks have gotten hold of some AI student's research into Neural nets and captchas, and are giving it the needed field testing. Or it could be the students themselves making a quick buck off of Viagra :-)
RFP : Reason for Paris : Neural nets are reaching Paris speeds
I block all mail from AOL and MS anyway. Anyone using either of those has nothing to say that I want to hear.
Uh... surprised the success rate is that low, actually
Since most scanner software comes with a method of making text in a document that you've just scanned in editable - or at least the HP stuff does. Simple shape recognition surely...
Wasn't the IC a Chinese military program?
Surely a better Captcha system would be.....
Use the munged text to ask a question requiring data to be extracted from some other munged text.
So the first would be "Type the first/last number of green/red/blue/black capitals/non-capitals in forward/reverse order ignoring/using only the letters in the word as follows ......"
Or "type the number of bold capital M's in the text followed by the number of letter y's in the text" (this is clearly weaker than the one above)
Then you give a larger captcha image with a load of randomly generated letters in order in caps/small and with colours (although colours could be excluded to aid those who are colour blind)
This requires the computer to identify the text in the first captcha, parse it, understand the question, then decode the second captcha and extract the relevant data from it.
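The server side of the second (weaker) example question is easy to sketch, which is part of the appeal. The snippet below ignores bold and colour and only handles "number of capital M's followed by number of letter y's"; the function names are made up for illustration, and rendering the letters as a munged image is left out.

```python
import random
import string


def make_letter_field(length, rng):
    """Random upper/lower-case letters to be rendered as the second captcha."""
    return "".join(rng.choice(string.ascii_letters) for _ in range(length))


def expected_answer(field):
    """Answer to: number of capital M's followed by number of letter y's."""
    return f"{field.count('M')}{field.count('y')}"


field = make_letter_field(40, random.Random())
print(field, expected_answer(field))
```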
This double captcha method would also work for audio systems for the blind. The first question could be something like - "what is the largest animal in the following list?" and the second audio would be a list like "house, airplane, cow, dog, road, balloon" - the computer would then have to identify both the question, the list of answers and then understand which is the correct answer. This task is possible for a computer but is computationally intensive so would make it expensive to break. On the other hand building the audio is fairly easy as you could merely distort stored audio in a way that makes it hard for a computer to recognise!
I don't know if anyone has thought of using systems like this but I don't see them being much harder to create than a normal captcha and they eliminate the need for images or similar.
Anyone got any better ideas than this?
Flash and color for captcha...
Well... May sound nice but being colorblind I'd hate to have to get someone to tell me the color of an area of the picture to be able to access your forums.
As for flash...
Image captchas are already a problem for accessibility standards, leaving out many visually impaired users. I don't even dare speculate on how many users the flash solution will leave out...
As someone said, it keeps coming because it is profitable. Captchas are a kludge, not a solution. The only way to really stop it is to change the economics. BlueFrog had the right idea, and they were promptly DDoSed to death by the spammers, proof that they were actually making a difference. There are some FOSS projects trying to recreate their system on a p2p distributed model so this can't happen again, but it's slow going.
They probably use software from the same company that supplies CSI/Miami/New York. You know, the stuff that can take the address off the back of a note held in the hand of the criminal who is a 6-by-6-pixel reflection in the sunglasses of the man on the speeding motorbike.
Re: Stefan Spelter
A friend of mine developed a captcha that presents 9 pictures of animals in a 3x3 grid, and asks the user to pick the three kittens. Difficult for a computer to crack, and aw, kittens.
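The bookkeeping behind such a grid is simple. Here is a hedged sketch with invented file names and function names; it says nothing about how the friend's version actually works, and it assumes the image pools are large enough that a bot cannot just memorise them.

```python
import random


def build_grid(kitten_images, other_images, rng):
    """Pick 3 kittens and 6 other animals and shuffle them into 9 cells.

    Returns the cell contents and the set of cell indices showing kittens.
    """
    cells = rng.sample(kitten_images, 3) + rng.sample(other_images, 6)
    rng.shuffle(cells)
    kitten_cells = {i for i, img in enumerate(cells) if img in kitten_images}
    return cells, kitten_cells


def check_selection(selected_cells, kitten_cells):
    """Pass only if exactly the three kitten cells were picked."""
    return set(selected_cells) == kitten_cells
```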
Captchas do have to be easy for the human to solve as well as difficult for computers to crack. They're supposed to be a brief annoyance for the user, not some sort of intelligence test.
My Captcha cracker
uses locr (free Linux OCR util) with a number of ImageMagick commands fronting it to scrub the noise and improve the contrast of the original image. Mind you, the Captchas I'm breaking are quite low tech and don't feature text with significant wobbliness.
Success rate for me is about 80%.
Getting my coat from the locr.
Re: I have a new idea (flash captcha)
You're not understanding the whole concept of captchas. Though they make the human jump through hoops to proceed, that's not their purpose. Their purpose is to present a problem that, using present levels of technology, can (hopefully) only be solved by a human -- but not so hard a problem that any humans would be "left out" or so arduous that the human is dissuaded from proceeding. And, in order for them to remain effective, they must be able to be dynamically generated; if a human must generate them (such as creating a question / answer pair) then the attackers / spammers can just cache the proper answers in a database. (My definition.)
A flash animation that required a human to click would annoy humans, leave out a significant subset of human users, and (here's the real problem) probably be solvable using current computer technology. All you need is an open source flash engine that you can hack to your purposes. No mouse click is really needed; your pwned flash engine can register a simulated click whenever it wants. The trickiest part is figuring out when and at what coordinates to send the clicks. But since the code that drives this flash captcha would be human generated, it could be human reverse-engineered, and the knowledge coded into the bots with the pwned flash engines. The good guys in this "war" might then turn to code auto-generation / obfuscation, but if I had to bet money on who would win, it wouldn't be on the good guys.
more complex = less users
the trouble with making the captchas more complex to defeat automated systems is that they'll then get more difficult for regular users. what about colourblind people? or those with bad eyesight?
and to be honest, if i have to spend 5 minutes trying to decipher a set of barely visible characters just to sign up to a web forum, or get free email, i probably can't be bothered.
the idea of using double captchas where one is a question and the other an answer is a good one, but honestly it sounds like far too much effort for most users to be bothered with.
so who would want to implement a system that is going to turn people away from your site before they've even managed to log in?
Here's the problem...
Users who are likely to not look at e-mail from random providers, surely won't treat hotmail with great regard, and surely won't fall for the usual 419ers?
On the other hand, users who fall for scams are going to fall for them regardless of the email provider.
How about I hand out my router IP and you can break into my Vista box since it's so insecure.
Simple stop spam idea
Along with the usual To, CC, BCC and Subject fields why don't they add an extra AntiSpam field. When you sign up with various websites and give away your email address you also give them an AntiSpam word or password. Within a setting with your mail provider you then provide a list of valid entries for the AntiSpam field. Any email without a valid entry goes straight into your spam box. If you start getting spam using one of the words or passwords you have a good idea who passed on your details or had them stolen, and can then disallow that entry within your mail box. A spammer can find a valid email address easily but how are they ever going to know whether they used a valid entry in the AntiSpam field? Maybe I'm being short sighted but it sounds simple enough.
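The mail provider's side of this could be as small as the sketch below. No such AntiSpam header exists in real email; the class, the method names and the idea of passing the field's value in as a plain string are all assumptions made for the example.

```python
class AntiSpamFilter:
    """Sort mail by the (hypothetical) AntiSpam field of each message."""

    def __init__(self):
        self.valid_words = {}  # AntiSpam word -> site it was given to

    def issue(self, word, given_to):
        self.valid_words[word] = given_to

    def revoke(self, word):
        """Disallow a leaked word; returns who it had been given to."""
        return self.valid_words.pop(word, None)

    def folder_for(self, antispam_field):
        return "inbox" if antispam_field in self.valid_words else "spam"
```

Revoking a word also tells you which site leaked it, but it cuts off that site's legitimate mail until you give them a new one.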
I'm shocked I tell you...
Shocked to hear that free email services are rarely blocked. Back around, oh, 1999 I guess, I didn't fully block but applied serious score penalties to hotmail, yahoo, msn, and of course aohell. I mean, back then they were often forgeries, but still. A man ahead of my time I guess.
Well, there'll be no line at the coatroom at least. The one with the big "(A)" on the back please.
It's a goodish idea, but people don't care - an extra word is extra thought and effort. And users don't like extra effort. None of us do.
Anyone who can already be bothered putting effort into things already has different accounts for different things
1 email for rubbish you sign up to - to be honest for that kind of c--p most people will use fake email addresses from mailinator type companies.
1 email for probably insecure junk like forums
1 email for social networking junk
1 email for kind of friends
1 email for kind of important things - ecommerce type stuff, online games you pay for, pizza company, etc.
1 email for important things like bills
1 email for people who are really friends and family
1 email you don't give anyone, often used to obtain email accounts for bills and family/friends
You can get away with 3 I suppose, a master account you give no one, a major account for important things, and a throwaway account.
However normal everyday people are as likely to do that as they are to want to remember yet more bits and pieces. I personally hate the fact that I need 4 separate numbers for my banking - telephone pin, machine pin, web pin and a web account number. Email though unlike my bank isn't very important and should never be trusted.
Why do we all care so much anyway, it's just junk, delete it and move on. One day people may stop falling for it, but then one day we may engineer flying pigs.
Facebook actually do something similar to this already by adding a +somecode to the sender's address so you can validate it. For example, sender+somecode@example.org.
Hot or not...
Didn't hot or not have a captcha API?
I'm not sure there's any software out there that can be leveraged to pick the hottest out of four photos?
You'd have the added advantage of effectively banning perverts, too.
Ms Hilton, since she's not.
@ Stefan Spelter
I've been using a text-based captcha that I developed myself for some time now, with good results. It uses a variety of types of question, which rotate on a daily basis.
The only thing is, I've found that writing good captcha questions is an art. You have to be very careful to make them entirely unambiguous, and neither too hard nor too common. I see a lot of simple maths questions used as captchas, so it's only a matter of time before they are routinely broken. Ideally every website should have a unique set of trivia questions that don't follow a set pattern.
Unfortunately this approach scales really badly for large websites, and I can't see it working for any of the major free email providers. They would need tens of thousands of questions and answers to make it work.
@Simple Stop Spam Idea
That's all well and good but when the spammer gets hold of the antispam word you gave your bank/first born/local pub you would stop getting email from your bank/first born/local pub too. Ooops that's not a very good idea now is it? It wouldn't be too hard to get hold of said words given they would have to be unencrypted in the header to be of any use, and anyway what stops the spammer getting the words in the same way as they got the email address or just using a dictionary attack?
Any sort of system like this requires pre-authentication which kinda removes a rather large part of the point of email. It would be like only accepting letters from people you actually know. You could I suppose have a secure anti-spam word based on an encrypted hash of the message, but that still falls foul of the spammer using a social engineering hack to get the keys to the code.
My solution is to have 7+ different email addresses, 3 on free providers, 1 paid for web-mail, 2 domains with multiple aliases set up and that ignore anything not sent to one of said listed aliases, and finally my work address.
One of the free providers is for sign ups to sites I never want to hear from again and that I think are likely to sell the address on, one is for slightly less dodgy sites and Live Messenger, one for Google Talk and as a destination for a number of the aliases. The paid for web-mail is for personal use with friends and as the destination for the remainder of the aliases. The aliases allow me to have email addresses that won't change so are useful for things I want to sign-up to for the long term or where the address might need to be reachable by someone in a couple of years time (I often use them in code I write for people). The work address is surprise-surprise for work related stuff, nothing important goes there since I'd lose it if I left the company - It gets no spam since it is never published or used for signups.
RE: Simple stop spam idea
Something like that is already theoretically possible (though unworkable in reality) with the likes of Gmail -- IIRC they call it "plus addressing". Imagine your email address is yourname@example.com ... you would sign up at (say) The Register with yourname+theregister@example.com* ... the resulting email will still be delivered to your "yourname" account, but you can then filter based on the "plus" part**
* Caveat #1: I did try to get something like this going on, but I quickly found that the VAST majority of websites (that I attempted this on) insist that "+" is NOT a valid character to include in the email address. If all those eejits would get wi' the program it'd've made my life a helluvalot easier!
** Caveat #2: in order for this to defeat most effectively those dirty spammers, you'd have to sign up with a "plus" suffix every time, and filter your email such that emails without such a suffix are always considered to be spam. Which would also mean your friends would have to remember to include the tag too.
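Filtering on the "plus" part comes down to a little string handling. A minimal sketch, with hypothetical function names and made-up addresses, and with Caveat #2 applied so that untagged mail counts as spam:

```python
def split_plus_address(address):
    """Split 'name+tag@domain' into the base address and the tag."""
    local, _, domain = address.rpartition("@")
    base, _, tag = local.partition("+")
    return f"{base}@{domain}", tag


def classify(recipient):
    """Return the tag to file the mail under, or 'spam' if there is none."""
    _, tag = split_plus_address(recipient)
    return f"tag:{tag}" if tag else "spam"


print(split_plus_address("yourname+theregister@example.com"))
print(classify("yourname@example.com"))
```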
@John F***ing Stepp
"Image recognition based on Fourier transforms was developed about 15 years back.
The process was analog (if I remember right) and used some kind of resonate fiber optic cable."
You don't remember rightly. That was a process for image sharpening and removing transmission noise. It didn't do any kind of recognition whatsoever.
I hope this means an end to those annoying things.
@ Ken Hagan
Gonna have to use that one...
Disposable addresses don't always work
This relies on all your friends, relations and whatnot also being good at security. If *they've* got a Trojan, their address book gets hoovered up and sent to The Bad Guys. Result - "enhance your manhood" adverts to your family mail account.
The best idea on here...
...is the Wii one.
It would be so easy for the free email providers to forbid users from sending email to an address until it has been entered into the address book, and accepted by the recipient as someone they know (via automated email).
This wouldn't require more complex captchas - most of the ones suggested here are either too weak or too complex for the user. Certainly, most of those suggested here would block out all non English-speakers, colourblind people, etc.
Yes, address book authentication is the way forward!
What I get for trying to *RWD.
It was using holographic film; say a picture of the letter A that when held over the letter A would cause recognition. Even if the A in question was in a different font.
(*of course I could be wrong again, now I am trying to remember while hung over.)
Captcha cracking is nothing new... in fact, one of the more successful pieces of forum spamming software, XRumer, does it very quickly and accurately, and as a matter of course. It's a small step from there to take the cracking algorithms and develop them further for other purposes.
Methinks it's high time this system was replaced by something that only a human intelligence can work out. It's easy to do, and only the laziness of web developers is to blame. For example, can you tell me how in hell a bot could ever figure out an answer to this:
Q- If Billy has 3 apples, and gave John 2 of them, then who won the world cup in 1996?