Defeating Captchas

Interesting blog post.

EDITED TO ADD (12/14): These guys are the best at breaking captchas.

Posted on December 10, 2007 at 1:52 PM • 41 Comments

Comments

Nicholas Weaver • December 10, 2007 2:33 PM

CAPTCHAs don't work for Ticketmaster's goal in any case, because a CAPTCHA is only worth $.01 to an attacker to solve.

It also means that Blog spam is worth >$.01, because I've seen spam on my blog even with the Blogger CAPTCHA turned on.

http://nweaver.blogspot.com/2007/12/...

Gabriel • December 10, 2007 2:58 PM

What about a captcha that showed you three pics of kitties and one of a puppy, and asked you to click the puppy? Wouldn't that be a lot easier for us humans and a lot harder for a computer?

FP • December 10, 2007 3:02 PM

Some references to previous discussion on this blog:
http://www.schneier.com/blog/archives/2007/11/...
http://www.schneier.com/blog/archives/2007/10/...

Also, with the right motivation, you could develop software agents that outsource the captchas to low-wage workers in China ("200 captchas per hour for only $5").

Like I said in the thread about the World Series -- one problem is that Ticketmaster has a monopoly, and the normal market laws of setting a ticket price where supply meets demand do not apply. Otherwise there would be no margin for scalpers.

Waldo • December 10, 2007 3:17 PM

@gabriel

Nope - the bot would get it right 25% of the time, which is probably good enough for a bot (they're patient). If you repeat images the bot might learn and improve, also.

If you increase the number of images to pick from the job becomes hard-for-people, and possibly plays into the bot's patience. (where am i?)

The point of the article is that image scrambling, like cryptography, is hard to self-evaluate - anyone can pose a problem they can't solve.
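
To put numbers on Waldo's point, here is a minimal Python sketch of how quickly blind guessing pays off. It assumes every guess is independent and uniformly random, and that the site neither rate-limits nor locks out failed attempts; the function names are illustrative and not taken from any library.

```python
def p_at_least_one_success(choices: int, attempts: int) -> float:
    """Probability that uniform random guessing passes at least once."""
    p_single = 1.0 / choices
    return 1.0 - (1.0 - p_single) ** attempts


def expected_attempts(choices: int) -> float:
    """Mean number of guesses until the first pass (geometric distribution)."""
    return float(choices)


if __name__ == "__main__":
    for choices in (4, 10, 100):
        for attempts in (1, 10, 50):
            p = p_at_least_one_success(choices, attempts)
            print(f"{choices:>3} images, {attempts:>2} tries: {p:.3f}")
```

With 4 images, 10 tries already pass with probability of roughly 0.94. Raising the count to 100 images cuts a single try to 1%, at the usability cost Waldo describes.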

my real name • December 10, 2007 3:20 PM

I can't find the article now, but there was something quite a while back about free porn sites that would display a captcha pulled from another site. You enter the correct captcha and get your porn; the site owner gets the answer to a captcha, which gets used immediately.

Albatross • December 10, 2007 3:41 PM

There are two related problems here - one is that the rapid advance of technology means that there is little time for actual standards to emerge, and very little chance that the overworked, underpaid code-monkey asked to write a captcha is going to be afforded the opportunity to discover and employ the standard.

But that's not the biggest problem. Arguably "captcha coding standard" ought to solve that problem if the coder thinks to Google it (remember that many of these coders are not native English speakers).

No, the biggest problem is that whether it's a captcha or an encryption method, programming your own solution rather than using an open standard is an invitation to disaster... AND an opportunity for exploitation.

Last year I was helping a Fortune 50 company with PCI compliance when a mainframe coder told me that the system was now compliant because he had instituted 'encryption.' When I investigated, I discovered that he had 'invented' his own encryption method, which was simply an arbitrary scramble of the data fields.

I didn't bother trying to crack it, but I assured him that he needed a standards-based solution, and that his method would not suffice.

It occurred to me afterward that he had a great opportunity there. Assuming he was crooked (which is my job as an infosec consultant) I realized that if he had gotten his 'encryption' method to work he could then have sold its backdoor, secret key, or built-in vulnerability to the highest black-hat bidder.

Likewise with these weak captchas - the original Ticketmaster coders would be in an excellent position to exploit their knowledge of the captcha weaknesses in order to create their own scalping company.

Meanwhile here in Minnesota scalping has recently been made legal. http://tinyurl.com/3dadbr That's a brilliant solution - one way to fight crime is to take the laws off the books, then everything's legal!

Mike • December 10, 2007 3:44 PM

Google's CAPTCHAs are easy to read because they're phoenetic. They're also hard to crack because they are:
- In color.
- Anti-aliased.
- Using Serif fonts.
- Warped.

I recently received some spam for offshore computer services. Among the typical services they offered was programming and data entry, but I was quite interested to see that they also offer CAPTCHA entering services. It seems, then, that if you make something that can only be read by a human, you can just purchase a room full of humans to do the reading.

Low-value sites can get away with simple CAPTCHA systems to block spam bots. Here's an example: http://www.shamusyoung.com/twentysidedtale/?...

Mikeal • December 10, 2007 4:03 PM

I have a bunch of sites that saw significant drops in spam when I integrated reCaptcha.

Previous uses of Captcha were terrible. The harder the captcha was to break, the more users hated it. The ones the users didn't hate were so easy to break that spam just kept coming right through.

reCaptcha has been great tho.

anon • December 10, 2007 4:04 PM

What is "phoenetic"? You probably meant phonetic, which some are, but not most. The first thing that leapt to my mind though was Phoenician, an extinct language, which captchas aren't.

Steve BarbedWireKiss • December 10, 2007 4:06 PM

What about starting to test the cognitive capabilities of the client? You could have a captcha that displayed something like "3+8+21" and asked for the answer. If the client did not enter "32" then either a) they're not that good at maths, b) they don't know how to use a calculator, or c) they're some form of spambot.

It defeats the idea of just replaying the captured sequence back at the web form. It could even be made more difficult by showing the numbers in word form, so "three plus eight plus twenty-one". Of course, the spammers would eventually get around these too, but they might go some way to slowing them down, at least for the time being.

Anonymous • December 10, 2007 4:38 PM

@Steve BarbedWireKiss

No offense but that's a poor solution. I actually wrote a script to break that particular type of captcha that a friend of mine wrote (thinking he was clever). He wasn't. Even though he randomly put together strings like "five + 7 minus forty5 multiplied by 1" and the like, simple regular expressions can barrel through all of these in a flash.

Now, combining those with the types of image-warping garble that Google uses certainly couldn't hurt, but it seems as though they're doing alright without any math involved.

Captchas rely on entropy, obscurity, and some vague notion of particular things that come relatively simply to the human mind / eye, and are relatively complex for a computer. Mathematics is certainly not one of those things. It's what computers were built for.
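
For a sense of how little work such a solver takes, here is a rough Python sketch. It is not the commenter's script: the word lists, the `solve` name, and the choice to apply multiplication before addition and subtraction are all assumptions made for illustration. It handles mixed strings such as "five + 7 minus forty5 multiplied by 1" with one regular expression and two small lookup tables.

```python
import re

UNITS = {w: i for i, w in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {w: 10 * i for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split(), start=2)}
OPERATOR_WORDS = {"multiplied by": "*", "times": "*", "plus": "+", "minus": "-"}
TOKEN = re.compile(r"\d+|[a-z]+|[+\-*]")


def solve(challenge: str) -> int:
    """Evaluate a text arithmetic challenge that mixes digits and number words."""
    text = challenge.lower()
    # Treat "twenty-one" as "twenty one" so the hyphen is not read as minus.
    text = re.sub(r"(?<=[a-z])-(?=[a-z])", " ", text)
    for phrase, symbol in OPERATOR_WORDS.items():
        text = text.replace(phrase, f" {symbol} ")

    operands, operators = [], []
    current, after_tens = None, False
    for tok in TOKEN.findall(text):
        if tok in {"+", "-", "*"}:
            if current is None:
                raise ValueError("operator without a left operand")
            operands.append(current)
            operators.append(tok)
            current, after_tens = None, False
            continue
        if tok.isdigit():
            value, is_tens = int(tok), False
        elif tok in UNITS:
            value, is_tens = UNITS[tok], False
        elif tok in TENS:
            value, is_tens = TENS[tok], True
        else:
            raise ValueError(f"unknown token {tok!r}")
        if current is None:
            current = value
        elif after_tens and value < 10:
            current += value          # "forty" + "5" -> 45
        else:
            raise ValueError("two numbers in a row")
        after_tens = is_tens
    if current is None:
        raise ValueError("missing final operand")
    operands.append(current)

    # Multiplication first, then addition and subtraction left to right.
    terms, signs = [operands[0]], []
    for op, value in zip(operators, operands[1:]):
        if op == "*":
            terms[-1] *= value
        else:
            terms.append(value)
            signs.append(op)
    total = terms[0]
    for sign, term in zip(signs, terms[1:]):
        total = total + term if sign == "+" else total - term
    return total


if __name__ == "__main__":
    print(solve("3+8+21"))
    print(solve("five + 7 minus forty5 multiplied by 1"))
```

Nothing here needs image processing or machine learning, which is the commenter's point: once the challenge arrives as text, it is a parsing exercise.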

Gabriel • December 10, 2007 5:38 PM

Waldo:
Really, is there software that could differentiate between kitties and puppies, or maybe chickens from ducks? I could see software that builds a database of pictures, but is it really possible for a computer to actually interpret a picture like that? What kind of application uses that presently?

Guillaume • December 10, 2007 6:58 PM

Implementation is everything. A.I. comes next.

I once had a 30% accuracy just by running a stock compilation of ocrad, but most of the time, cracking a CAPTCHA has nothing to do with A.I.

Just like a system that boasted AES-256 encryption... with the "secret" key available on the Internet for all to see.

Most CAPTCHA protected applications can be tricked by bootstrapping the process by a human. No need for a Turing farm. Answer a single CAPTCHA yourself, and send the bot on its way using the same session.
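
The weakness described here is that one solved CAPTCHA unlocks a whole session. A hedged sketch of the opposite design, in Python: each correct answer yields a pass that can be spent exactly once. The `CaptchaGate` class and its method names are hypothetical, and the in-memory storage is an assumption for brevity; a real deployment would need expiry and shared storage.

```python
import secrets
from typing import Dict, Optional, Set


class CaptchaGate:
    """In-memory sketch: each solved CAPTCHA authorises exactly one action."""

    def __init__(self) -> None:
        self._pending: Dict[str, str] = {}   # challenge id -> expected answer
        self._passes: Set[str] = set()       # unspent one-time passes

    def issue(self, answer: str) -> str:
        """Register a new challenge and return its opaque id."""
        challenge_id = secrets.token_urlsafe(16)
        self._pending[challenge_id] = answer
        return challenge_id

    def verify(self, challenge_id: str, response: str) -> Optional[str]:
        """Return a one-time pass if the response is right, else None."""
        # pop() burns the challenge whether or not the answer is correct.
        expected = self._pending.pop(challenge_id, None)
        if expected is None:
            return None
        if not secrets.compare_digest(expected.encode(), response.encode()):
            return None
        one_time_pass = secrets.token_urlsafe(16)
        self._passes.add(one_time_pass)
        return one_time_pass

    def spend(self, one_time_pass: str) -> bool:
        """Consume a pass; a second use of the same pass is refused."""
        if one_time_pass in self._passes:
            self._passes.remove(one_time_pass)
            return True
        return False
```

Under this scheme a bot that inherits a human-solved session still gets only one protected request per solved challenge.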

Another favorite: put a "hint" in a hidden HTML field. Sometimes the developer isn't even aware it's there, because the hint is buried deep inside a JavaServer Faces or ASP.NET view state sent to and read back from the client. But it's there; you just need to flip the bit and you're in.

Before you put up a CAPTCHA, do the math: how many lost legitimate users it will cost you vs. how much giving your resources away will. Track usage and recalculate every now and then.
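
One way to frame that calculation, as a minimal Python sketch. The function name, its parameters, and every figure in the example call are made up for illustration; real values would have to come from your own usage tracking.

```python
def captcha_net_benefit(
    legit_users: int,
    abandon_rate: float,
    value_per_user: float,
    abusive_requests: int,
    block_rate: float,
    cost_per_abuse: float,
) -> float:
    """Abuse cost avoided minus value lost to users who give up at the CAPTCHA."""
    lost_value = legit_users * abandon_rate * value_per_user
    abuse_avoided = abusive_requests * block_rate * cost_per_abuse
    return abuse_avoided - lost_value


if __name__ == "__main__":
    # Illustrative numbers only: 3% of 10,000 users abandon, each worth $2;
    # the CAPTCHA stops 90% of 50,000 abusive requests costing $0.01 each.
    net = captcha_net_benefit(10_000, 0.03, 2.0, 50_000, 0.9, 0.01)
    print(f"net benefit per period: ${net:.2f}")
```

A negative result means the CAPTCHA costs more in lost users than it saves in abuse, which is the case worth re-checking as traffic changes.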

s • December 10, 2007 11:41 PM

What's really interesting is that despite all of the goings-on with CAPTCHAs and Ticketmaster, people are still willing to pay insane amounts of money for scalped tickets. If everyone would just refuse to pay above face value, the ticket scalping market would cease to exist.

http://securology.blogspot.com/2007/10/...

barrowboy • December 11, 2007 1:52 AM

@s (also relevant to albatross):

"If everyone would just refuse to pay above face value, the ticket scalping market would cease to exist."

True, but irrelevant because the face value is pretty irrelevant. The idiocy lies in setting a "face value" which is far below the market value. The main effect of doing this is to put a lot of money into the hands of middlemen ('scalpers'). Passing laws to make 'scalping' illegal just pushes more money into the hands of fewer scalpers (the ones who are willing to risk breaking the law). I don't understand why so many people seem to think that this is a good idea.

Bob • December 11, 2007 4:23 AM

Using pictures of kittens for CAPTCHAs may be cute, but they suffer from a limited number of kitten pictures - pretty soon the attacker will have cataloged all of them.

As for free-market scalping solutions - if tickets are sold at demand prices then the majority of fans will be denied access to seats by a minority of wealthy people prepared to pay $250/ticket. Yes it's free-market, but is it a desirable outcome? Remember that we can choose how things work!

DBH • December 11, 2007 6:43 AM

@barrowboy: the problem is market distortion: by buying up the supply you force an artificially high price. The provider of the service does not want to charge that much, hence the lower face price. The arbitrage that occurs happens because scalpers drive down supply, not because that is the right market price.

Note that the previous article, linked from this one, discusses the economics of various breaking methods. Here the most important element is time. If a buyer is limited to 4 tickets every minute, that is probably sufficient, since so many people crash these sites. So farming the captcha out may not be sufficient.

Bill P. Godfrey • December 11, 2007 6:45 AM

Bob...

I'm denied access to a brand new sports car by a minority of wealthy people prepared to pay the high price.

I guess I'm not clear how tickets and sports cars differ. (Unless you also think that sports cars should be sold the same.)

aikimark • December 11, 2007 6:50 AM

What about a self-enforced ethics rule by the online sites that would prevent event tickets from exceeding the scalping maximum for the location of the event?

For instance, North Carolina "prohibits scalping tickets but allows resellers to impose a reasonable service charge up to $ 3"
http://www.cga.ct.gov/2006/rpt/2006-R-0761.htm

bob • December 11, 2007 7:29 AM

@Gabriel, Bob: Good idea to use photos. I was thinking the same thing (ironically) - limited domain for photos.

However there are whole slews of websites devoted to kitties, puppies, birds and so forth. A captcha service could have people submit their own pictures, or contract with the kitty/puppy/horse/iguana/whatever websites to supply them ongoing streams of photos for a flat fee (discarding animals like Mr Bigglesworth where you can't tell what it is - that would be the equivalent of an unreadable captcha) and it would be like a one-time pad - never need to repeat a photo. Actually there's all kinds of photos you could use. Cars. Airplanes. Scenery. Clothing. Even pictures of computers.

After a transaction is entered, have a screen come up showing the following photos: a horse, dog, cat, bird, mountain, sweater, Apollo 13, Bruce Schneier or Chuck Norris (with permission; wouldn't want either one of them mad at you), a car and a laptop.

Then assign a "random" 2-digit number to each photo and emboss it into the picture. Humans have to type the number that corresponds to the horse or laptop. A computer would have to guess the subject and de-captcha the number. Worst case a 90% defeat ratio (if it decodes the emboss 100% correctly and guesses 1 of 10 photos), but approaching 99.9% (guessing right photo AND guess a 00-99 number that's on it).
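
The two figures in the comment can be reproduced with exact fractions, as in this small Python sketch. It assumes 10 photos per challenge, a single correct photo, and codes drawn uniformly from 00-99; the variable names are illustrative. It also adds a third case the comment does not consider: a bot that ignores the photos and simply types a random two-digit number, which under the same assumptions succeeds 1 time in 100.

```python
from fractions import Fraction

PHOTOS = 10   # photos shown per challenge, as in the comment
CODES = 100   # two-digit codes 00-99


def defeat_ratio(bot_success: Fraction) -> float:
    """Fraction of bot attempts that fail."""
    return float(1 - bot_success)


# Bot reads every embossed code perfectly but must guess which photo is right.
reads_codes_guesses_photo = Fraction(1, PHOTOS)

# Bot guesses a photo and then, independently, a code (the comment's 99.9% case).
guesses_photo_and_code = Fraction(1, PHOTOS) * Fraction(1, CODES)

# Bot ignores the photos and types one random code (not in the comment).
blind_code_guess = Fraction(1, CODES)

if __name__ == "__main__":
    for label, p in [
        ("reads codes, guesses photo", reads_codes_guesses_photo),
        ("guesses photo and code", guesses_photo_and_code),
        ("types a random code", blind_code_guess),
    ]:
        print(f"{label}: defeat ratio {defeat_ratio(p):.1%}")
```

Because the only thing submitted is the number, the blind guess puts a ceiling of about 99% on the defeat ratio for a single accepted answer, and lower still if either of two photos is accepted.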

DBH • December 11, 2007 7:40 AM

Another way for Ticketmaster would be to make it like an airplane ticket. You have to show ID at the theater, and your name is on the ticket. This way, a ticket purchased would have to have someone in the party with a name submitted at purchase time to gain admittance... resale market goes to zero...

Bob • December 11, 2007 8:59 AM

@Bill P. Godfrey "I'm denied access to a brand new sports car by a minority of wealthy people prepared to pay the high price. I guess I'm not clear how tickets and sports cars differ.":

I'd say tickets and sports cars do differ.

The limiting factor of ticket sales is the number of tickets (assuming the act's not lousy!), so the price could be increased so long as it doesn't shrink the market below the number of tickets. I imagine this isn't done (officially) for PR reasons - it would upset the album-buying fans who can't get tickets.

There's no fixed limit for sports car production so there's an incentive to keep the profits reasonable (access to a greater market). Though the über-rich may be willing to pay so much that a loss in sales would be worth it...

bob • December 11, 2007 9:41 AM

Actually, the really high end sports cars (Ferrari, Aston Martin et al) are production limited. When they produced the Ferrari Enzo, only existing Ferrari customers were allowed to bid on one.

And tickets to a show are not really limited as long as the star is still alive and can keep putting on shows or switch to larger venues.

I just heard that in the recent Led Zeppelin concert 20,000,000 fans were in a lottery to be able to buy 18,000 tickets. Sounds like the price was too low.

Dim Lightbulb • December 11, 2007 10:06 AM

@All 'Free-market' Suckers (the reason the stats lie at 99%-1% from 90%-10% in the last decade)

DBH said it best, I think. I'm no economic mastermind, but it's basic gradeschool stuff here folks:

"The arbitrage that occurs happens because scalpers drive down supply, not because that is the right market price."

That's so true. Now, it could occur (I suppose) that the face value of the tickets could wax and wane depending on immediate supply/demand issues (such as the date of the show and the remaining number of face value tickets), but if some jagoff holds 25% of the tickets to a sold out show, that's not the 'free market' working its magic, that's extortion through the illegal hoarding of finite supplies.

Seems to me (although this will get a horde of flaming, even from myself, re privacy) that if you keyed each ticket purchased to a single unique person (and required a valid ID for that one name only to collect your ticket), that many of these scalping issues would be solved. It's clearly logical, considering no one is allowed to resell tickets over face value, and that the tickets have a one-to-one relationship with the ticket holder (unless you are gigofat and need three seats, or a conjoined twin).

As for legitimate resale, well that would work out fine, because you'd allow resale of a ticket to someone else, keying it on their name, thereby forfeiting your seat, and you wouldn't be granted another ticket because you've already purchased one. Giving gifts would be easy, because you'd just key the ticket on the name of the person who would be using it (although it might make it quite simple for people to blow their own surprise, much like amazon 'wish-lists'). The only issues I can see with this are privacy related, and annoyances with people who can't make up their mind whether or not they are actually going to go to a show they purchased a ticket for.

This wouldn't completely eradicate scalping, *but* you're making it extremely difficult to hoard the tickets, forcing the scalpers to do far more leg-work than profits would (possibly, and most likely usually) justify.

FP • December 11, 2007 11:01 AM

@Dim Lightbulb: "scalpers drive down supply"

When Donald Trump buys 50% of some company, he drives down the supply of that company's limited stock. But in his case, it's not called scalping but investment. That's part of a free market. Deal with it. Imagine what would happen to the economy if stock was personalized and could only be sold at a fixed face value.

If ticket prices were market driven, there would be no margin for scalpers, and scalping would cease. Sure, there might be "investors" speculating on future demand, but if the original price was truly fair, there'd be fewer people paying more, and the net gain would be zero.

As I've argued before, selling tickets below fair market value is nothing but brilliant marketing. When priced beyond reach, the general public would lose interest in concerts or sports events, and stop pouring billions into those markets. But this way, fans just feel "unlucky" that they didn't get tickets and blame their misfortune on scalpers.

Wicked Lad • December 11, 2007 11:34 AM

I could understand the passionate objections to free-market profiteering if we were talking about the price of, say, kidney dialysis, but we're talking about a *concert* here, folks. When CAPTCHAs are used to protect access to donated livers for transplant, I may get riled up.

Anonymous • December 11, 2007 11:46 AM

@FP

Seems an absolutely ridiculous metaphor for ticket scalping: the stock market. Sorry to say, but it's a pretty weak association. Trump buying 50% of a company doesn't raise the price of the stock due to its limitation of availability (although perhaps some morons would be willing to pay more money for stock in a company that Trump owns 50% of, but that's beside the point). The price could still fall to nothing at any point.

Another point that corporate stock has no relevance here is that a ticket is a TICKET, a VOUCHER for obtaining access to some event, one ticket = one person. That's it. There are very few legitimate situations demanding more than one ticket per person, and I don't believe those issues are too difficult to deal with, while retaining the ability to prevent abuse.

Think all you want about what the face-value of tickets should be and what should govern them, I really don't care, I just really can't stand people's justification of ticket scalping by saying 'free market' and using ridiculous metaphors to explain it away.

Scalping amounts to extortion, it has nothing to do with investments, and it's more a rape of 'free market' vulnerability than anything else.

Not to mention, people who equate business to artistic performances really make me sick. But that's neither here nor there.

FP • December 11, 2007 5:14 PM

@Anonymous: "equate business to artistic performances"

If performances were not about business, then the artists could easily increase supply to meet demand. If they can sell the tickets, let Led Zeppelin perform year-round at Madison Square Garden! It works for Cirque du Soleil or Blue Man Group (which are admittedly franchises) or Celine Dion (insert joke about artistic quality). Tickets are plentiful, and there is no scalping. Instead, there is price dumping. As a consumer, you can pay face value or play the market for cheap last-minute seats.

Another example: in the nineties, Walt Disney Pictures limited supplies of their animated movies ("available for a limited time only"). Despite their original fixed retail price, used VHS tapes soon ran in the hundreds of dollars. One tape = one person. According to your logic, this scalping should have been prohibited?

Any day, you can purchase "limited editions" of something, such as an original painting. Much art today is bought with the intention of resale. Is it scalping to sell a Van Gogh for millions?

bob • December 13, 2007 10:57 AM

No matter how many people are scalping or how many tickets they buy, NO ONE is paying more for a ticket than they are willing to pay for that ticket. And that determines the value of the seat(s).

If the venue itself held an auction for each seat in the theater they would sell for the same prices they do now (assuming you could still bid right up until the show started), except they would be the ones to keep all the markup, instead of having to split it with scalpers.

Photo of Bruce Schneier by Per Ervland.

Schneier on Security is a personal website. Opinions expressed are not necessarily those of Co3 Systems, Inc.