You’ve all seen CAPTCHAs: those distorted pictures of letters and numbers you sometimes see on web forms. The idea is that it’s hard for computers to identify the characters, but easy for people to do. The goal of CAPTCHAs is to authenticate that there’s a person sitting in front of the computer.

KittenAuth works with images. The system shows you nine pictures of cute little animals, and the person authenticates himself by clicking on the three kittens. A computer clicking at random has only a 1 in 84 chance of guessing correctly.
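
The 1-in-84 figure is simply the number of ways to choose 3 pictures out of 9. A quick check, assuming exactly three of the nine images are kittens and the bot clicks three distinct images uniformly at random (the function name is illustrative, not part of KittenAuth):

```python
from math import comb


def random_guess_space(total_images: int, kittens_to_pick: int) -> int:
    """Number of equally likely selections; a random guess succeeds 1 time in this many."""
    return comb(total_images, kittens_to_pick)


print(random_guess_space(9, 3))  # 84
```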

Of course you could increase the security by adding more images or requiring the person to choose more images. Another worry—which I didn’t see mentioned—is that the computer could brute-force a static database. If there are only a small fixed number of actual kittens, the computer could be told—by a person—that they’re kittens. Then, the computer would know that whenever it sees that image it’s a kitten.
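
A minimal sketch of that static-database attack, assuming the site serves byte-identical image files every time. The table and function names here are hypothetical; a person labels each image once, and from then on the program only does lookups:

```python
import hashlib
from typing import Dict, Optional

# Hypothetical table filled in by a human labeler: image digest -> "is this a kitten?"
known_labels: Dict[str, bool] = {}


def fingerprint(image_bytes: bytes) -> str:
    """Exact-match fingerprint; only works if the same bytes are served each time."""
    return hashlib.sha256(image_bytes).hexdigest()


def learn(image_bytes: bytes, is_kitten: bool) -> None:
    """Record a label that a person supplied for this image."""
    known_labels[fingerprint(image_bytes)] = is_kitten


def recognize(image_bytes: bytes) -> Optional[bool]:
    """Return the stored label, or None if the image has never been labeled."""
    return known_labels.get(fingerprint(image_bytes))


learn(b"bytes-of-kitten-photo-1", True)
learn(b"bytes-of-puppy-photo-1", False)
print(recognize(b"bytes-of-kitten-photo-1"))
print(recognize(b"bytes-of-unseen-photo"))
```

With a small fixed image set, the table is complete after a handful of visits, which is why the size and freshness of the image pool matter more than the number of pictures shown per challenge.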

Still, it’s an interesting idea that warrants more research.

Posted on April 10, 2006 at 1:19 PM • 82 Comments


Andre LePlume April 10, 2006 3:00 PM

If I wasn’t afraid of being sued by the Children’s Television Workshop, I’d create a “three of these things belong together” test :^)

Durable Ally April 10, 2006 3:10 PM

@Juergen: Why not? Call me naive, but the images could be substituted by series of sounds. You could ask the user to click whenever he or she listens to a particular class of sound (like classical music instead of rock.)

Kelly April 10, 2006 3:20 PM

Like CAPTCHAs though, these are still vulnerable to automation attacks in the same way. The way that I’ve heard people break CAPTCHAs is to have an IM bot that spams the CAPTCHA image out to everyone on its buddy list, and then gives the first person to respond with the correct code some free pornographic images or something else that would entice people to respond.

Sencer April 10, 2006 3:30 PM

The following is not a captcha replacement, but it uses a similar idea for authentication:

(Basically you freely click a number of points in an image; and later to authenticate you have to click on those points again. It allows for a varying degree of precision, and lets you add several “fake” additional clicks, so that shoulder surfers can’t steal your “passclicks”)

Does anyone know of any research on that method? What would the attributes of that measure have to be to approximately equal (or surpass) regular password security?

meme April 10, 2006 3:35 PM

pictures of kittens seems like a poor idea in contrast to distorted lettering, the reason being that distorted words can be rendered in any imaginable font and pre-set distortion characteristics. kittens on the other hand, well… good luck. teaching a computer to recognize an image of anything gets pretty easy with a fixed data set… see standard examples on fuzzy logic and neural network image analysis.

Fred Page April 10, 2006 3:39 PM

This would also be marginally better either if one needed to select 4 kittens out of 5 (1 in 126), or if the request was “all” (as opposed to 3), and the number of kittens (1-9) was fairly random.

One could also do a “Keno”-type solution – if you have, say 80 images, 20 of which are kittens, having a user select say, 17 (i.e. not all) would give fairly good assurance that either there was a Kitten-recognizer (like a human) on the other end, or someone had found a successful attack.
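
For a rough sense of the numbers, here is a sketch under two assumptions that are not spelled out in the comment: the bot picks distinct images uniformly at random, and for the "click all kittens" variant it guesses uniformly among all non-empty subsets of the grid. The function names are illustrative.

```python
from fractions import Fraction
from math import comb


def all_picks_are_kittens(total: int, kittens: int, picks: int) -> Fraction:
    """Chance that `picks` distinct images chosen at random are all kittens."""
    return Fraction(comb(kittens, picks), comb(total, picks))


def unknown_count_guess_space(total: int) -> int:
    """Non-empty subsets to choose among when the number of kittens is not given."""
    return 2 ** total - 1


keno = all_picks_are_kittens(total=80, kittens=20, picks=17)
print(f"Keno-style, 17 correct picks from 80 images: about {float(keno):.1e}")
print(f"'Click all kittens' on nine images: 1 in {unknown_count_guess_space(9)}")
```

The same helper reproduces the original scheme: `all_picks_are_kittens(9, 3, 3)` is 1/84.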

Eric April 10, 2006 3:44 PM

Well the problem with this technique is that they’ll likely have a finite set of cat pictures. Before too long, anyone (or robot) visiting the site will have seen them all.

With traditional CAPTCHAs you have an incredibly large number of permuted words, and the distortions themselves were probably governed by real-valued parameters that produced even more degrees of freedom.

Perhaps some distortions would be possible with kitten photos, but I suspect that one might not be able to get away with the same degree of variability.

@Kelly, I know that technique was reported, but it may have been theoretical. I don’t think anyone’s found it implemented.

@Sencer, you’re confusing a password scheme with a robot-filtering scheme.

pfig April 10, 2006 4:07 PM

why limit the photos you have to x? get 6 photos off of flickr tagged with animal but not tagged with kitten, get 3 tagged with kitten. adjust the numbers as you please. otp kittenauth.

josh April 10, 2006 4:15 PM

I like it, but my eyesight is bad and the sun is shining in my window. I had a great deal of difficulty with this test until I pulled down the blinds. Had the guy been selling something, I might have moved on to another vendor. Life is tough.

Ian April 10, 2006 4:20 PM

@Fred Page

I’m no statistics expert, but I’m reasonably certain there are only 5 combinations that involve selecting 4 out of 5 pictures, as long as order doesn’t matter. Look at it another way: Pick the 1 pic that ISN’T a kitten.
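
The counts are easy to check. Choosing 4 of 5 gives 5 combinations, the same as choosing the 1 picture to leave out. The 126 figure matches choosing 4 out of 9, which may be what was intended, though that is only a guess:

```python
from math import comb

four_of_five = comb(5, 4)   # pick the 4 kittens out of 5 pictures
one_of_five = comb(5, 1)    # equivalently, pick the 1 non-kitten
four_of_nine = comb(9, 4)   # pick 4 kittens out of a nine-picture grid

print(four_of_five, one_of_five, four_of_nine)  # 5 5 126
```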

Fred Page April 10, 2006 4:24 PM


Short answer: It won’t.

The image is too small, there are too few objects that most humans will gravitate towards, there are too few clicks, it scales poorly, and worst of all, the “obfuscation” works for the attacker. Even giving it unreasonable benefits of the doubt, I’m finding the given implementation worse than a 10-character random password where you limit your alphabet to 62 characters (Ex: A-Z,a-z,0-9).
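
To put rough numbers on that comparison, here is a back-of-the-envelope sketch. The password side follows directly from the figures above; the click side uses made-up parameters (a 400x300 image, 20-pixel tolerance squares, 5 clicks) that are assumptions for illustration and not measurements of the actual scheme.

```python
from math import log2


def password_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random password."""
    return length * log2(alphabet_size)


def passclick_bits(width: int, height: int, tolerance: int, clicks: int) -> float:
    """Upper bound in bits, treating the image as a grid of tolerance-sized cells."""
    cells = (width // tolerance) * (height // tolerance)
    return clicks * log2(cells)


# 62-character alphabet, 10 characters, as in the comment above.
print(f"password:   {password_bits(62, 10):.1f} bits")
# Hypothetical click-scheme parameters, chosen only for illustration.
print(f"passclicks: {passclick_bits(400, 300, 20, 5):.1f} bits (upper bound)")
```

The click figure is an upper bound because it assumes every cell is equally likely to be chosen; if people gravitate towards a few obvious objects, the real number is lower.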

If you want a long answer, post the question on a discussion of passwords.

bpo April 10, 2006 4:26 PM

Since exhaustion of the kittenspace and fast advances in the realm of Kitten Assertion Technology are real worries here, I would suggest a slight modification of the system. Instead of kittens, subjects should be asked to pick out a series of images that are “cute”, allowing for a small margin of error (I would suggest +/- 1 for small sets) for personal taste.

Jeremy April 10, 2006 4:39 PM

What if it tied in to flickr?

I don’t want to go too Web 2.0, but using tags to choose images, flickr would be a pretty effective image database to pull from. Of course, flickr might not like that, and it wouldn’t scale well (zillions of users or whatever.)

Jeremy Dunck April 10, 2006 4:50 PM

@Other Jeremy:
This is one of the millions of project ideas in my list.

Pointers and previous permission:

Use creative commons search to avoid copyright issues.

It was down w/ bugs, but it’s back:

———- Forwarded message ———-
From: Flickr Support
Date: Thu, 10 Mar 2005 16:16:10 -0800 (PST)
Subject: [Flickr Case 5679] Re: CAPTCHA? harvesting images?

Hi jdunck,

If you’re just displaying images from Flickr in another
site, that’s ok, as long as you use only images with CC
licenses that permit this and link every photo back to



Rich April 10, 2006 4:56 PM

I can see a business for a captcha web service. You request a captcha, and get back a question/answer. Behind it would be a large DB of ‘easy’ questions, like “yellow and blue make ?” Being all text, it would be accessible to screen readers.

The difficult part would be a large enough DB of ‘easy’ culture-neutral questions. And of course it wouldn’t be very i18n.

Singsing April 10, 2006 5:00 PM

Another thing, perhaps, would be to warp the image of a kitten around a rotated object.

Perhaps, also, the choice doesn’t need to be just “kitten” or “not kitten.” Depending on the number of tags the site owner has given each image, the user could have to pick all green kittens (green kittens?), kittens with their mouths open, kittens lying down, kittens with other kittens, whatever.

Nicholas weaver April 10, 2006 5:02 PM

Remember, Captchas are first and foremost how “lazy cryptographers do AI”. You want someone to be able to write image recognition programs to solve these hard problems in computer vision.

So you make it a Captcha and let the blackhats take care of it.

COD April 10, 2006 5:03 PM

I installed a math-based Turing test on one of my weblogs this weekend. You have to answer a simple math question (0-9)+(0-9) to authenticate the comment. Hopefully it will keep both spambots and complete idiots out of the comments 😉

It’d be easy enough to beat – but I’m going with the not worth the effort theory for now. Once a million people use this particular WordPress plug in it’ll be a target worth beating.
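
A toy version of such a test, and of how little it takes to beat it, might look like the following. This is a sketch only; it is not the WordPress plug-in's code, and the question wording and function names are invented.

```python
import random
import re
from typing import Tuple


def make_challenge(rng: random.Random) -> Tuple[str, int]:
    """Return a (0-9)+(0-9) question and its expected answer."""
    a, b = rng.randint(0, 9), rng.randint(0, 9)
    return f"What is {a} + {b}?", a + b


def check_answer(expected: int, submitted: str) -> bool:
    """Accept the comment only if the submitted text is the right integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        return False


def naive_bot(question: str) -> str:
    """A spambot that pulls the two numbers out of the question and adds them."""
    a, b = (int(n) for n in re.findall(r"\d+", question))
    return str(a + b)


question, expected = make_challenge(random.Random())
print(question, "bot passes:", check_answer(expected, naive_bot(question)))
```

The bot needs three lines, which fits the "not worth the effort" reasoning: the test holds only as long as nobody bothers to target it.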

Urs April 10, 2006 5:07 PM

KittenAuth is not a CAPTCHA. The “P” in CAPTCHA stands for “public”, meaning that an attacker is assumed to have access to the database underlying the CAPTCHA. That’s why PIX randomly distorts pictures before showing them.

Fred Page April 10, 2006 5:07 PM

Why do you think that challenge questions would be harder for a computer to answer correctly than a human?

Filias Cupio April 10, 2006 5:13 PM

Using computer graphics technology, it shouldn’t be hard to generate an unlimited number of different pictures of kittens.

Coda Hale April 10, 2006 5:13 PM

I’ve seen this one making the rounds, and it strikes me as an interesting take on a terrible idea.

First, there are few things which a human can do over the internet which a computer can’t, given sufficient time and resources. There are also few uniquely human tasks which a computer can judge as being genuine or fake. The intersection of these two sets could very well be empty.

Second, any CAPTCHA can be foiled via the grandmaster MITM attack, especially, as a previous poster pointed out, when porn is on the line.

Third, global economics make it feasible to defeat any unbeatable CAPTCHA by using actual humans in foreign sweatshops. (c.f. World of Warcraft farmers in China)

Fourth, visual CAPTCHAs are totally useless to the ~10 million people in the US with visual impairments. Non-sighted users will, given sufficient provocation, sue you for locking them out of your website. (c.f. Target vs. NFB.) Usability-wise, CAPTCHAs are a giant “FU” to anyone with less than perfect vision.

Fifth, even sighted users have a hard time with visual CAPTCHAs. I have better than 20/20 vision and manage a ~90% success rate with the distorted word CAPTCHAs. I have no doubt in my mind that someone, somewhere has an algorithm which could do better.

Sixth, the determination between robot and person isn’t necessarily a useful one. What does it matter that a robot posts to a forum, provided that robot has something interesting to say?

Better to focus on the unwanted behavior (i.e., spamming) than on a fuzzy profile of who we think spammers are and what we think they’re capable of. Text analysis, good interaction design, and collaborative filtering are much stronger solutions than having users play a quick game of kitten whack-a-mole before they can post.

Wow April 10, 2006 5:31 PM

Like “Failed”, I too failed the task a couple of times due to very poorly rendered kittahs. Neat concept though, and I suppose we do know that he/she is a Slashdot reader.


Nick April 10, 2006 6:06 PM

The problem with using flickr is, as they said, you’d have to link back to flickr with each image. What’s to stop a robot simply fetching the flickr pages and extracting the tags?

Unixronin April 10, 2006 7:16 PM

I went to try the test …. and it didn’t even work. A three-by-three grid of “image will load here in a moment” patches loaded, then instantly collapsed to empty rectangular frames about 3 pixels high.

I think either somebody broke something, or somebody got too clever with their Javascript.

Archangel April 10, 2006 10:15 PM

@Durable Ally

Sounds can be waveform mapped easily. The difficulty with AI-proofing an auditory test while keeping it human-performable becomes the relative categories of sounds. Make them too much likeness-based, and the AI wins; make them too abstract in category, and some humans lose. Make the category delta too large, and waveform mapping works again; make the category delta too small, and you confuse legitimate users.

For that matter, computer programs exist for facial profile matching; it isn’t too much of a stretch to think that the face of a kitten or the shape of a kitten could be broken down to essential components and relative positions, and pattern-matched. The challenge just has to be combined with enough motivation towards cracking the system. Increase the user base, and poof, you’ll hit that threshold eventually, because this is a big business.

If you have a database of photos, and the computer placing the db references in pattern knows what pictures are what, you can shift X in “Pick three X” and make it some variable pattern, but then you have to isolate the cgi or whatever code sets up the match game from the actual output to keep it from being bot-readable.

Longwalker April 10, 2006 11:17 PM

Very few human problems can be solved exclusively through technology. Spam really is no exception. Captchas help right now, but only because the blackhats haven’t devoted many resources to defeating captchas yet.

Dealing with spam requires a human solution. A very small handful of ‘people’ (and I use that term lightly) are responsible for 90%+ of all spam on the Internet. Detain or kill them and 90% of the spam problem goes away.

Foxyshadis April 10, 2006 11:25 PM

It probably wouldn’t be too hard to have a web interface, where you define a set of categories and then upload pics that fall into those categories; the software resizes them and if they look good enough keep them. Then like someone suggested, have the user pick a varying number of pics from a varying category. You decrease the odds of random success tremendously doing that.

But again, all this does is bandage the captcha, which is still fatally crippled by using humans as pattern recognizers.

Maybe for the next iterations we can have people work one of those 5×5 sliding puzzles, and have them type in what they think it shows when done. =p

Peter Dowley April 10, 2006 11:51 PM

Another image-based auth mechanism I’ve come across is Passfaces (at This is really aimed to be a password replacement approach though, unlike KittenAuth which is trying to limit automated attacks.

The basic approach of Passfaces is that the user gets assigned some known faces and trains to remember them. The authentication process then has the user getting shown three panels of faces and having to choose their known face in each panel.

Interestingly there is some research behind the Passfaces approach, which showed that a ‘password’ of face images is easier to remember than a normal text-based password. They also looked at face images vs. more general images, and highlighted some of the problems with letting users choose their own image password … turns out that there is strong gender-based clustering on which images are more likely to be selected.

Wim L April 11, 2006 12:28 AM

The MetaFilter thread on this (a day or three ago) had a number of interesting ideas, mixed in with the usual chatter.

Tank April 11, 2006 1:17 AM

I saw one recently where you had a photograph of a London street and you picked a number of features with your mouse.
On return to the site you would authenticate by selecting the same spots on the picture with a variance of 6 pixels.

End effect – I couldn’t be bothered remembering all that crap just for a website registration. Either your website is special enough to be kept that secret or it isn’t. It isn’t.
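
For reference, the check such a site has to perform can be sketched in a few lines. This assumes "a variance of 6 pixels" means each click must land within 6 pixels of the enrolled point, and that the points must be clicked in the original order; both are guesses about the scheme, and the names are illustrative.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[int, int]


def clicks_match(enrolled: List[Point], attempt: List[Point], tolerance: float = 6.0) -> bool:
    """True if every attempted click is within `tolerance` pixels of its enrolled point."""
    if len(enrolled) != len(attempt):
        return False
    return all(
        hypot(ax - ex, ay - ey) <= tolerance
        for (ex, ey), (ax, ay) in zip(enrolled, attempt)
    )


enrolled = [(120, 45), (300, 210), (18, 95)]
print(clicks_match(enrolled, [(123, 47), (297, 214), (18, 100)]))  # close enough
print(clicks_match(enrolled, [(130, 45), (300, 210), (18, 95)]))   # first click too far off
```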

Student April 11, 2006 1:39 AM

Captchas are little more than a weak stopgap technology that makes it slightly harder for the spammers. The more advanced spammers today regularly run “free” pron0 sites where each picture is viewed in return for solving a captcha. And human nature makes these sites extremely effective. Just like forcing the other person to solve a computing problem (countered by botnets and distributed computing), this technology is unlikely to lead anywhere in the medium to long term. Also, as some have mentioned, the harder the captchas get, the harder they are to solve for people. An interesting fact is that there are a lot of Internet users that are blind, deaf or illiterate (in your language of choice). Have you encountered a Chinese captcha yet?

There is no one solution to spam and bot abuse, and I think authorisation tests will be more efficient than these turing tests. I also think that targeting the source of the spam will be more efficient in the long run.

What really would solve a lot of these problems is using PKI for authorisation, but for some reason I doubt that PKI solutions will get more common.

RonK April 11, 2006 2:24 AM

If you check the CAPTCHA home page at you find that many (most?) of the common text-recognition-based CAPTCHA’s (of the “gimpy” family) have been broken by computer vision researchers.

With regard to text-based CAPTCHA’s which wouldn’t discriminate against people with vision problems: in their Eurocrypt paper, they state that computer programs have a hard time distinguishing a sensible paragraph from a computer-generated one. This led me to the idea of using defunct Wikipedia text as a secret database of sensible paragraphs, but with very little work I discovered that the defunct text is publicly available also.

Still, any large corporation could pay someone to continually refresh their secret database (the resulting test would not be a CAPTCHA, as Urs pointed out above).

Paeniteo April 11, 2006 3:03 AM

I found out that if you keep reloading a particular image, it always stays “kitten” or “non-kitten”.
I don’t know how exactly the system decides when it will display a kitten (i.e. could one decide based on the URL alone or is some session-mechanism involved), but one surely can pull really many kittens out of their DB, once you know whether a particular URL means kitten or non-kitten.

They should improve the system by combining the nine images into a single larger one, with the clicking done via image-maps.

john April 11, 2006 4:35 AM

How exactly would you build a database of kitten images, as determined by human beings, that could be recognised again by the computers?

Because those images are served randomly by URLs like “”, if you notice — so you can’t do it by URL.

Now, if they’re just being served from files, yes, you could build a list of checksums or whatever, but if you served them dynamically from GD or ImageMagick or whatever, with a slight variation in JPG quality or just 3 or four pixels cropped in one of four directions, that would radically increase the problem space, wouldn’t it?

Dan April 11, 2006 4:45 AM

Just because they have lots of pictures of kittens doesn’t mean that they aren’t cyberterrorists

Paeniteo April 11, 2006 5:10 AM

@john: I’m no expert at image-recognition, but I think one can overcome both JPG quality variations (by re-sampling to a common low quality) and cropping borders (by only comparing the centers of the images).
There are sophisticated algorithms to measure similarities between images.

While the URL may look random, it will always serve the same class of image (kitty / other).
One could reload the particular URL a few thousand times and in the process probably get a good idea what variations they do (if any – it does not seem so) and which images they use as source (not too many, as it appears).

They will have to go some steps further to improve the security of their system, before it is ready for general use.
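
A toy illustration of the resampling and center-comparison idea above, written in plain Python over grayscale pixel grids so it needs no imaging library. It is a simple "average hash", one of the cruder similarity measures, and the margin, grid size, and synthetic test images are arbitrary choices for the sketch.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels, 0-255, indexed as image[y][x]


def center_crop(pixels: Image, margin: int) -> Image:
    """Drop `margin` pixels from every edge so border cropping matters less."""
    return [row[margin:len(row) - margin] for row in pixels[margin:len(pixels) - margin]]


def downsample(pixels: Image, size: int) -> List[List[float]]:
    """Average the image down to size x size blocks, discarding fine detail."""
    h, w = len(pixels), len(pixels[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(pixels: Image, margin: int = 4, size: int = 8) -> Tuple[bool, ...]:
    """One bit per block: is the block brighter than the image's mean?"""
    flat = [v for row in downsample(center_crop(pixels, margin), size) for v in row]
    mean = sum(flat) / len(flat)
    return tuple(v > mean for v in flat)


def hamming(a: Tuple[bool, ...], b: Tuple[bool, ...]) -> int:
    """Number of differing bits; small means 'probably the same picture'."""
    return sum(x != y for x, y in zip(a, b))


# Synthetic 32x32 stand-ins for photos: one bright on the left, one bright on top.
left_bright = [[255 if x < 16 else 0 for x in range(32)] for y in range(32)]
top_bright = [[255 if y < 16 else 0 for x in range(32)] for y in range(32)]
# The first image again with small per-pixel noise, standing in for re-compression.
noisy_left = [
    [min(255, max(0, v + ((x * 7 + y * 3) % 5) - 2)) for x, v in enumerate(row)]
    for y, row in enumerate(left_bright)
]

print("same picture, noisy:", hamming(average_hash(left_bright), average_hash(noisy_left)))
print("different pictures: ", hamming(average_hash(left_bright), average_hash(top_bright)))
```

An exact checksum would treat the noisy copy as a brand-new image; a coarse fingerprint like this one does not, which is the point being made about quality variations.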

Parsi April 11, 2006 5:38 AM

This technique targets the bot/human distinction. But, for blog comment sentry purposes, it could also be used to make human/human distinctions. Thus changing the KittenAuth instruction to “Click three pictures of edible animals” would get very different ethnic/regional/religious results over a picture domain that included cats, cows, dogs, horses and pigs.

Political blogs could use a set of pictures of humans and require commenters to identify three “whose views you often agree with”. This would be a barrier to those unfamiliar with the appearance of the relevant people (and who would thus be deemed to lack the sophistication required of a commenter on the blog). And it might even deter some trolls from the other side of the political fence if each comment began automatically with a sentence of the form “I often agree with the views of X, Y and Z” where X, Y, and Z are the names of the three politicians or pundits whose images the commenter had had to click on in order to be able to post.

Ale April 11, 2006 7:26 AM

Short of a full blown Turing test, there will always be the possibility for false positives and negatives with any CAPTCHA. The point is to increase the computational cost involved in an automated attack, thus driving nuisance bots elsewhere (outrun not the bear, but the other campers). It is a good idea to have a good set of them, so that even the CAPTCHA itself can be varied from transaction to transaction. Our brains are used to this kind of uncertainty, and thus should not be too heavy on the users.

Matthew Skala April 11, 2006 8:01 AM

I object to the very concept of CAPTCHAs because they violate the end-to-end paradigm. You don’t have the right to know whether I am using a robot on the Net or not as long as I don’t make trouble for other users. Many of the best, most important and worthwhile applications of the Net come from adding functionality to existing services in ways unintended by their creators. CAPTCHAs are all about preventing people from doing that.

I don’t think the “Semantic Web” or whatever they’re calling it these days will ever amount to anything, but note that CAPTCHAs are directly at odds with the goals of the Semantic Web.

Bruce Schneier April 11, 2006 8:23 AM

“I object to the very concept of CAPTCHAs because they violate the end-to-end paradigm. You don’t have the right to know whether I am using a robot on the Net or not as long as I don’t make trouble for other users. Many of the best, most important and worthwhile applications of the Net come from adding functionality to existing services in ways unintended by their creators. CAPTCHAs are all about preventing people from doing that.”

That’s a reasonable objection, but it breaks the Internet advertising paradigm. Advertising doesn’t work unless there’s a person at the receiving end, and not a bot.

Bruce Schneier April 11, 2006 8:24 AM

“Just because they have lots of pictures of kittens doesn’t mean that they aren’t cyberterrorists”

No. They would have to have access to an on-line almanac.

Fred Page April 11, 2006 8:32 AM

“Have you encountered a Chinese captcha yet?”

Good point. When I first started reading Japanese, I had a heck of a time trying to distinguish even between classes of characters (Hiragana vs Katakana vs Kanji), never mind identifying the correct character. Even now, I don’t think that I could read any but fairly trivial Japanese CAPTCHAs. I’d assume that similar problems would occur (in English CAPTCHAs) for anyone not used to an alphabetic language.

David April 11, 2006 8:38 AM


How do you see this system in the light of Banks requiring two-factor authentication?

Many Financial institutions are going to implement similar systems (pictures, not just kittens) for the second of the two-factors to meet the FFIEC guidelines.

Is this realistic, or just more BS that will still allow the phishers and other thieves to steal customers money??


snowball April 11, 2006 9:13 AM

It’s also possible to use a collection of PERSONAL photographs — you just have to identify which photographs are YOURS, and which belong to somebody else.

“Photographic Authentication through Untrusted Terminals” — Paper abstract: “Photographic authentication is a technique for logging into untrusted public Internet access terminals. It leverages a person’s ability to recognize personal photographs by asking users to identify their own personal photographs from a set of randomized images. By changing the specific images shown on each login attempt, this technique is resilient to replay attacks, which are when an “overheard” login sequence is replayed verbatim to unscrupulously gain access to a system. A prototype implementation and corresponding user-tests show that not only are participants extremely adept at quickly and accurately recognizing their own photographs, but attackers can’t reliably determine which photographs are “correct” even when given samples of a user’s photographs.”

Sorry, no URL — just search for the paper title…

JakeS April 11, 2006 9:20 AM

“I object to the very concept of CAPTCHAs because they violate the end-to-end paradigm. You don’t have the right to know whether I am using a robot on the Net or not as long as I don’t make trouble for other users.”

No?  Well, this is the Internet, bubele.  There aren’t any rights here, only freedom.  You don’t have the right to use my site.  I’m free to choose to block it to trolls using bots, and you’re free to go elsewhere if you don’t like that.

If you can show me a legitimate reason for using a bot to post on a blog or register for a service, I’ll listen with interest and might even change my mind.

radiantmatrix April 11, 2006 9:42 AM

This isn’t “like a CAPTCHA”, it is a CAPTCHA – a problem that is easy for a human to solve, but difficult for a computer. It’s not even a very good CAPTCHA, for all the reasons Bruce mentions.

And it’s not a new idea to use images for it, either, see PIX:

The above works by showing several images, then asking the user to pick a word from a list that relates to all the items. has other examples of CAPTCHA systems as well.

Matthew Skala April 11, 2006 10:22 AM

“No? Well, this is the Internet, bubele. There aren’t any rights here, only freedom. You don’t have the right to use my site. I’m free to choose to block it to trolls using bots, and you’re free to go elsewhere if you don’t like that.”

If that were the end of it, it’d be fine. I’d still think you were a jerk for using CAPTCHAs, but I’d be free to do what it took to get around them, and it would all even out. The problem is when people think they have the right to put legal force behind a “no robots” requirement – for instance, thinking they have a right to claim damages against users who use scraping software or make “deep links”.

Automated Web log comments aren’t the important application; the more important application is sites like As a holder of Canadian securities, I most definitely do have a right to read the information on that site, and I assert that that includes reading it through an automated program instead of directly in a conventional browser. Nonetheless, they’ve got it all behind CAPTCHA requirements, apparently in order to enforce a database copyright that doesn’t exist under Canadian law.

Rich April 11, 2006 10:45 AM

@Fred Page

On the ability of computers to answer text questions…

I could be wrong. Certainly developing the questions would be difficult. I just think pattern recognition, in particular when the image set is on the order of 100, is an easier problem than finding the answer to a wide variety of questions. The questions would have to be designed so that they’re not googleable. That is, not looking up facts, but- hm, how’s this for an example: “What is bigger, a dog or a cat”. Yes, I realize that’s just a boolean so it’s brute-forceable, but I think that would be a very difficult question for a computer to answer. Yes, you could code it to parse that particular ‘style’ of question, and do lookups for dogs and cats, but what if another question was “when do you need sunglasses, noon or midnight”. Again a boolean, but a google sure isn’t going to help.

Christoph Zurnieden April 11, 2006 11:27 AM

The described implementation is a bad kind of a CAPTCHA, but the idea with the pictures is not bad.
One of the ways to practice memory is to replace the objects to memorize with imaginative pictures and build a story with them. For example:
The password “sI7n#,Q`9A<vY64d)g$xCh 20>@” is a good one, but hard to memorize. If you replace the letters, numbers and signs with pictures, it might be something like “dog, cat, chair, blue, stairway …” and the story might be “a dog and a cat sit on a chair under a blue stairway …”.
All pictures can be publicly known, because they are just a 1:1 mapping of the signs which are also publicly known. The mapping pictures->signs can be done with a bit of Javascript, so all of the secrecy happens at the client side, the rest is done with a standard procedure like HMAC or alike.
It can degrade gracefully to standard HTTP-Auth for the visually impaired.
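
A rough sketch of how the pieces could fit together, in Python rather than Javascript for brevity. The five-entry mapping is taken from the start of the example above (dog, cat, chair, blue, stairway standing for s, I, 7, n, #); everything else, including the function names, the nonce-based challenge, and the server keeping the password itself, is a simplifying assumption and not a worked-out protocol.

```python
import hashlib
import hmac
import secrets
from typing import List

# Public 1:1 mapping from pictures to signs (only the first five of the example).
PICTURE_TO_SIGN = {"dog": "s", "cat": "I", "chair": "7", "blue": "n", "stairway": "#"}


def pictures_to_password(pictures: List[str]) -> str:
    """Client side: turn the remembered picture story back into the password."""
    return "".join(PICTURE_TO_SIGN[p] for p in pictures)


def respond(password: str, challenge: bytes) -> str:
    """Client side: prove knowledge of the password without sending it."""
    return hmac.new(password.encode("utf-8"), challenge, hashlib.sha256).hexdigest()


def verify(stored_password: str, challenge: bytes, response: str) -> bool:
    """Server side: recompute the HMAC and compare in constant time."""
    return hmac.compare_digest(respond(stored_password, challenge), response)


challenge = secrets.token_bytes(16)  # fresh random value per login attempt
story = ["dog", "cat", "chair", "blue", "stairway"]
print(verify("sI7n#", challenge, respond(pictures_to_password(story), challenge)))
```

The mapping adds no secrecy of its own; the secret is still the order of the pictures, which is exactly as strong as the password they encode.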

This idea seems obvious to me, so there is a good chance that somebody has implemented it already, but I do not know of any implementation. It would be nice if somebody has a link, or has even done it themselves and would like to compare notes.

The only benefit seems to be for the illiterate, e.g. children or people used to other kinds of alphabets (I for one must admit that I’m not very familiar with alphabets other than Latin and Greek).


james April 11, 2006 11:58 AM

@Juergen …

Yet another idea to make sure blind people can’t access a website…

Yes, that’s right, it’s a big conspiracy to keep blind people from accessing web sites. The world revolves around you. You are the center of the universe. Everyone is out to get you …

Oh, wait, no … maybe people are trying to prevent the spammers from accessing their site and posting all sorts of garbage.

Comments like this are the poster-child for why the ADA should NEVER have been passed. Those pushing ADA don’t want equality, they want to be better than everyone else by having everyone cater to their demands.

You are blind, I am fat, we both have to live with it, not make everyone else go to huge efforts to accommodate us.

Carl Witty April 11, 2006 12:02 PM


I decided to try to google for the answer to “when do you need sunglasses, noon or midnight”. I did google searches for:

sunglasses noon


sunglasses midnight

The first search returned about 785,000 hits; the second returned about 1,970,000. Google has spoken: sunglasses are far more closely associated with midnight than with noon.

jmr April 11, 2006 1:12 PM

The problem with many of the approaches described here is the database problem. As soon as the answer to a particular question is known, the answer can be catalogued in a database.

The whole point of a CAPTCHA is to prevent the cataloguing of answers. For the text-answer problems, somehow you would have to have a computer generate new questions automatically, and know the answers to the questions. This seems to be at least as hard a problem as making a computer answer such questions.

Rich April 11, 2006 3:04 PM


We are veering way off topic here, but I just can’t let your anti-ADA comments slide. It is only through sheer laziness that web sites aren’t blind accessible where it makes sense. It doesn’t make sense on google maps. It does make sense on A UC Berkeley student worked with Target for a year to get their site accessible, and they finally said it was too much work and refused. So he filed a lawsuit. The very next day they added ‘alt’ tags to all their images, but they still use an image map for their checkout button, so you can tell it’s a checkout button by the alt tag, but you can’t actually click on it without a mouse.

There are several intersections where I live that have controlled signals. The loop detectors are not marked, so I have to search for them with my bicycle to get them to trip. I don’t want anything special- I just want to be able to go through a green light like anyone else. I paid for that green light with my taxes like everyone else.

Blind people pay taxes like everyone else and should be able to use public services like everyone else.

Yes, Target is a private company, but they sure claim to be inclusive. And Bruce’s blog is private, but it sure is nice that blind persons’ opinions on security, squid and everything related are welcomed.

ok, sorry Bruce, I’ll get off the soapbox now.

Coda Hale April 11, 2006 9:28 PM

James, any web design philosophy which considers utility and accessibility to be mutually exclusive is antique. There is no good reason for websites which alienate users with vision impairments, and many good reasons for not doing so. As a web developer, the choice is easily made, and without recourse to political kneejerk comments. Alt attributes and non-image navigation are common sense at this point, not a burden to be suffered at the behest of blind mafia toughs.

Likewise, there is a wealth of evidence to suggest that visual CAPTCHAs are ineffective against anything but the most dull-witted attacker. So why use one? It doesn’t solve the problem, and there are better solutions available which catch more spam and alienate fewer users.

If non-sighted users were demanding that people not adopt good solutions, you’d have my sympathy, but the fact is they’re asking people not use bad solutions which also alienate them. Despite your political fuming, this is a complete non-issue.

another_bruce April 12, 2006 11:07 AM

i trained my cat to click on the kittens by rewarding it with ground-up robins for correct choices.

Deapesh Misra April 12, 2006 11:36 AM

I agree with Eric and Urs. A CAPTCHA requires that its database be public and large.

With this Kitten idea, on the developers’ side, there would be a need for a large number of kitten photographs (in varying poses and places) and there would be a need for photographs which are not of a kitten, but are close enough to be confusing. Then somebody has to manually label all these photographs.
After all this effort, if someone got access to this database, the security of the whole scheme would collapse!

(There has been a nice idea for the labeling problem: “The ESP game”.)

The other problem with such an idea (which I am also currently facing) is that it is possible for an automated script to randomly guess and be successful. The answer space gets terribly restricted with mouse clicks.
I have been working on this and also have a possible solution in my paper: “Face Recognition CAPTCHAs”. I am currently working on making it robust to random guessing attacks, which have been rightly called “no-effort attacks” by other researchers in this area.
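
To make the random-guessing concern concrete, here is a small sketch of the odds for a click-based test in which the user must pick exactly the right targets out of a grid. The function names are illustrative, and it assumes a blind guesser that picks a uniformly random set of images on every attempt. With nine images and three kittens it reproduces the 1-in-84 figure from the post.

```python
from math import ceil, comb, log

def guess_space(num_images: int, num_targets: int) -> int:
    """Number of distinct target sets; a blind guess succeeds 1 time in this many."""
    return comb(num_images, num_targets)

def attempts_for_confidence(num_images: int, num_targets: int, confidence: float = 0.5) -> int:
    """Smallest number of independent random guesses whose combined
    success probability reaches `confidence`."""
    p = 1 / guess_space(num_images, num_targets)
    return ceil(log(1 - confidence) / log(1 - p))

if __name__ == "__main__":
    print("9 images, 3 kittens:", guess_space(9, 3), "possible answers")
    print("16 images, 4 kittens:", guess_space(16, 4), "possible answers")
    print("guesses for a 50% chance (9/3):", attempts_for_confidence(9, 3))
```

Enlarging the grid grows the answer space quickly, but it also makes the test slower for people, which is the security versus usability trade-off described in this comment.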

It seems to me that this boils down to a security vs. usability problem. The more secure it is, the tougher it needs to be; the more usable it is, the less secure it is. But I would love to prove myself wrong on this! 🙂

Oli April 12, 2006 12:05 PM

Hi there everyone. Hi bruce.

My, my.. What a lot of comments my kittens have caused =)

Ok I’ve been working on a better version (which is quite buggy at the moment) which rotates the images. The link is available on my front page.

I’m going to be throwing a lot into this and there are going to be several different trunks for what people want to do.

The aim of this was primarily to be a fun way for people to verify their existence without having to buy a copy of the Rosetta Stone to translate it into real letters. I have taken on a lot of comments so far, and as I can’t say when I’ll get a chance to read all the comments, if you desperately want something to make it into a future version of this, please please make a comment on my page and I’ll try to address it.

Please base your comments (excluding obvious bugs) on the latest PHP version and not the aspnet version running on my site.

I’ve also updated the link given in the page that this article leads to.

If you’re too impatient for that drop me a line at oli–there-should-be-a-@-here–

I’ll be writing a sequel to the original article in a couple of days outlining the specs of a much stronger version that addresses the comments given so far.

And if you’ve got a spare $ in a paypal account that’s rotting away, throw it in my direction and I’ll feed the kittens =)

Oli April 12, 2006 12:08 PM

Should also mention that it plays with the gamma of the images (dynamically).

Bruce, if you’ve got anything to add that others haven’t said, please ping me an email.

Dan April 12, 2006 1:07 PM

A few comments, Oli. Rotating the images is useless if a computer can determine how to un-rotate them. Then it just becomes a trivial image-compare problem. You have to make sure to at minimum crop the obvious borders.

Secondly, if it is still serving up the same ‘slots’ for every refresh, a harvesting attack will trivially catalog your database into group A and group B, and since it knows three-of-nine are kittens, the problem is solved.
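
One way such a harvesting attack could look is sketched below. This is an illustration, not the attack Dan had in mind in detail: it assumes the server sends byte-identical image files on every refresh, that the attacker learns the correct answer for at least some rounds (for example from a human solving them once), and that exactly three of nine images are kittens. The class and method names are hypothetical.

```python
import hashlib

class HarvestCatalog:
    """Remembers which image files were kittens in previously solved rounds."""

    def __init__(self):
        self.labels = {}  # sha256 digest -> True (kitten) or False (not a kitten)

    @staticmethod
    def fingerprint(image_bytes: bytes) -> str:
        # Exact hashing only works if the server serves identical bytes each time.
        return hashlib.sha256(image_bytes).hexdigest()

    def record_solved_round(self, images, kitten_indexes):
        """Store the label of every image in a round whose answer is known."""
        kittens = set(kitten_indexes)
        for i, data in enumerate(images):
            self.labels[self.fingerprint(data)] = i in kittens

    def solve(self, images, needed=3):
        """Return the kitten positions if the catalog determines them, else None."""
        known_kittens, unknown = [], []
        for i, data in enumerate(images):
            label = self.labels.get(self.fingerprint(data))
            if label is True:
                known_kittens.append(i)
            elif label is None:
                unknown.append(i)
        if len(known_kittens) == needed:
            return known_kittens
        # If every unlabeled image must be a kitten to reach the quota, deduce it.
        if len(known_kittens) + len(unknown) == needed:
            return sorted(known_kittens + unknown)
        return None
```

Because each solved round labels all nine images at once, a small fixed database is fully catalogued after relatively few rounds.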

Thirdly, if I normalize the brightness on the images I can likely use something as simple as a histogram to ‘recognize’ images. It’s going to take a lot of distortion to make it difficult for a machine to recognize, and it’s already difficult for humans due to size and graininess.
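
A minimal sketch of that histogram idea, under stated assumptions: images are represented as flat lists of grayscale values from 0 to 255, brightness is normalized with a simple min-max stretch, and similarity is the L1 distance between 16-bin histograms. All of these choices are illustrative. A linear stretch exactly undoes linear brightness or contrast changes; a gamma change is nonlinear, so histograms of gamma-adjusted copies would only match approximately.

```python
def normalize_brightness(pixels):
    """Stretch grayscale values so the darkest pixel maps to 0 and the brightest to 255."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return [0] * len(pixels)  # flat image: nothing to stretch
    return [round((p - lo) * 255 / (hi - lo)) for p in pixels]

def histogram(pixels, bins=16):
    """Fraction of pixels falling into each of `bins` equal-width intensity bins."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    return [c / len(pixels) for c in counts]

def histogram_distance(a, b, bins=16):
    """L1 distance between brightness-normalized histograms (0 means identical)."""
    ha = histogram(normalize_brightness(a), bins)
    hb = histogram(normalize_brightness(b), bins)
    return sum(abs(x - y) for x, y in zip(ha, hb))

def best_match(query, catalog):
    """Name of the catalogued image whose histogram is closest to the query."""
    return min(catalog, key=lambda name: histogram_distance(query, catalog[name]))
```

Given a catalog mapping names to known pixel lists, `best_match(query, catalog)` returns the closest known image, which is the kind of trivial image compare that dynamic brightness changes alone do not prevent.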

Pat Cahalan April 12, 2006 3:07 PM

@ Dan

Cropping the image by a few pixels and rotating it will cause a lot of problems for image recognition programs.

There’s a reason your Aibo can’t bring you a beer. It’s very very difficult to have a robot/computer analyze a visual representation, especially in multiple dimensions.

raphael April 13, 2006 6:39 AM

The idea is interesting but, as previously stated, I’m not sure that it would be so difficult for computer vision software to recognize images of kittens. Recent techniques in the automatic image classification/object recognition community are quite robust to orientation, scale, and illumination changes as well as occlusion and cluttered backgrounds. Papers in recent conferences like CVPR 2005 or ICCV 2005 show nice systems doing the automatic classification of horses, cows, dogs, … with quite good recognition rates.

So I could imagine a system that:

  • first, manually collects a lot of images of kittens and images of non-kittens from Google/Flickr, and optionally generates randomly cropped/rotated versions of these with image processing techniques (there are also recent scientific papers on automatically filtering Google results);
  • learns an automatic image classification model (for example, with the PiXiT software by Marée et al.);
  • interfaces it with a web proxy (like RabbIT) that sends each image to the classification model software and gets back whether it is a kitten or not, with a certain score. This score could help to pick images of kittens automatically.
  • this could be useful for blind people too 😉
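
The last step of that pipeline, turning per-image scores into clicks, could be sketched as follows. The classifier itself is not shown; the sketch assumes some model has already returned a kitten score for each of the nine images, and the function names and the use of a score margin as a confidence signal are illustrative assumptions.

```python
def pick_kittens(scores, needed=3):
    """Return the positions of the `needed` highest-scoring images, in grid order."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:needed])

def selection_margin(scores, needed=3):
    """Gap between the weakest chosen score and the strongest rejected one.
    A small gap suggests the classifier is unsure about this round."""
    ordered = sorted(scores, reverse=True)
    return ordered[needed - 1] - ordered[needed]

if __name__ == "__main__":
    # Hypothetical classifier output for a 3x3 grid.
    scores = [0.10, 0.90, 0.20, 0.80, 0.05, 0.30, 0.70, 0.15, 0.25]
    print("click positions:", pick_kittens(scores))
    print("margin:", round(selection_margin(scores), 2))
```

Because the test guarantees exactly three kittens per grid, the attacker only needs the ranking to be right, not the absolute scores, so even a mediocre classifier can do much better than random guessing.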

if you made it this far ... April 13, 2006 11:09 AM

I like it, because it increases diversity of auth schemes. This increases the work for the baddies, the more schemes there are, the more work for them to crack them and keep track of the cracks.

Remember that the cost vs payoff is part of the decision of whether to attack a target. This scheme is less well known, and so crackers will aim first at widely used schemes, such as the captcha scheme, which someone here already described a brilliant crack for (re-using the captcha image to guard some porn page, and using the human who wants the porn to give you the answer to the captcha .. wash rinse repeat…)

And most importantly, this is simple to implement, and can be customised so that for your particular site, the cracker needs to customise their crack.

When the auth scheme is broken, you haven’t invested much time in developing it, so develop another.

if you made it this far April 13, 2006 11:13 AM

and there is a payoff if anyone does crack it properly.

Auth schemes on websites are generally to protect from mass cracking. You normally don’t mind if one or two get through – if need be, their spam can be cleaned up by hand.

But if 100 are spamming you, it shuts down the site.

In this case, if the crack is widely distributed, it means someone has published a free and accurate (and usually open source) computer vision app 😀 — crackers rarely apply valid licensing schemes to their software .. although I guess there’s the slim chance they pirate an existing computer vision implementation…

Rod Divilbiss April 13, 2006 4:20 PM

No CAPTCHA will be successful at stopping a determined hacker who knows how to write a good program. ANY CAPTCHA has the potential to lock out legitimate users or simply annoy people when it is made increasingly complicated to attempt to stay ahead of the determined hackers.

I’m sick of the increasingly complicated CAPTCHAs which attempt to stay ahead of the hackers by distorting characters. Humans are very good at recognizing complex patterns such as kittens. Programs find this difficult.

So we choose characters, which are easy to recognize by a program and attempt to defeat the program by distorting them beyond human recognition.

Give me the kittens. 99% of the CAPTCHAs in use could be a simpler method such as this and be much less annoying to people.

Images, well chosen, will be easier for color blind and visually impaired users to recognize, although nearly any visual CAPTCHA still penalizes the completely blind user.

The article incorrectly implies ASP cannot stream images for this purpose.

Here is an example, slightly improved with more image choices using pure ASP with no third-party components.

Ryan FB April 13, 2006 11:23 PM

I had only seen this CAPTCHA mentioned one other place and didn’t realize there was so much discussion about it, but I was able to defeat this in around an hour over the weekend. With exams and all I hadn’t had a chance to post anything about it, but you can read about it now:

Based on what Oli has said about the next version so far, this attack would still work against it.

Oli July 8, 2006 8:39 PM

That last comment from “Trisha” (the bot) is just a reminder of how important human-authentication really is… Such a pity.

