Analyzing CAPTCHAs
New research: “Attacks and Design of Image Recognition CAPTCHAs.”
Abstract. We systematically study the design of image recognition CAPTCHAs (IRCs) in this paper. We first review and examine all IRCs schemes known to us and evaluate each scheme against the practical requirements in CAPTCHA applications, particularly in large-scale real-life applications such as Gmail and Hotmail. Then we present a security analysis of the representative schemes we have identified. For the schemes that remain unbroken, we present our novel attacks. For the schemes for which known attacks are available, we propose a theoretical explanation why those schemes have failed. Next, we provide a simple but novel framework for guiding the design of robust IRCs. Then we propose an innovative IRC called Cortcha that is scalable to meet the requirements of large-scale applications. Cortcha relies on recognizing an object by exploiting its surrounding context, a task that humans can perform well but computers cannot. An infinite number of types of objects can be used to generate challenges, which can effectively disable the learning process in machine learning attacks. Cortcha does not require the images in its image database to be labeled. Image collection and CAPTCHA generation can be fully automated. Our usability studies indicate that, compared with Google’s text CAPTCHA, Cortcha yields a slightly higher human accuracy rate but on average takes more time to solve a challenge.
The paper attacks IMAGINATION (designed at Penn State around 2005) and ARTiFACIAL (designed at MSR Redmond around 2004).
Clive Robinson • October 5, 2010 8:02 AM
I’m glad the authors of the paper realise how CAPTCHAs actually work, as
“a task that humans can perform well but computers cannot”
It has, however, two issues,
1, Computers are getting, and will keep getting, better at these “human” tasks.
2, Attackers have avoided the wait on 1 by employing humans to do the job.
It is the second issue that needs to be fixed or accounted for in any new design (which they don’t appear to have addressed).
The real solution in this area is to make these systems context sensitive not just with respect to the image but also with regard to the individual, if they are to have a security value above 30 cents (the current going rate for Chinese and other “CAPTCHA droids”).
Yes, there are ways this can be done, but all those that are currently known “academically” need some kind of centralised authority in one form or another, which precludes anonymous activities.