Schneier on Security
A blog covering security and security technology.
« Diamond Swallowing as a Ruse |
| Analysis of PIN Data »
September 19, 2012
Recent Developments in Password Cracking
A recent Ars Technica article made the point that password crackers are getting better, and therefore passwords are getting weaker. It's not just computing speed; we now have many databases of actual passwords we can use to create dictionaries of common passwords, or common password-generation techniques. (Example: dictionary word plus a single digit.)
This really isn't anything new. I wrote about it in 2007. Even so, the article has caused a bit of a stir since it was published. I didn't blog about it then, because I was waiting for Joe Bonneau to comment. He has, in a two-part blog post that's well worth reading.
Password cracking can be evaluated on two nearly independent axes: power (the ability to check a large number of guesses quickly and cheaply using optimized software, GPUs, FPGAs, and so on) and efficiency (the ability to generate large lists of candidate passwords accurately ranked by real-world likelihood using sophisticated models). It's relatively simple to measure cracking power in units of hashes evaluated per second or hashes per second per unit cost. There are details to account for, like the complexity of the hash being evaluated, but this problem is generally similar to cryptographic brute force against unknown (random) keys and power is generally increasing exponentially in tune with Moore's law. The move to hardware-based cracking has enabled well-documented orders-of-magnitude speedups.
Cracking efficiency, by contrast, is rarely measured well.
Finally, there are two basic schemes for choosing secure passwords: the Schneier scheme and the XKCD scheme.
Posted on September 19, 2012 at 4:41 AM
• 57 Comments
To receive these entries once a month by e-mail, sign up for the Crypto-Gram Newsletter.
For me (disclaimer: I am a pen-tester for the Fortune 10), I welcome any advancements in offensive-security. This is because offensive-security finds problems that can be fixed. Anything else is security masturbation.
The serious problem with passwords that I typically encounter is the use of a "first-day password" such as "changeme" or "changeme123". While I agree with Joe Bonneau's chess-style (e.g. "time" and "space" strategic advantages) oversimplification of the problem with passwords, I definitely suggest that the problem is much larger than a two-axis one. Operational factors such as "first-day passwords" can exacerbate either of these issues.
Weighted password-cracking lists and custom weighted password lists have been around since passwords existed. The only reason we have more research into these weighted lists recently is because of many of the high-profile websites that have had their unencrypted password databases published to the public by adversaries.
No, this really isn't anything new, and there have been articles and even books about this before as well.
Kudos to Dan Goodin for referring to Cuckoo's Egg by Cliff Stoll, a book that should be mandatory reading for any infosecurity professional even today.
Josephn Bonneau had some good and interesting points in his blog posts, and I really hope he will continue to do research and publish within this area.
To those who find passwords more interesting than the average person, we're going to discuss nothing but passwords & PINs at Passwords^12 in Oslo, december 3-5. Atom, author of password cracking tool 'hashcat' will reveal a flaw in SHA1 enabling him to crack SHA1 hashes 15-20% faster than anyone else currently does.
[blush]Self-publicity:[/blush] storing hashes of passwords is no longer an honest option (as is well known).
Take a look a the Sibyl algorithm & implementation if you want how to RSA-secure your password hashes.
Also, a lot of sites won't allow the XKCD approach: an ISP that will reject any password that contains what looks to it like a real word, including games with capitalization or replacing letters with digits. Mine turned down a password that started with "Pl4c0derm" because it's dictionary contains "placoderm." Others have explicit or effective length limits: something like "correcthorsebatterystaple" might be a good password, but "correcth" isn't.
Rejecting them outright and pushing people to the Schneier approach is reasonable; pretending to accept them and only storing or testing against the first 8 or 10 characters is not. Which everyone here knows, but some of the people controlling security don't.
My understanding is that XKCD's famous password advice - basically 4 or so unmodified dictionary words back to back - is now broken as it assumes brute forcing is done character by character.
In fact, as the excellent Ars article makes clear, BF tools are increasingly using pattern logic ("hybrid") which looks at common patterns like one or more dictionary words (along with things like numbers at end, capitalised first letter, common substitutions) before falling back to char by char as last resort.
Thus a string of unmodified dictionary words is going to fall reasonably quickly. The XKCD approach also has undesired effect of limiting the search space to alpha chars only- even sticking some numeric / char separators would help defeat pattern matching.
Of course, have to keep in mind the XKCD strip was also written for humourous effect :) That still stands even if the advice is arguably shaky due to advancements.
Don't most systems restrict innately the number of tries you can have for guessing the password? doesn't matter how many zillion tries can your computer make, if the system shuts down after 5 wrong tries...
What kind of passwords can you brute force nowadays, beside zip archives in your hd? :)
Neogic: The XKCD calculation appears to be based on the assumption that there are about 2000 common words which have been randomly selected from. Assuming that the attacker knows which these are (not particularly unreasonable), and that the user has really chosen at random (much dodgier), then the advice is not broken.
I use diceware to secure the second assumption.
The "XKCD method", as implemented on diceware.com, assumes the attacker knows the dictionary and knows that this method was used to generate the password, and is intended to be secure despite this.
Did anyone else subscribing to the Schneier RSS feed get a "Friday Squid Blogging: Octonaut" message... yesterday (Tuesday)?
Why are web sites still using hashing algorithms, which are designed to be fast? Wouldn't it be much better to use a slow algorithm together with an obscure password? Much better than investing in two-factor authentication.
The XKCD assumptions strike me as wildly unrealistic although the advice, if followed, would probably result in a dramatic increase in entropy over what most people actually do now.
Using HashCat and GPU to crack Sony salted SHA1 hashed password database:
Whitepixel 8 GPU hash cracking machine:
"Why are web sites still using hashing algorithms, which are designed to be fast? "
Good question. Web sites should be using key derivation functions or some type of strengthening that dramatically slows down the speed of an attack on a stolen password database. But most don't because most of the the people that build web sites understand design and not security. It's actually a wonder that they even bother to hash them.
Is using SRPbetter than storing bcrypt-ed (or scrypt-ed or PBKDF2-ed) passwords? Why?
Gibson's method mentioned above is I think the next good approach. Using just the first letters from a sentence is probably less secure, unless the sentence is very obscure--some cracker now try out common phrases, see http://www.lightbluetouchpaper.org/2011/11/08/... for an example. Length appears to be the primary factor, and a combination of sentence letters and padding to 15 characters should yield a strong and easy to remember password. I also recommend use of a vault so you don't have to remember passwords to banking sites, etc. 1Password, for example, does use repetitive hashing to slow down brute force should the vault fall into the wrong hands.
Web sites use hashing algorithms, which are designed to be fast, so they *can* be fast.
If they need a rate-limiting feature, they would rather design that in as a sleep, during which they could serve other web pages rather than expensively hashing 'password1'
If the password or password hash database is stolen, it isn't the strength of the user's passwords that is the issue.
@Kent: "Why are web sites still using hashing algorithms, which are designed to be fast?"
It doesn't matter how fast the has function is, since you can make the actual algorithm as slow as you want by iterating it until you have achieved the desired duration.
Personally, I like to use a combination of words from long-dead or obscure languages and numbers. The words won't be in any dictionary that a cracker is likely to use (Sanskrit or Mayan anyone?), which isn't to say that after reading this some enterprising cracker won't go to the trouble of creating Sanskrit and Mayan (or Australian Aborigine, Navajo, Chippewa,... ) dictionaries. :-)
Yes, if you're going to list password schemes, you should include the Gibson scheme too.
@curioustom they don't brute force login boxes on webpages... what is being bruteforced are password hashes stolen from databases. Each password will have a unique hash, if you you can keep generating and hashing passwords until you find a matching hash - you know what the actual password is. As for XKCD algorithm, now that someone has a reason to try it - all they need is a dictionary based, multiple words separated by spaces, attack to effectively turn this into a 4-character password... Besides, very few websites will let you choose a password like that - it will be too long and/or not secure enough.
curioustom: You're correct, many systems will lockout after a number of bad login attempts, or will progressively delay the logins, making a brute-force attack against the front door impractical. That's not the real problem, though: what we're increasingly seeing are instances where the login credentials are being captured, either at low volumes (traffic interception, network monitoring, etc.) or high volumes (user credential database stolen by exploit, insider attack, etc.)
In these cases the attacker can mount an offline brute-force attack against those credentials to try to decrypt them at his leisure, bringing to bear all the computing power at his disposal.
@Alex W: It's not even close to "effectively a 4-character password." As was mentioned earlier, the math was based on 2000 common words in the available pool (actual number in some implementations can vary). If your keyboard has access to 2000 printable characters, I'd like to see a picture of it.
This implementation uses a pool of 1949 words. Since there are four words, and they're chosen independently from each other, there are 19494 = 14,429,369,557,201 possible passwords, which is slightly over 43 bits of security. (To get 44 bits you'd need to bump up to a 2048-word pool.) A computer brute-forcing at 100,000 passwords per second would take 4.5 years to crack this, which is more than secure enough for most users' needs.
If it's not secure enough for you, you can add a digit to the end (adds about 3.3 bits of security), randomly capitalize the first letters of the words (adds 4 bits), or just add a fifth word (adds almost 11 bits). Of course, you actually do have to have random words, generated by a computer. The first four words that pop into your head are very low entropy.
The best option, of course, is to use a program like Password Safe or KeePass to generate and store truly random passwords, and secure that with an XKCD-style password; that way you only have to remember one password while still using a different password for every site.
Our team at Carnegie Mellon had a fairly recent paper on the usability of xkcd-like schemes, where passphrases are assigned (rather than self-selected). Long story short, the results we got were disappointing (i.e., it is not very usable) and indicate this type of scheme is very far from a panacea.
R. Shay, P. Kelley, S. Komanduri, M. Mazurek, B. Ur, T. Vidas, L. Bauer, N. Christin and L. Cranor. Correct horse battery staple: Exploring the usability of system-assigned passphrases. In Proc. of the 8th Symposium on Usable Privacy and Security (SOUPS'12). Washington, DC. July 2012.
@curioustom Yes, many systems limit the number of attempts at a login.
One major insecurity occurs when a hacker gets access to the back end infrastructure at a site where all tyhepasswords can be pilfered from one place, even though they are hashed. An intruder can work offline to break the password and match the hash, and then gain authenticated access to the account, OR log into the users account at other sites because the user reused their password elsewhere.
I generally try to keep my passwords hard-to-guess, but as a result I'm frequently thwarted by the 3-fails-and-we-lock-your-account security feature. I often don't remember my passwords for daily use; I just type them using "muscle memory". So when I'm in a hurry, I tend to mistype them, and after 3 failed attempts, I'm locked out.
The thing that puzzles me: why 3? Seems like such an arbitrary number. Statistically-speaking, a password-guesser doesn't have that much better chance of guessing my password in 100 guesses than it does in 3. If it locked me out after 100 failed attempts, it wouldn't be any less secure. So how was 3 originally determined to be enough tries for a person to get their password right? Was it even devised intelligently, or was it selected because it "just seemed natural?" Like 3 wishes? 3 blind mice? 3 strikes and you're out? This isn't baseball.
@boog Usually sites either implement a scaling delay (i.e. first failure requires you to wait one second, next requires two seconds, next four, and so on) or require a captcha after three failures.
Ha, you know your pw is good when you lock yourself out of your acct.
Never underestimate the power of 3. :)
The issue with any of those password schemes is that they do not encourage users to use a unique password for each site. I personally use something similar to the "Schneier scheme" mentioned, but also incorporate information about the site to make it unique. As a simple example, a password for a banking site might have a '$' inserted in a specific position. A password for a Gmail account could have "gm" appended and "ail" prepended.
The "Schneier scheme" is incredibly useful, however, in providing answers to security questions that couldn't be found with a quick 10-minute search and look at my Facebook page. Something like "What was your first car?" could result in an answer of mFcWaC1995hA - "my first car was a crappy 1995 Honda Accord".
Note that neither of these are the exact scheme that I use, as it would be silly to give that away.
Sadly, my experience indicates otherwise, at least for things that matter (banks, credit cards, student loans, utilities, etc.). At my job, even Active Directory will lock me out after 3 failed attempts.
I like to secure my accounts from myself. Otherwise I might successfully log in and be forced to engage in such excruciating activities as paying bills or doing work.
The simplest way to guarantee that passwords can't be cracked is to generate them randomly. Granted, this doesn't help when the back end quietly truncates to eight characters without notice, but there's nothing one can do about that aspect of things.
And no, I don't remember all these passwords. My password manager does that for me. I have to remember only one of them.
The big advantage to this, as opposed to an algorithm relating to site information, is that recovering from a compromised password is "easy". Generate new password, update password, done. Using an algorithm by site, on the other hand, is equivalent to building a new cipher for each message.
Obviously Moore's (misstated) Law will soon catch up even with 32-character passwords, but for the next few years at least it SHOULD be enough.
Just out of curiosity, does anyone know of any accessible (including being in English) reports on how Chinese computer users are with passwords?
With each character being a word, I would imagine that something akin to the XKCD password algorithm would work well for them, but would also be less secure if the most common characters are constantly chosen.
I'm guessing here, but wouldn't deliberate misspellings be very difficult to achieve?
What effects would all this have on password cracking?
Was wondering what you thought about LastPass. I like it since my passwords sync to all of my devices (they send everything in encrypted format and hash with a strong algorithm as well), but I know that this poses risks.
Still, I think that the security gained by having a different randomized long password for every web service that I use outweighs the risks of my hashed passwords being stored on LastPass' servers.
I'm not sure if you've looked at them carefully or not, but really want to know what you think.
Great educational post. I guess the "basic two ways of choosing secure passwords" should be mandatorily teached in primary school!
Owing to the conviction power of graphs and metaphors, it is XKCD's receipt that got my vote, but I am attracted by the elegance of yours just as much.
Thanks for your time and explanations!
@Paeniteo: Exactly, and since 'adding iterations' can be done on a system with an existing password database without changing the passwords, why don't designers pick a target processing time for their password hashing algorithm, and add iterations every so often when they upgrade their servers? This upgrade path would also allow an easily applied temporary patch if minor weaknesses were found in their hash algorithm.
If nothing else, it extends the time between DB compromise and attacks using cracked passwords, at least for users who made adequate password choices.
Thx Alex, cbarn and Feedback for a good explanations!
> Granted, this doesn't help when the
> back end quietly truncates to eight
> characters without notice
People still use Crypt? Are there really that many Solaris 7 boxes still around?
I did fwd that Ars article to some of my colleagues, but the reaction was generally, meh too long...
What about this scheme (a bit like salting), you have just one strong password, say 'abcdefg' (NOT real example!)
And then for each site you set your password as follows (say for Gmail):
I'm a bit paranoid I guess as I don't trust Passwd Safe/ Keepass/ Lastpass etc.
Just don't use Toki Pona
Looks like yet another Rule-Of-Three example
From my personal experience (with Chinese and other people) they tend to use standard character set (not every system can handle non-ASCII symbols correctly), however dictionary is somewhat larger due to romanization and "wrong keyboard layout" technique
You can compile Keepass yourself... BUT then you have to trust your compiler.
@Neogic No, XKCD advice doesn't assume char by char attack. It assumes dictionary attack. My own calculus gives similar results to Randy's.
If you have a set of 1000 common words (he assumes 2000, but it sounds too high for me), using 4 of them would mean 100^4=10^12 options, that would give you 40 bits of entropy (44 according to Randy).
If you have a set of 10.000 uncommon words, and 8 common variations to it, plus two truly random characters, you'll have 8x10.000x64^2=327.680.000 options (about 28 bits of entropy).
"You can compile Keepass yourself... BUT then you have to trust your compiler"
Well first I have to trust the source code.. Or have the ability & time to read thru the entire source for bugs/ trapdoors...
That reminds me... it's time to order a Yubikey.
I ordered a Yubikey a couple of months ago. I'm still experimenting with it but definitely worth the $25.
Google Authenticator is another cheap way to get password + OTP on Google and other services, like LastPass.
No, the XKCD scheme IS broken. You said it yourself: assume 2000 words. That's approximately 11 bits. Now combine four of those, randomly selecting from the 2000 words. Okay, you're up to a maximum of about 44 bits of entropy. That's not much, anyone with the list could easily brute-force that. Checking all of the variations with numbers subbed in etc. would take longer, but still be reasonably practical.
If you assume a dictionary of 60000 words instead, you're still talking about less than 64 bits of entropy, which is better than most 8-12 character passwords but still well within the realm of brute force.
Maybe if you choose six to eight words instead, if would be decent, but if they are *random* words thats a lot of weird crap to remember. Any phrases with English-language correlation in them will have significantly less entropy.
Adding to your comments on the XKCD scheme:
1. The XKCD scheme assumes a really slow guessing rate of 1000 guesses/sec. Real world guessing rates are usually much faster (and getting faster):
2. The scheme doesn't provide a method for random generation of words. And are random words really as easy to remember as is suggested? It is maybe no surprise that people often use non-random phrases which are easily recovered e.g. http://securitynirvana.blogspot.co.uk/2012/06/...
3. Diceware does provide a method for random generation but their base entropy recommendations start at around 65 bits or a minimum of 5 of their dictionary words (and that part of their FAQ maybe needs updating...). For anything worth protecting a minimum of 6 or 7 words is more reasonable. So at best a user will end up with something 'memorable' like: algerklmcurryblondpuck or herasteamslopaimjoindel
The comic does specifically address that first point:
"Plausible attack on a weak remote web service. Yes, cracking a stolen hash is faster, but it's not what the average user should worry about."
You secure yourself against the first threat. The website secures itself against the second threat. If a password database is exposed, you don't think "Oh it's cool, my password has 160 bits of security, it'll never be cracked." You change your damn password.
"Plausible attack on a weak remote web service. Yes, cracking a stolen hash is faster, but it's not what the average user should worry about."
Well put sir. It seems like most debates over the bulletproof-ness of the "XKCD scheme" tend to overlook at least three things:
- The part you quoted above.
- Nobody claimed the scheme was bulletproof; Randall just used it to refute the idea that the password-guessing ability of computers correlates with complexity as perceived by humans, a common assumption as shown by many corporate password policies.
- It's a freaking comic.
You're not accounting for capitalization, punctuation, etc. For example:
- ...and so forth
Sounds interesting, and what is your daily job like as a PT for a Fortune 10?
I don't understand how a conversation about passwords and xkcd can make it this far without a reference to http://xkcd.com/538/
I wouldn't say this for the average joe, but for anyone who cares enough to read this forum, it's really not that hard to memorize 8-12 digit random alphanumeric+oddchar strings for passwords you use daily or multiple times a day.
Generate them via an admittedly pseudorandom mechanism of your choosing, but permute/tweak that output in a way that varies each time by whim so that anyone intercepting or predicting the generator results can't use them against you.
Write your new odd password down (further encoded if paranoia dictates) and stick it in your wallet/purse and pull it out when you forget it during the first week... but you'll know it a week or even a few days later enough you can even (securely) discard or put your copy in a safer offline place.
@Neogic It does NOT assume brute force is done character by character. I don't know how you even got that conclusion - the asserted difficulty is assuming a password cracking tool that _exclusively_ tries sets of four words from the same dictionary that was used to generate the password.
There have been lots of major breaches involving the theft of password databases over the last couple of years.
Yes, you'll change the password, assuming you know the database is exposed before any damage is done.
Yeah, but it gets cited all over the place in discussions about password creation.
Yeah, but it gets cited all over the place in discussions about password creation.
As it should. The average user assumes that a password that is hard for humans to guess or remember will therefore be hard for computers to guess. As the comic shows, that isn't always the case, and as the comic explains, we've spent 20 years training people to think this way.
So I'm glad the comic is cited in discussions about password creation. Hopefully the folks who invent goofy password policies will read it and realize all their silly rules don't actually help the situation.
@GregW Are you really suggesting that I use one password across multiple services?
Just because we're talking about securing services that look after one's password properly doesn't mean that all of them do. Every week, another service is found to be storing passwords as plaintext (this week it's Pandora).
@bob: No, I was just saying that for any place where you have to use a password daily, someone passionate about security issues can memorize a random 8-12 char password for it without too much trouble. (I have five-six of those at any one time for various sites or classes of sites (where members of a class have equivalent/similar security properties, something I could elaborate on more if people care.))
Frankly, I don't think it's a big risk to use the same password across multiple consumer throw-away sites that I don't care about (e.g. Pandora), as long as you keep your email, work, banking, etc passwords totally distinctly different. I suppose if I thought the threat model for that contained significant risks, I might use PasswordSafe or something similar, but I am a bit distrustful of adding "one ring to rule them all" for my main passwords.
I thought someone was going to object that 8 characters were pretty vulnerable to rainbow tables and that's generally true, but that is why I put in the caveat "+odd char".
@boog: Angela Sasse at UCL has done some research on where the three-tries-and-you're-locked-out thing comes from, and as far as she can tell it came from baseball's three strikes. Their research shows that by raising the 3 to 6-9 the number of users you have to support (with resets after lockouts, etc) drops to a tiny percentage. (Without, I think, particularly compromising the system's overall security.)
@Wendy M. Grossman
Interesting stuff, thanks! I bet you could raise it to many more tries and still not particularly compromise the system's overall security.
My company recently set up some kind of password-reset application in order to reduce support calls due to locked accounts. After you opt in (I didn't), you can use some kind of identifying information (like the last 4 of your SSN) to unlock your account and reset your password. I'm curious what this system cost, when raising 3-strikes to 10-strikes would cost only a few minutes of one person's time and achieve the same reduction in support calls without undermining the company's security.
Schneier.com is a personal website. Opinions expressed are not necessarily those of BT.