Choosing Secure Passwords

As insecure as passwords generally are, they’re not going away anytime soon. Every year you have more and more passwords to deal with, and every year they get easier and easier to break. You need a strategy.

The best way to explain how to choose a good password is to explain how they’re broken. The general attack model is what’s known as an offline password-guessing attack. In this scenario, the attacker gets a file of encrypted passwords from somewhere people want to authenticate to. His goal is to turn that encrypted file into unencrypted passwords he can use to authenticate himself. He does this by guessing passwords, and then seeing if they’re correct. He can try guesses as fast as his computer will process them—and he can parallelize the attack—and gets immediate confirmation if he guesses correctly. Yes, there are ways to foil this attack, and that’s why we can still have four-digit PINs on ATM cards, but it’s the correct model for breaking passwords.

There are commercial programs that do password cracking, sold primarily to police departments. There are also hacker tools that do the same thing. And they’re really good.

The effectiveness of password cracking depends on two largely independent things: power and efficiency.

Power is simply computing power. As computers have become faster, they’re able to test more passwords per second; one program advertises eight million per second. These crackers might run for days, on many machines simultaneously. For a high-profile police case, they might run for months.

Efficiency is the ability to guess passwords cleverly. It doesn’t make sense to run through every eight-letter combination from “aaaaaaaa” to “zzzzzzzz” in order. That’s 200 billion possible passwords, most of them very unlikely. Password crackers try the most common passwords first.
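
To put a number on that, here is a rough sketch of the arithmetic in Python. The guessing rate is the advertised figure mentioned above; real rates depend heavily on how the passwords were hashed, so treat the timing as illustrative only. The variable names are mine.

```python
# Size of the search space for eight lowercase letters, and how long an
# exhaustive sweep would take at the rate one program advertises.
ALPHABET_SIZE = 26
LENGTH = 8
GUESSES_PER_SECOND = 8_000_000  # assumption: the "eight million per second" figure

keyspace = ALPHABET_SIZE ** LENGTH
seconds = keyspace / GUESSES_PER_SECOND

print(f"{keyspace:,} candidates")  # 208,827,064,576 candidates
print(f"{seconds / 3600:.1f} hours to try them all on one machine")
```

Under those assumptions the full sweep takes hours, not years. Ordering the guesses well still matters, because likely passwords are found long before the sweep finishes.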

A typical password consists of a root plus an appendage. The root isn’t necessarily a dictionary word, but it’s usually something pronounceable. An appendage is either a suffix (90% of the time) or a prefix (10% of the time). One cracking program I saw started with a dictionary of about 1,000 common passwords, things like “letmein,” “temp,” “123456,” and so on. Then it tested them each with about 100 common suffix appendages: “1,” “4u,” “69,” “abc,” “!,” and so on. It recovered about a quarter of all passwords with just these 100,000 combinations.
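
A minimal sketch of the root-plus-appendage idea, using only the handful of examples quoted above as stand-ins for the real lists. The function name is hypothetical, and it covers suffixes only.

```python
from itertools import product

# Tiny stand-ins taken from the examples above, not a real cracking dictionary.
ROOTS = ["letmein", "temp", "123456"]
SUFFIXES = ["1", "4u", "69", "abc", "!"]

def root_plus_suffix(roots, suffixes):
    """Yield every root followed by every suffix, in list order."""
    for root, suffix in product(roots, suffixes):
        yield root + suffix

guesses = list(root_plus_suffix(ROOTS, SUFFIXES))
print(len(guesses))  # 3 roots x 5 suffixes = 15
print(guesses[:3])   # ['letmein1', 'letmein4u', 'letmein69']
```

With 1,000 roots and 100 suffixes the same loop produces the 100,000 combinations described above.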

Crackers use different dictionaries: English words, names, foreign words, phonetic patterns and so on for roots; two digits, dates, single symbols and so on for appendages. They run the dictionaries with various capitalizations and common substitutions: “$” for “s”, “@” for “a,” “1” for “l” and so on. This guessing strategy quickly breaks about two-thirds of all passwords.
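
A minimal sketch of that kind of mangling, assuming only the three substitutions named above plus first-letter capitalization. The function names are mine; real crackers apply far larger rule sets.

```python
from itertools import product

# Assumption: just the three substitutions mentioned in the text.
SUBSTITUTIONS = {"s": "$", "a": "@", "l": "1"}

def substitution_variants(word):
    """Yield every variant with each substitutable letter either kept or swapped."""
    choices = [(ch, SUBSTITUTIONS[ch]) if ch in SUBSTITUTIONS else (ch,)
               for ch in word]
    for combo in product(*choices):
        yield "".join(combo)

def mangle(word):
    """Return the substitution variants of word, plain and with a capital first letter."""
    variants = set()
    for variant in substitution_variants(word.lower()):
        variants.add(variant)
        variants.add(variant.capitalize())
    return variants

print(sorted(mangle("salt")))
```

A four-letter root with three substitutable letters already yields a dozen candidates, which is cheap for the cracker and explains why these tricks add so little.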

Modern password crackers combine different words from their dictionaries:

What was remarkable about all three cracking sessions were the types of plains that got revealed. They included passcodes such as “k1araj0hns0n,” “Sh1a-labe0uf,” “Apr!l221973,” “Qbesancon321,” “DG091101%,” “@Yourmom69,” “ilovetofunot,” “windermere2313,” “tmdmmj17,” and “BandGeek2014.” Also included in the list: “all of the lights” (yes, spaces are allowed on many sites), “i hate hackers,” “allineedislove,” “ilovemySister31,” “iloveyousomuch,” “Philippians4:13,” “Philippians4:6-7,” and “qeadzcwrsfxv1331.” “gonefishing1125” was another password Steube saw appear on his computer screen. Seconds after it was cracked, he noted, “You won’t ever find it using brute force.”

This is why the oft-cited XKCD scheme for generating passwords—string together individual words like “correcthorsebatterystaple”—is no longer good advice. The password crackers are on to this trick.

The attacker will feed any personal information he has access to about the password creator into the password crackers. A good password cracker will test names and addresses from the address book, meaningful dates, and any other personal information it has. Postal codes are common appendages. If it can, the guesser will index the target hard drive and create a dictionary that includes every printable string, including deleted files. If you ever saved an e-mail with your password, or kept it in an obscure file somewhere, or if your program ever stored it in memory, this process will grab it. And it will speed the process of recovering your password.

Last year, Ars Technica gave three experts a 16,000-entry encrypted password file, and asked them to break as many as possible. The winner got 90% of them, the loser 62%—in a few hours. It’s the same sort of thing we saw in 2012, 2007, and earlier. If there’s any new news, it’s that this kind of thing is getting easier faster than people think.

Pretty much anything that can be remembered can be cracked.

There’s still one scheme that works. Back in 2008, I described the “Schneier scheme”:

So if you want your password to be hard to guess, you should choose something that this process will miss. My advice is to take a sentence and turn it into a password. Something like “This little piggy went to market” might become “tlpWENT2m”. That nine-character password won’t be in anyone’s dictionary. Of course, don’t use this one, because I’ve written about it. Choose your own sentence—something personal.

Here are some examples:

  • WIw7,mstmsritt… = When I was seven, my sister threw my stuffed rabbit in the toilet.
  • Wow…doestcst = Wow, does that couch smell terrible.
  • Ltime@go-inag~faaa! = Long time ago in a galaxy not far away at all.
  • uTVM,TPw55:utvm,tpwstillsecure = Until this very moment, these passwords were still secure.

You get the idea. Combine a personally memorable sentence with some personally memorable tricks to modify that sentence into a password to create a lengthy password. Of course, the site has to accept all of those non-alpha-numeric characters and an arbitrarily long password. Otherwise, it’s much harder.

Even better is to use random unmemorable alphanumeric passwords (with symbols, if the site will allow them), and a password manager like Password Safe to create and store them. Password Safe includes a random password generation function. Tell it how many characters you want—twelve is my default—and it’ll give you passwords like y.)v_|.7)7Bl, B3h4_[%}kgv), and QG6,FN4nFAm_. The program supports cut and paste, so you’re not actually typing those characters very much. I’m recommending Password Safe for Windows because I wrote the first version, know the person currently in charge of the code, and trust its security. There are ports of Password Safe to other OSs, but I had nothing to do with those. There are also other password managers out there, if you want to shop around.
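
For illustration only, here is roughly what such a generator does, sketched with Python's standard `secrets` module. This is not Password Safe's code, and the 94-character alphabet is an assumption; many sites accept fewer symbols.

```python
import math
import secrets
import string

# Assumption: letters, digits, and all ASCII punctuation are allowed (94 characters).
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def random_password(length=12, alphabet=ALPHABET):
    """Build a password by drawing each character uniformly from alphabet."""
    return "".join(secrets.choice(alphabet) for _ in range(length))

password = random_password()
bits = len(password) * math.log2(len(ALPHABET))
print(password, f"(about {bits:.0f} bits)")
```

The strength comes from the uniform random draw, not from the characters looking strange, so a restricted alphabet can be offset by a longer length.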

There’s more to passwords than simply choosing a good one:

  1. Never reuse a password you care about. Even if you choose a secure password, the site it’s for could leak it because of its own incompetence. You don’t want someone who gets your password for one application or site to be able to use it for another.
  2. Don’t bother updating your password regularly. Sites that require 90-day—or whatever—password upgrades do more harm than good. Unless you think your password might be compromised, don’t change it.
  3. Beware the “secret question.” You don’t want a backup system for when you forget your password to be easier to break than your password. Really, it’s smart to use a password manager. Or to write your passwords down on a piece of paper and secure that piece of paper.
  4. One more piece of advice: if a site offers two-factor authentication, seriously consider using it. It’s almost certainly a security improvement.

This essay previously appeared on BoingBoing.

Posted on March 3, 2014 at 7:48 AM • 235 Comments

Comments

Ted Lilley March 3, 2014 8:04 AM

Thank you for debunking the xkcd passphrase approach.

Xkcd is pretty smart, but that was an unfortunate and dangerously popular miscalculation on his part.

Jim March 3, 2014 8:07 AM

What about Diceware? Seems like that’s a bad idea now, given that the dictionary is known.

Eric Riley March 3, 2014 8:12 AM

At work, I am required to change my password every 90 days – to generate the next password, I take the previous and advance each letter (or number) by 1: ‘z’ would go to zero, not ‘a’. I then move each number forward its number of spaces (if there’s already a number there, I go to the next free space). I sum the numbers and counting only free spaces, the one I end on with the sum is the capital letter next round. I finally put in the letters in the same order as they appeared, starting with the next free space to the right.

I find 10 is about as long as I can remember easily, and it helps that when password update comes along I have to use it about a dozen times to get everything changed (and most everything I have access to must use the same password for some of the automated processes to work).

On ‘password strength’ checkers, I usually get a pretty high rating, but I have my doubts as to just how strong it is… (Oh – and I do have some idiosyncrasies that are part of the process since I can’t remember exactly what I did after three months, I may do things in a slightly different order or decide to throw in two capitals or allow the number to shift or something like that).

Wondering March 3, 2014 8:18 AM

Is Sourceforge still wrapping/appending malware to downloads? That would be a bit of a problem for trusting a tool such as Password Safe.

Simon March 3, 2014 8:20 AM

The mistake XKCD made is in thinking that stolen hashes are not something for the average user to worry about. These days I get emailed each time it becomes apparent that one of my hashes has been leaked.

I fear the Schneier scheme may suffer a similar fate: it depends on humans’ ability to pick something obscure and then obscure it further, and we are really bad at that sort of thing. Machine-generated randomness, whilst not perfect, is better understood than people.

Wilson March 3, 2014 8:22 AM

I’m still unable to understand how efficiency can get around entropy: if there are 2^44 (or so; my guess is 2^39) different XKCD-style passwords and you truly choose a random one, there will be nothing but power (or sheer luck) that can help to crack it, even for an attacker who knows exactly how you generated the password (plus, you can use uncommon words, adding lots of bits of entropy).

On the other hand, the Schneier style seems more prone to specific analysis (how many personal sentences are there? thousands? millions? can you create a table of those? Will most people think of the same few or will they be more original?).

Where am I wrong?

Remko Tronçon March 3, 2014 8:24 AM

Can someone explain to me why diceware and/or the XKCD method (let’s say 5 words instead of 4) is a bad method?

Assuming the hacker knows you take 5 words out of a very limited list of 7000 words, wouldn’t that give you 65 bits of entropy, which would be as good as a truly random 12-character alphanumeric password? Just from intuition, this sounds better than any of those passwords suggested in this post (which can be computed by applying some transformation to a bunch of words that form a grammatically correct sentence).

Or am I making a miscalculation here?
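
The arithmetic in question can be checked with a short sketch. It assumes every word or character is drawn uniformly and independently, which is the only case where this formula applies. The function name is illustrative.

```python
import math

def uniform_entropy_bits(pool_size, picks):
    """Bits of entropy for `picks` independent uniform draws from `pool_size` options."""
    return picks * math.log2(pool_size)

words = uniform_entropy_bits(7000, 5)    # five words from a 7,000-word list
alnum = uniform_entropy_bits(62, 12)     # twelve characters from a-z, A-Z, 0-9
print(f"{words:.1f} bits vs {alnum:.1f} bits")  # 63.9 bits vs 71.5 bits
```

On these assumptions the five-word phrase lands near 64 bits, somewhat below twelve random alphanumeric characters. The formula says nothing about phrases a person picks by hand.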

Wholly unconvinced March 3, 2014 8:30 AM

How is the XKCD method unsafe? It doesn’t matter if an attacker knows you are using this method, the number of possible passwords is still N^4. With a 100k word dictionary (the rough number of lines in my UNIX dict file), 4 words get you 64 bits of entropy. With 8 words, you get 128 bits. Assuming the attacker knows you use this method (but not which words, of course).

Kahomono March 3, 2014 8:32 AM

7,000 words is a very poor starting pool for the XKCD method. I made a spreadsheet tool with a starting list of about 55,000 words – taken from an online version of SOWPODS. The tool randomly chooses 25 of them every time I refresh it, and then I mentally choose four to be my next password.

Memorization is still easy.

Mario Diana March 3, 2014 8:35 AM

On the subject of “password strength” checkers, I get the feeling that the criteria they use are some arbitrary notion of what a strong password is, rather than anything based on research into how password crackers actually work.

Jimmy T March 3, 2014 8:43 AM

It sounds like the XKCD method is only a problem if the 4-word list is not generated randomly.

Peter Boughton March 3, 2014 8:43 AM

zxcvbn gives a rather short 4 hours for “gonefishing1125” – which isn’t surprising; two dictionary words and a number isn’t secure. Suggesting it won’t be found by “brute force” seems to be a pedantic interpretation of what brute force is?

My advice is generally “use a meaningless sentence with random punctuation and at least one uncommon typo” – that’s far easier to explain to someone non-technical than the “Schneier scheme”. Is there any reason it’s not valid advice?

Wilson March 3, 2014 8:46 AM

@kahomono I’m not sure it is a safe thing to choose from a random array: it can lower the number of real password dramatically (and some will come up more often than others). It may be better to eliminate the words you don’t want to use and have the dice do the whole choice.

Peter A. March 3, 2014 8:49 AM

There are still workplaces that severely limit password length and the set of characters that can be used – and use the same password for many systems – sometimes to the point that enumerating the whole space would not be a big problem.

My current employer’s password policy guarantees there’s less than 48 bits of entropy in every password…

Kahomono March 3, 2014 8:53 AM

@wilson I don’t follow – I have 55,000 words to start. Why do you think my mental choice from among 25 is “lower[ing] the number of real password dramatically”? I decided on 25 as a balance between too-small and overwhelming. I mentally pick four from the 25 that slide into a memorable pattern in my brain. No vocalization or writing takes place during this process. My choice is not recorded in the tool, only where I am entering my new password.

The 25 words in the grid are chosen by a RAND() function with a flat distribution. So why do you think “some will come up more often than others”?

Todd March 3, 2014 8:56 AM

This is great; thank you.

Providing proper citation and a link to your blog, would it be acceptable to share on a corporate Intranet for the average layperson to see?

Fabio March 3, 2014 8:59 AM

I usually use a command line like this to create a different password for every web site I have a login:

echo -n myuniquepassword#schneier.com|sha1sum
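
For readers without `sha1sum`, a Python sketch of the same derivation follows. It assumes the same literal input string as the command above, and the function name is hypothetical. It makes no claim about whether the scheme is a good idea.

```python
import hashlib

def site_password(secret, site):
    """Hex SHA-1 of 'secret#site', the digest part of: echo -n secret#site | sha1sum."""
    return hashlib.sha1(f"{secret}#{site}".encode("utf-8")).hexdigest()

print(site_password("myuniquepassword", "schneier.com"))
```

The result is 40 hexadecimal characters. `sha1sum` prints the same digest followed by a file-name marker.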

Wilson March 3, 2014 9:09 AM

Some final sets, not single words.

As you know, not all sets are equally “memorable”, and you will choose one of the better ones in the pool. That pool has 303,600, or P(25,4), possible results.

If you were able to pick the single best set every time, you would shrink the set of possible results to 1/303,600 of its size.

You are not that good, so the reduction is smaller (maybe 100/303,600).

Moreover, sets with higher memorability are more likely to be picked than others, so a clever attacker can make a tool that tries those first.

It would be better to have your tool give you a list of five or ten or even twenty complete sets and choose among those.

ps: IMHO

Peter A. March 3, 2014 9:13 AM

@Fabio: the problem with this is once someone guesses your strategy (read: programs it into a password cracker) and the secret part (which may be easy or not), he gets access to ALL your web accounts…

John March 3, 2014 9:18 AM

The problem, I fear, with the Schneier approach, is that the first letters of English words are not randomly distributed across the alphabet. There are some letters that are much more common. If you look at the last sentence I wrote, for instance:

“There are some letters that are much more common.”
tasltammc

T and M are favored. Your entropy gets reduced. Or this one from the article:


WIw7,mstmsritt… = When I was seven, my sister threw my stuffed rabbit in the toilet.

Again, T is highly favored, as are M, S, and W. Most people will likely use a capital letter for the first or last letter, but none in between.

It’s better than what people are doing now, but we’re still human…

Autolykos March 3, 2014 9:34 AM

I fail to see what’s wrong with the XKCD method. It does not rely on the method itself being secret or the passwords generated by this method being long; it relies on words being way easier to memorize per bit of entropy than numbers, random characters or modifications/substitutions. The tradeoff is that you need to type in longer passwords, though (and can’t use it properly if password length is limited).
Bruce’s method is pretty good, too – it’s also very similar to what the Chaos Computer Club has been advocating for years. It has less entropy for the amount of stuff you need to memorize, but more entropy per character you have to type – so what’s optimal for you depends on how good your memory is, how often you need to change your passwords, and how fast you type.

wiredog March 3, 2014 9:35 AM

A real annoyance is how many sites still limit how long the password can be. Limiting to 12 or 20 characters. Even more annoying is when they don’t tell you, and just truncate.

Kahomono March 3, 2014 9:37 AM

@wilson your calculation assumes two things that I don’t think are valid:

  1. There is an objective meaning of “memorable.” In other words, the “best” set of four to me is just the best set of four regardless of who is picking it.
  2. Note that I might not like anything the first time, and hit F9 to get new grids of 25 several times before I choose one.

Not to mention, I could hit F9 four times and take one word only from each result.

And I never save back the spreadsheet, there is no need.

Firas Salem March 3, 2014 9:41 AM

Nice post, but I beg to differ about the xkcd/diceware approach being misguided or obsolete.

An 8-word, randomly chosen diceware passphrase consistently provides a robust 104 bits of entropy, which I would be VERY surprised to see defeated in the next ten years.

Srix March 3, 2014 9:54 AM

If you are multilingual, try using a non-English sentence transliterated into English as your password.

E.g., my mother tongue is Tamil. Using Nan2IdlyVangaPonen (which means “I went to buy 2 idly”) gives a relatively secure password.

bitmonger March 3, 2014 10:02 AM

Randall from xkcd got it right.

People may misunderstand his advice and not get that he means ‘random word’ here (he does). He is not saying ‘make up words you think are random’.

Sure, this can be cracked with brute force rather than an HMM-like model, but I can also remember an 80+ bit password with this system. No way I’d manage that with a traditional construction, and frankly I don’t trust the entropy estimates for non-random constructions.

Wilson March 3, 2014 10:03 AM

@Kahomono:

I really don’t know; that is what I feel is the weakest point of your procedure.

Your two points seem good to me, but I’m not inclined to trust how things “seem” in this kind of thing.

But this is just a place to get some hints and put them into one’s own reasoning, I think.

ps: the other possible weak point is the random generator: be sure it’s a real source of entropy and not a mere pseudo-random number generator.

pps: I can add a third point in favour of your way “as it is”: in real-world situations it is not likely that you will ever meet an attack so tailored to your way of choosing passwords, so all of this is mostly theoretical.

Anderer Gregor March 3, 2014 10:05 AM

So we are to believe that people who have, until now, considered passwords like “johnson74”, “letmein”, “sw0rdf1sh”, “p4ssword” and the like to be secure enough, will not instead start to choose phrases like “to be or not to be”, “o say can you see”, “let there be light”, “the oscar goes to” and everything else a simple Wikiquotes/Bible/… dump will recover, for their super-duper-uncrackably-secure passwords now?

Clive Robinson March 3, 2014 10:13 AM

@ Bruce,

    Something like “This little piggy went to market” might become “tlpWENT2m”. That nine-character password won’t be in anyone’s dictionary.

Always brings a wry smile to my face because it’s inadvertently wrong…

You forgot that “Three letter acronyms” are included in many password dictionaries as valid words. It just so happens “tlp” is a fairly popular TLA with well over seventy recognised definitions you can easily find on the web.

Thus by accident your sentence became of the format,

Which are currently quickly found, as other parts of your article make clear.

Password advice is always difficult to get right 😉

Especially when there is much foolish insistence –often with harsh penalties if not followed– on an individual “memorising it” and changing it frequently.

Aside from electronic aids in the form of tokens etc., if passwords have to be used a “two part” system can be used. That is a long random string the user can write down –and keep on them– and a short memorisable appendage/insert. But the appendage/insert really ought not be a word or acronym in common usage.

M March 3, 2014 10:17 AM

I use this idea:

same salt + random salt + very memorable:

For the forum at schneier.org, I would use

same salt + random salt + schneier.org

For the same salt, pick something high entropy that you can memorize,
like “CFJSNCgY”. Random string with the characters you can easily type everywhere. (These days, remember what you can type on cell phone keyboards…)

For the random, you need something that you can vary between sites. Say you pick a colour or a shape.

The third one is the name of the site or service.

The combination of the last two should be easy to remember. The first part, which you memorized, gives the password high entropy against any password cracker (48 bits from my base64-encoded random string). The second part is there so that, if someone manages to sniff or otherwise get the unencrypted password, a changed copy of it is not easily usable on other sites.

Any particular drawback with this method?

marcmagus March 3, 2014 10:23 AM

I am surprised: you appear to be under the misapprehension that the “XKCD scheme” is “string four words together” (and hope that password checkers don’t try that sort of password). The actual scheme is “string four words chosen randomly from a dictionary together” (and hope that produces sufficient entropy to take too long to guess).

If the latter claim isn’t correct, I’d expect you to provide (a link to) the math disproving it. Note that your second link supports Munroe’s math that the method is strong.

If you’re just arguing against the former, it would be good to be more clear about that; right now you’re creating confusion.

Sam March 3, 2014 10:49 AM

I have been using Password Safe for years now and make my password as long and complex as the site will allow, and unique to the site. Where possible I also use a unique username for each site. I do change my passwords about once a year, as well as the password to access Password Safe itself. It pays to back the database up to a CD once in a while too, which I keep in a real safe so it doesn’t get lost. The real challenge is picking the Password Safe password and that gets written down, it is never recorded electronically. I’ll have to reevaluate changing my passwords once a year, but if you use something like TurboTax to pull in financial info then you’re giving TurboTax that password, so after you pull the info you want to change the password.

Ron Helwig March 3, 2014 10:50 AM

It would amaze me that somebody could come up with a “security” scheme that basically eliminates any security, except that I’m sure it was thought up by a government bureaucrat 🙂

For those incredibly stupid “security” questions, I always give an answer that is incorrect. I record the Q&A in my password manager, and I use different answers for each site. Examples (that aren’t ones I really use):

Hometown: Orangutan
Favorite food: periwinkle
High School: Mitochondrial paranoia

bitmonger March 3, 2014 10:53 AM

@Jim
The dictionary has always been known with diceware and that doesn’t matter when used properly.

Here is the trick. The real problem is often sites prevent users from having a non-brute-force password because of support costs.

Diceware gives you about 8k words, or 13 bits per word (2^13 words), and on average somewhere around 4.1 chars per word. s/key has about 2k words, or 11 bits per word (2^11 words), at 3.7 chars per word. Dictionaries should be carefully constructed to eliminate multiple ways to create the same password. The number of possible passwords would then be: wordsindictionary^numberofwords

Now, this gives a density of about ~3.2 bits per character for diceware or ~3 for s/key.

So an 88 bit s-key password would be about 30 characters.
and a diceware might be 29 characters.
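
These lengths can be reproduced with a short sketch. The bits-per-word and characters-per-word figures are the rounded ones quoted in this comment, not measured values, and the function name is illustrative.

```python
import math

def words_and_chars(target_bits, bits_per_word, chars_per_word):
    """Whole words needed to reach target_bits, and the resulting average length."""
    words = math.ceil(target_bits / bits_per_word)
    return words, words * chars_per_word

# Assumed figures from the comment: diceware 13 bits and 4.1 chars per word,
# s/key 11 bits and 3.7 chars per word.
print(words_and_chars(88, 13, 4.1))  # diceware: 7 words, about 29 characters
print(words_and_chars(88, 11, 3.7))  # s/key: 8 words, about 30 characters
```

Both lists need roughly thirty characters for 88 bits, well past the 8 or 12 character limits some sites impose.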

Now many sites think 12 or even 8 is an acceptable max length. It’s not. This is not a diceware problem, however.

To attack the Schneier construction, we estimate the entropy of each word’s first letter given the first letter of the previous word. Normal letters have only about 1.2 bits or so of entropy. First letters are higher in entropy, but nowhere near the 3 bits per character of s/key. In addition, this entropy measure is very suspect and it is an average figure.

Base64 allows a density of 6 bits per character, but these are so user-unfriendly to be untenable.

Ignoring the usability problem, Base85 is even denser with 32-bits per five characters which allows a secure 15 char password. Ironically even this rarely works. 15 characters is sometimes too long for the system and this presumes no case folding and no characters in b85 are prohibited by the system.

All in all it sucks and it is a failure on the part of security people to require the right kind of password policies.

Barney March 3, 2014 11:09 AM

As other people have said, the XKCD method still seems reasonable. XKCD suggested it would take 550 years to crack at 1000 guesses per second. If as you suggest a cracker can try 8 million guesses per second – which suggests the site used a weak password hashing function – then it should still take 25 days to crack.

25 days is probably enough to put off anyone who isn’t targeting you specifically, but if not adding a fifth random word should increase the cracking time to about a century, assuming the cracking system doesn’t speed up.
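
A rough check of those figures, as a sketch. It assumes a 44-bit search space for four words and 11 more bits for a fifth word, meaning a 2,048-word list; both numbers are assumptions rather than something stated in this comment. The times are for trying every candidate.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

def seconds_to_exhaust(bits, guesses_per_second):
    """Worst-case time to try every candidate in a 2**bits search space."""
    return 2 ** bits / guesses_per_second

slow_years = seconds_to_exhaust(44, 1_000) / SECONDS_PER_YEAR       # four words, 1,000/s
fast_days = seconds_to_exhaust(44, 8_000_000) / SECONDS_PER_DAY     # four words, 8 million/s
five_years = seconds_to_exhaust(55, 8_000_000) / SECONDS_PER_YEAR   # five words, 8 million/s

print(f"{slow_years:.0f} years, {fast_days:.1f} days, {five_years:.0f} years")
```

The average time to a hit is half of each figure, and attackers running many machines in parallel divide it further.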

dhasenan March 3, 2014 11:25 AM

@Wilson: you reseed the PRNG every time you load the spreadsheet, presumably. Your attacker, having cracked one of your passwords, must guess the seed used for that password from a sample of four values — but those values aren’t from the PRNG directly; they’re modulused or multiplied by an unknown value.

Having done that, the attacker must find another password generated from the same spreadsheet, unchanged, where the two passwords were generated without reloading the spreadsheet.

I suppose if the PRNG were really terrible, the attacker could cut down on the initial search space, but it would have to be really terrible to have an appreciable effect.

Kent Borg March 3, 2014 11:27 AM

There is a difference between a password (something checked by a limited gatekeeper) and an encryption key (something that can be attacked in parallel).

Passwords do not need to be so secure. Think of the ATM PIN. Sure, attackers might break into a web site and steal hashes, but if they have already broken into the protected web site, who cares if they can crack the hash? (Those who used that password elsewhere, that’s who cares, and they are fools to reuse passwords. Sorry I just called nearly everyone fools.)

• Don’t recycle passwords between sites.
• Design whatever password format you like, but use real random data to choose HOW it is filled out. (Doesn’t matter whether it seems random enough to you, use something that is random.) Diceware or xkcd method are great!
• Don’t sweat passwords being wildly complex, put that effort into encryption keys; they are what is hard to manage.

-kb

Bob S. March 3, 2014 11:48 AM

Re: “guessing strategy quickly breaks about two-thirds of all passwords”

Are we talking unlimited guessing here…10,000, 100,000 tries?

Seems to me if password routines were limited to x tries (10?) with the wait period doubling every try (5-10-20-40 secs, etc.) that would just about take care of the problem.
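
The doubling wait described above might look like the sketch below. The function name, the 5-second base, and the limit of 10 tries are assumptions taken from the numbers in this comment, not from any real login system. Note that a throttle like this only applies when every guess has to pass through the server; it does nothing once an attacker holds a stolen hash file.

```python
from typing import Optional


def lockout_delay(failed_attempts: int, base_seconds: int = 5,
                  max_attempts: int = 10) -> Optional[int]:
    """Seconds to wait before the next try, or None once the account is locked.

    The wait doubles after every failed attempt: 5, 10, 20, 40, ...
    """
    if failed_attempts < 1:
        return 0  # no failures yet, no wait
    if failed_attempts >= max_attempts:
        return None  # hypothetical hard limit reached: stop accepting guesses
    return base_seconds * 2 ** (failed_attempts - 1)


for attempt in range(1, 5):
    print(attempt, lockout_delay(attempt))
```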

bitmonger March 3, 2014 11:56 AM

@barney

Yeah, my calculations are the same ~25 days for 44 bits.

Interestingly, Randall’s example dictionary here appears to be small (2k) and use big words (6.5 chars on average) which leads to a density of 1.7 bits per char. It might be a better dictionary for remembering passwords, and choosing a custom dictionary prevents brute force with a known dictionary. In my opinion any of these approaches (S/KEY, Diceware, or a custom dictionary) is good, but they make different trade-offs at different desired strengths.

All in all Randall’s advice still looks good.

Schneier’s method might be comparable in density to Randall’s dictionary on average, but I think assuming min-entropy = guessing-entropy is a bad idea. Some passwords made that way will be weak as a result. I don’t think this is a terrible recommendation, but I think many common parameters used with the random word method would result in stronger passwords.

Anura March 3, 2014 12:07 PM

Part of the problem is that password policies get too restrictive. I worked for a bank with a password policy of 8 characters exactly, one upper, one lower, one number. This has the effect of reducing the number of possible passwords, and the more you reduce the number of possible passwords, the easier it gets for attackers. I figure most passwords are in the form “[A-Z][a-z]{6}\d”. I like to use a nonsensical passphrase with proper capitalization and punctuation (often 30-50 characters); in this case I picked 8 word passphrases and used your method, but it probably still ended up being easily brute forced.

There are really four areas to look at:

Password Storage
Password Policies
Password Education
Alternative Authentication Methods

Unfortunately, while education helps people that care, too many people are likely to just ignore it to pick something easy to remember. The other 3 you have to rely on software and service providers to fix, which does appear to be improving in terms of storage and multi-factor authentication.

Password policies are problematic, as a stricter password policy results in fewer possible passwords. I was playing with the RockYou database a while back, applying password policies and seeing how many passwords I would eliminate, as well as measuring the ratio of distinct passwords to total passwords (my, albeit limited, measure of the strength of the password policy), and the policy I came up with is the following:

Each password must be at least 8 characters, and have 3 of the 5 following properties:

Contains an Uppercase Character
Contains a Lowercase Character
Contains a Number
Contains a Non-Alphanumeric Character
A length of 14 Characters or Longer
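
As a rough illustration, the policy above could be checked as follows. This is a sketch only: the function name is hypothetical, and "non-alphanumeric" is interpreted here as any character for which Python's `str.isalnum()` is false, which includes spaces.

```python
def meets_policy(password: str) -> bool:
    """At least 8 characters, plus any 3 of the 5 listed properties."""
    if len(password) < 8:
        return False
    properties = [
        any(c.isupper() for c in password),      # contains an uppercase character
        any(c.islower() for c in password),      # contains a lowercase character
        any(c.isdigit() for c in password),      # contains a number
        any(not c.isalnum() for c in password),  # contains a non-alphanumeric character
        len(password) >= 14,                     # is 14 characters or longer
    ]
    return sum(properties) >= 3


for candidate in ["password", "Password1", "correct horse battery staple"]:
    print(candidate, meets_policy(candidate))
```

A long all-lowercase passphrase with spaces passes (lowercase, non-alphanumeric, length), while the same words run together without spaces would satisfy only two properties and fail.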

The basic idea being that you basically accept that you can’t force users to choose strong passwords, but at least eliminate the really weak passwords, allow those people who would choose stronger passwords to do so, and increase the possibilities to slow down an attacker who tries to crack your entire database. Combining with a salted, slow hash function, possibly with a secret key* should also be considered essential.

*you have a chance that they will recover the database and not the key, but you should not rely on it. See the Adobe leak as an example for both cases.

Steven A March 3, 2014 12:12 PM

Password crackers can try a lot more than 8 million guesses per second for most formats (e.g. MD5, SHA, NT Hash, LANMAN). I don’t know where the 8M in that article came from, but the screenshot shows four GPUs collectively trying 656M+ guesses per second.

A good rig can make billions of guesses per second for the formats I mentioned. A couple of years ago, Jeremi Gosney built a cluster that could try over 300 billion guesses per second for the NTLM (MD4) hash.

So, the 8M figure is off by a factor of 1,000 or more. I make this point because it’s important for people to know that all of the old recommendations regarding length are way off the mark where offline attacks are concerned. Even a truly-random 8 or 10 character password is woefully insufficient unless the system in question is using something like bcrypt, scrypt or PBKDF2.

martinr March 3, 2014 12:26 PM

You’re misunderstanding the xkcd password rules. The suggestion is pretty smart, really. The average “dictionary” size from which a person draws to perform daily conversations is around 2048 words (using power-of-2 for simplifying the math). The comprehension vocabulary is usually larger, but the xkcd estimate is really about the vocabulary that you will choose from when producing output, not that when parsing input.

Pulling 4 random words from a 2048=2^11 vocabulary and concatenating them gives you 44 bits of entropy, which is pretty high for a password, really. If you want more, add a 5th random word and you get 55 bits. Most schemes that Joe Random User employs to create a 6-8 character password will produce less entropy than that.

The human is extremely bad at memorizing small amounts of random data in a perfect fashion, whereas it is quite good at memorizing complex data in a less-than-perfect fashion. Memorizing 6 random ASCII characters from a 128-character alphabet is extremely difficult for humans, but memorizing 4 random words out of one’s own active vocabulary is trivial.

Mike Stein March 3, 2014 12:36 PM

@Bob S –

Bruce is talking about the situation where the bad guys have stolen the file of password hashes, and are running code on their local machine to find passwords that generate a matching hash for an entry in the stolen file. This is all happening in an environment completely owned and controlled by the bad guys – the site from which the hash file was stolen is not involved and so can’t slow down their cracking speed.

N March 3, 2014 12:43 PM

I’m very disappointed that Bruce doesn’t seem to understand how the XKCD system works. The key is that the user IS NOT allowed to pick the words himself.

The computer picks the words completely randomly. The words are just mnemonic indexes into a dictionary (which the attacker can have). As mentioned in other comments, 4 truly random words from a list of 2048 would be something like 44 bits of entropy.

In fact, in an optimal implementation, you would only allow the user to skip a suggested password 2 or 3 times. This is to prevent the user from passing up “harder” words until he gets a string of four “easy” words, which essentially reduces the randomness.

I picture XKCD passwords being integrated into account signup and profile management screens, as opposed to something the user is supposed to do himself, manually on the side (where he’s prone to make an error). My ideal UI would simply show the user the suggested password with two buttons: [USE THIS ONE] and [PICK ANOTHER]. Pick another would only generate 2 other random passwords (for 3 total) before cycling back around to the first suggestion.
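
The suggestion flow described above could be sketched like this. Everything here is illustrative: the function names are invented, and the eight-word list is a stand-in for a real list of about 2048 common words. The words are drawn with Python's `secrets` module so that the computer, not the user, makes the choice.

```python
import secrets
from typing import List, Sequence


def make_suggestions(wordlist: Sequence[str], words: int = 4, count: int = 3) -> List[str]:
    """Build `count` passphrases, each of `words` words chosen uniformly at random."""
    return [
        " ".join(secrets.choice(wordlist) for _ in range(words))
        for _ in range(count)
    ]


def suggestion_at(suggestions: Sequence[str], pick_another_clicks: int) -> str:
    """Passphrase shown after [PICK ANOTHER] has been clicked that many times.

    The index wraps around, so the user only ever sees the same few offers.
    """
    return suggestions[pick_another_clicks % len(suggestions)]


# Hypothetical stand-in word list; far too small for real use.
WORDS = ["correct", "horse", "battery", "staple", "orange", "window", "river", "candle"]

offers = make_suggestions(WORDS)
print(suggestion_at(offers, 0))
```

Capping the offers at three limits how much a user can shrink the randomness by rejecting "hard" suggestions.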

-N.

34kjnfk3jfn3kjn March 3, 2014 1:05 PM

Most modern attacks use side channel methods like poor recovery policy or backdoor sniffing. Making complex policy around human memorization is only sandbagging usability and not actually fixing anything.. You also have to have intelligent architecture, look how fast two-factor was defeated with automated attacks because it’s so easy to get privilege and inject threads into other processes on x86..

Even in a case where the attacker has a database with MD5 salted passwords, they are not going to brute force passwords, they are going to get the salt from a configuration table or script and inject privileged users and leverage inner communication or launch attacks on traffic using social engineering or exploits..

I find it hilarious people are still brainstorming password policies almost 4 decades later. At what point is it obvious that it isn’t the problem to these geniuses?

Carl 'SAI' Mitchell March 3, 2014 1:28 PM

No, Bruce gets the XKCD system. The XKCD system is terrible. Diceware is good. Both produce passphrases, but in the XKCD system the user picks the words and tries to make them “random”. The user will fail, if the user is human. With Diceware, the words are chosen randomly. Diceware has a guaranteed information entropy, so you can get a good security estimate. The XKCD method is better than a password, and is equivalent to the Schneier method (think up a personal phrase, and use the first letters vs think up a personal phrase and use the whole phrase.) Neither is as good as a random password, and a random passphrase is easier to remember than a random alphanumeric string.

Adrian March 3, 2014 1:33 PM

This is why the oft-cited XKCD scheme for generating passwords — string together individual words like ‘correcthorsebatterystaple’ — is no longer good advice. The password crackers are on to this trick.

I’m surprised to see you saying this, because it isn’t necessarily true. The XKCD method is basically the same as the diceware method: a set of words constitutes the alphabet, and a random combination of such words constitutes the password. Possible combinations for such a password are d^w, where w is the number of words in the passphrase and d is the number of words in the dictionary.

Whether or not that’s a good password depends, of course, upon the size of your dictionary and the number of words chosen.

wumpus March 3, 2014 2:08 PM

@Fabio “echo -n myuniquepassword#schneier.com|sha1sum”

This really only makes sense if you aren’t willing to use your phone as a password wallet (or carry a USB key/paper with salts with you) and need to log into various computers you trust with youruniquepassword.

Otherwise, you will want a salt to avoid the issue where losing your password on one site (how many store passwords, anyway? Way too many from what I’ve seen) means making it easy to break it on every site. I’ve fallen into that trap myself.

http://forums.xkcd.com/viewtopic.php?f=12&t=88888#p3105718

Alan Kaminsky March 3, 2014 2:18 PM

Everyone’s been arguing about whether the XKCD password picking method is secure, or not. The confusion apparently arises from how folks interpret the method.

The original comic says merely “four random common words”, with an example of “correct horse battery staple”.

Now does “four random common words” mean “four common words chosen randomly by the user“, or “four common words from a known 2048-word dictionary chosen uniformly at random using a true random number source“?

If you interpret it the first way, then the XKCD method is not that great, because humans are terrible at generating items randomly.

If you interpret it the second way, then the XKCD method does in fact give you a password with N times 11 bits of entropy (N = number of words).
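
Under the second interpretation the entropy is just the number of words times log2 of the dictionary size. A minimal sketch follows; the function name is made up, and the 7776-word figure is the size of the standard Diceware list, included only for comparison.

```python
import math


def passphrase_entropy_bits(dictionary_size: int, words: int) -> float:
    """Entropy in bits of `words` words drawn uniformly at random from the dictionary."""
    return words * math.log2(dictionary_size)


print(passphrase_entropy_bits(2048, 4))  # four words from a 2048-word list
print(passphrase_entropy_bits(2048, 5))  # five words from the same list
print(passphrase_entropy_bits(7776, 4))  # four words from a Diceware-sized list
```

The formula only holds when each word is picked uniformly at random; words picked by a human give no such guarantee.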

wumpus March 3, 2014 2:25 PM

On the goodness of the xkcd system.

I find it highly unlikely that anyone who could follow the math on why you should use “correct horse battery staple” would find it has less than 44 bits of entropy (people who heard about it second hand are likely the ones Bruce is tired of dealing with). The two ways to increase it would be to either use more words or use words that aren’t in the top 1000 most common words (hint: if Randall could use it in the “Up Goer Five” comic, don’t use it in your password). Obviously, using either an automatic generator or an actual “dead tree” dictionary and randomly jabbing a word with a pencil will work better, but only in getting ~6 bits of entropy per word.

As far as picking letters from a phrase, that is only marginally better than picking obscure words. A simple webscrape will find them, and I’m sure using google completion will give you a huge list of well used phrases. Your phrase will be in there. No it really will, just like any words will be in existing dictionaries. Obfuscating the phrase is roughly as useful as obfuscating uncommon words. The best advantage I can think of is that you will get a slightly higher level of entropy in places that have artificially low limits on passwords (a good sign that security is so low that you needn’t bother with strong passwords anyway). Otherwise there is no reason to believe that this will give you close to the strength of a few uncorrelated words.

Steve Gibson March 3, 2014 3:47 PM

Passwords are obsolete. The new SQRL authentication system is near bullet-proof.
Just google SQRL, it uses QR codes and no secret is revealed 🙂

Erica March 3, 2014 3:48 PM

The XKCD (and other) approaches can be made much safer by separating parts of the password with punctuation.

correct,horse_battery-staple/

(I’ve used [Comma Dash Hyphen Slash] in the example which is easy to remember as it is the suit order in Bridge [Club Diamonds Hearts Spades]).

chris March 3, 2014 3:55 PM

All of the above comments are leading me back toward the idea that making password cracking expensive on the server side is in the long run more effective than trying to create ever more passwords to remember with ever higher entropy.

If each password is stored with a unique salt and the password hashing function is iterated some large number of times (which can also be stored with the password record) such that it takes something on the order of a second to compute the stored hash of password+salt, then it’s barely noticeable to the user, but if someone steals the password plus salt file they essentially have to brute force every single password. As computers get faster, the server just adds to the number of iterations required to get to the stored hash such that even a fast computer can only do a small number per second.

They’ll still get the easy ones, but it will take them longer – they have to compute them for every password in the file, and at a very limited rate (a few per second per machine working on it). Anything more complicated than the common ones is then reasonably secure.
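
One way to sketch the scheme described above is with PBKDF2 from Python's standard library. This is an illustration under assumptions, not a recommendation of specific parameters: the record layout, the function names, and the default of 200,000 iterations are all invented here, and a real deployment would tune the count to its own hardware.

```python
import hashlib
import hmac
import os


def hash_password(password: str, iterations: int = 200_000) -> dict:
    """Create a stored record: unique random salt, iteration count, derived hash."""
    salt = os.urandom(16)  # unique per password, so identical passwords hash differently
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # The iteration count is stored with the record so it can be raised later.
    return {"salt": salt, "iterations": iterations, "digest": digest}


def verify_password(password: str, record: dict) -> bool:
    """Recompute the hash with the stored salt and iteration count, then compare."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), record["salt"], record["iterations"]
    )
    return hmac.compare_digest(candidate, record["digest"])
```

Because the count lives in each record, the server can increase it for new or re-hashed passwords as hardware gets faster without invalidating older records.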

I’m pretty sure I read all of the above description some time in the past right here…

Clive Robinson March 3, 2014 3:56 PM

@ ALL

In my post above you will see,

    Thus by accident your sentance became of the format,

Which should have a format string on the next line but does not…

The missing format string should be,

{short word}{short word}{short appendage}

@ Moderator / Bruce,

I had a lot of problems with “Preview” and “Submit” but not getting pages this morning (your time) that led to some double posts. I notice from looking through the “last 100 comments” page I appear not to be the only person. Was there some odd problem at the server?

GregW March 3, 2014 4:06 PM

I think there’s a general intimidation factor associated with remembering “hard-to-remember passwords” that is overstated. There’s more fear that one will forget passwords than skill required to remember them (if one cares) and actual risk.

Here’s what’s led me to that assessment.

Due to password expiration policies at work, I end up needing to create a new password every 6-8 weeks.

So I do, and out of an interest in security, I’ve tried to make it as secure as reasonably possible. I create random 9-10 character base85ish passwords based on some offline random number sources that I mostly trust. I write it down and carry the paper with me for a few days.

Because I have to login a few times a day on average across the various systems (not quite SSO here!), within a few days, at most a week, I have memorized the password no matter how weird it is. I don’t discard the paper until I have returned to work from at least one full weekend away and clearly still remember it.

My memory in general is worse than others I know professionally and personally. But I consider passwords a lifetime security skill so I just do it, and it’s easier than I would have guessed before I started doing it.

(I’ve never bought into the password vault approach advocated by Bruce because they just seem like the first thing worth hacking/monitoring if my local system is penetrated, and then the attacker gets all passwords for all sites even rarely visited ones; at least with a keylogger they only get the passwords to sites you login with post-infection. But I do reconsider that view every once in a while.)

That said, it is tricky to manage a dozen passwords with the above scheme; I haven’t gone that far and keep reconsidering PasswordSafe for that reason. I only use my above-described scheme fully for my main work password (and I have a secondary password for work which doesn’t change much which I use for third-party sites I access for work). For all those personal passwords, the less you use the site/credentials, the trickier it is to ever memorize the random-generated ones.

I wonder what mathematical formula might describe the tradeoff between the (Kolmogorov?) complexity of passwords one chooses versus the number of passwords one has to keep track of? Is it linear? I’m not sure I’ve seen that sort of analysis (and how about an equation that gives you the forgetfulness-rate given each of the above variables?) Clearly some coefficients might vary from person-to-person, but I’d imagine there’s something that is true for people as-a-group.

Figureitout March 3, 2014 4:10 PM

Clive Robinson RE: Server
–Right after you posted last night the server was down for maintenance; that could potentially mean you know what…I’ve never seen it down for maintenance before.

the paul March 3, 2014 4:12 PM

No, Bruce gets the XKCD system. The XKCD system is terrible. Diceware is good. Both produce passphrases, but in the XKCD system the user picks the words and tries to make them “random”.

This is not correct.

Although the comic itself doesn’t say explicitly that a high-quality entropy source must be used to make the approach effective, it is extremely likely that Randall intended it to be understood that way. He has a strong background in math and statistics, and does not often make simplistic errors of that sort in published work.

It seems silly to interpret the ambiguous phrase “four common random words” in a way that makes the rest of the comic blatantly wrong, when the other, more semantically precise interpretation makes the point entirely valid.

Anura March 3, 2014 4:16 PM

@Clive

While I think passphrases are superior, unfortunately you have the problem of applications limiting the password length. If the password length is 20 characters you have to settle for something like “ItotTptaomyhn30ybAd” instead of “It turns out that Tesla patented the act of moving your hand nearly 30 years before Apple did.” – the latter being significantly more difficult to guess, but the former still being stronger than “robotics gastronomy”.

Russ March 3, 2014 4:23 PM

Blatant plug warning: after my wife and her friend BOTH got their email accounts hijacked within months of each other, due to using the same (or variations of a single) password everywhere.. I decided to come up with an offline password generator/recall device for non-techies like them. Electronic password safes are great, but many people just don’t want to bother with them.

I made a few batches of rings with different permutations of [letter – number/symbol] pairs. No batteries required, and just like electronic systems one doesn’t have to remember all of one’s distinct passwords; just a single method to dial-in a unique password for each website, company, etc. I find having them on my finger is very convenient and no longer need to worry about remembering any passwords.

One can read off as many rows as required to generate very long, gibberish passwords combined with some secret word (usually a long-dead pet’s name) to end up with lower/uppercase, numbers and symbols satisfying most fussy password policies.

http://russtopialabs.bigcartel.com/

MingoV March 3, 2014 4:42 PM

” He can try guesses as fast as his computer will process them…”

How? If it’s an internet web site password, the user ID and the guessed password have to be entered into log-in fields. It takes at least a tenth of a second for this to happen. Same with the second guess and the third guess. Then the server blocks access, sends an e-mail notice to the user of multiple wrong passwords, and blocks new log-in attempts. Password cracking doesn’t work unless you hack the server first.

If someone’s trying to hack into a secure disk volume on my computer, it’s the same problem. Enter user name and guessed password; get rejected. Do this two or three more times and you’re locked out. You may be able to remove the lock-out by logging off, but you’ll probably have to restart. Which means you need the startup password. And you only get three guesses. To get around the three-try limit, you would have to hack the OS. Even with the hack, testing passwords will take orders of magnitude more time than generating passwords.

Anura March 3, 2014 4:50 PM

@MingoV

You didn’t read the whole paragraph.

The general attack model is what’s known as an offline password-guessing attack. In this scenario, the attacker gets a file of encrypted passwords from somewhere people want to authenticate to. His goal is to turn that encrypted file into unencrypted passwords he can use to authenticate himself. He does this by guessing passwords, and then seeing if they’re correct. He can try guesses as fast as his computer will process them — and he can parallelize the attack — and gets immediate confirmation if he guesses correctly. Yes, there are ways to foil this attack, and that’s why we can still have four-digit PINs on ATM cards, but it’s the correct model for breaking passwords.

bitmonger March 3, 2014 4:58 PM

If anyone’s interested the alt text on the comic reads:

“To anyone who understands information theory and security and is in an infuriating argument with someone who does not (possibly involving mixed case), I sincerely apologize.”

This comic has made conversations I’ve had with people about randomly generated passwords and password policies easier. People don’t trust the math sometimes. Sadly, I think I’ll now also hear from people “that xkcd way was debunked by Schneier”.

Over what appears to be a misunderstanding.

Oh no! March 3, 2014 5:14 PM

Does this mean I have to change

chrisinglismasturbatestestothestolenyahoovideochatsofunderagedgirls?

memory.act March 3, 2014 5:32 PM

@GregW “My memory in general is worse than others I know professionally and personally.”

I fall into the same category in terms of memory skills, but have improved with practice.

I memorize a few key commodity prices each day for practice.

After a year of doing this, it has become easy to memorize up to 17 digits in a few minutes.

Anura March 3, 2014 5:49 PM

This whole thing reminds me to finish writing my paper on a memory-bound password-based key derivation function to try and get published/peer reviewed. The idea is to create a set of principles that algorithms should follow to minimize the possibility of issues, which I don’t think really exists today:

All parameters must be hashed with every iteration, encoded unambiguously, and each iteration should be guaranteed to have a unique input to the hash function e.g. by hashing the counter (otherwise the risk of collisions or early determination that a password is (in)correct is possible).

The length of the password must be padded to a fixed length (provides protection against DOS, Timing attacks)

Memory access should always be predictable without knowing any secrets (prevents timing attacks, which is an issue with scrypt)

Both execution time and memory consumption should be as configurable as possible (using even 16 kB of ram can kill GPU performance for attackers, while generally keeping in L1 cache on CPUs for the server).

The algorithm should be as simple as possible, making cryptanalysis as easy as possible.

Ryan March 3, 2014 5:59 PM

My method is to choose 8 dictionary words and mash them together to form 4 words unlikely to be found in any kind of dictionary. Secure? I’m no expert but from what I can gather I am exceeding 128 bit (30+ characters) in all cases.

Example:
sparkling features highest excavation carriage phoney property annotation

lingfeatu ghecavat rriagney pertynotat

Typing this at least 10-20x will help me remember it without having to make a record.

Steve Witham March 3, 2014 7:14 PM

Bruce misunderstanding Randall’s scheme, and proposing a much less easy to believe in scheme, is sort of shocking coming from Bruce Schneier.

Almost as if Bruce’s account has been broken into… or as if Bruce were being coerced….

I don’t really think so, but the point is this: even heroes are fallible. If you can’t follow the discussion in the comments on this post, then don’t follow any suggestion you happen to read in one blog post, even if it is by Bruce Schneier. Get a second opinion or better yet, learn enough math to follow arguments like these.

Lawrence D’Oliveiro March 3, 2014 7:45 PM

Just a note that password length is more important than choice of allowable characters. To make that concrete:

Choice of 8 characters, uppercase letters only, gives you about 10 ** 11 possible passwords.

Choice of 8 characters from all 95 printable ASCII characters gives you over 10 ** 15 possible passwords.

But choice of 12 characters, uppercase letters only, gives you over 10 ** 16 possible passwords. That’s an order of magnitude improvement over just increasing the number of allowable characters.
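
These counts are easy to check: the number of possible passwords is the alphabet size raised to the password length. A small sketch, with a helper name chosen just for this example:

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct passwords of exactly `length` symbols."""
    return alphabet_size ** length


upper_8 = keyspace(26, 8)    # 8 characters, uppercase letters only
ascii_8 = keyspace(95, 8)    # 8 characters, all 95 printable ASCII characters
upper_12 = keyspace(26, 12)  # 12 characters, uppercase letters only

print(f"{upper_8:.2e}  {ascii_8:.2e}  {upper_12:.2e}")
print(f"12 uppercase vs 8 printable ASCII: {upper_12 / ascii_8:.1f}x")
```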

I personally think that being able to use upper and lower case letters and digits is sufficient.

Harry Johnston March 3, 2014 8:05 PM

The XKCD scheme assumes an online password-guessing attack where the guessing rate is limited. It was never meant to defend against cracking.

(A better criticism is that it doesn’t scale; you can easily remember one XKCD password, maybe a few, but not dozens. And if you don’t have a separate password for each site, cracking becomes an issue.)

Moderator March 3, 2014 8:55 PM

Clive,

The site moved to a new server last night, and then the new server was being very slow to process new comments, which could have led to timeouts. It should be much better now (but we’ll see if I still think so after I click submit on this comment).

Tariq March 3, 2014 9:07 PM

I’m surprised. No one, either Schneier or anyone on this thread, has mentioned Steve Gibson’s method of generating passwords:

https://www.grc.com/haystack.htm

This, combined with @Srix’s suggestion of using one’s multilingualism finally to one’s advantage, one could come up with something like:

kyaaa~~~ Saya suka Sailor Moon! :3

(“kyaaa~~~ I love Sailor Moon! :3”)

Bonus points if you’re a burly 50-year old man who’s more a Tolkien fan anyway. Not that there’s anything wrong with liking shonen.

jim March 3, 2014 9:13 PM

Why not combine a simple phrase in multiple languages? Pick a phrase, then pick 3 languages (french, spanish, latin, klingon, etc.) Only you know the phrase, and the correct sequence of the languages. For example: “drink beer beber cerveza Bier trinken”

A dictionary attack would have to include all the right dictionaries in the right sequence. One weakness in all these attacks is that they are single language dependent.

Clive Robinson March 4, 2014 1:33 AM

@ Lawrence D’Oliveiro,

    Just a note that password length is more important than choice of allowable characters.

If people identify what a “character” is correctly in their system.

I prefer to use “symbol from a set” rather than other terms such as “character from an alphabet” as it helps stop people’s preconceived notions getting in the way when trying to explain about various password issues and various schemes.

To see why, think about this statement from a layperson’s perspective,

    The scheme uses an alphabet of eight characters which are encoded into dictionary words of six alpha characters, four are randomly selected then encoded in UTF-32 in network order for a password transmission length of ninety six bytes

And ask them if it was strong or weak?

As you and I know “if the enemy knows the system” it’s very weak at 12 bits of entropy, but others might conclude it’s very strong at 768 bits of length.
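
Both figures follow directly from the description of the contrived scheme. The sketch below works them out; the constant names are invented, and the values are read off the quoted specification (eight symbols, four chosen, six-letter words, four bytes per character in UTF-32).

```python
import math

ALPHABET_SIZE = 8         # the scheme's real symbol set
SYMBOLS_CHOSEN = 4        # symbols selected at random
CHARS_PER_WORD = 6        # each symbol is written out as a six-letter word
BYTES_PER_UTF32_CHAR = 4  # UTF-32 uses four bytes per character

# What the attacker has to guess if the system is known.
entropy_bits = SYMBOLS_CHOSEN * math.log2(ALPHABET_SIZE)

# What goes over the wire.
wire_bytes = SYMBOLS_CHOSEN * CHARS_PER_WORD * BYTES_PER_UTF32_CHAR
wire_bits = wire_bytes * 8

print(entropy_bits, wire_bytes, wire_bits)
```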

Now I would be the first to admit it’s a contrived example to demonstrate a point, but I’ve seen similar technical descriptions used for real in technical requirements specifications…

If

Figureitout March 4, 2014 1:41 AM

Clive Robinson
If….
–You need to stop using schneier.com as your botnet testing site b/c weird things keep happening to your comments lol.

Clive Robinson March 4, 2014 1:42 AM

@ Moderator,

I did not see a mushroom cloud rising over the horizon from your direction so I’m assuming some improvement was seen.

However on a single post this morning (my end) it was about three to four times slower than in the recent past. However a single post is not a reliable sample size to differentiate server delay from accumulated network delay from a mobile trundling through London rush hour 🙁

Moderator March 4, 2014 3:26 AM

What kind of assurances can you offer “schneier.com” readers about the new servers?

That TLAs will pwn it if they want to?

(Now, let’s see how long this comment takes to post….)

bob March 4, 2014 5:48 AM

On the xkcd theme, on my Mac (should work on most BSD and linux) I build a memorable phrase from:

awk 'BEGIN { srand() } length($1)<9 { if (rand()<=0.0001) print toupper($0) }' /usr/share/dict/words

That’s a list of 88698 words. As I understand it, a dictionary containing every possible five-word combination would contain 88698^5 (88698 × 88698 × 88698 × 88698 × 88698) lines.

Paul-Kenji Cahier Furuya March 4, 2014 6:05 AM

repeat 4 do grep -Ei '^\w{3,8}$' /usr/share/dict/british-english|sed -n $((RANDOM%19197+1))p|tr '\n' ' '; done

How’s that weak? I can not wrap my head around any way 4 random words would be easily bruteforced.

Alan March 4, 2014 6:36 AM

I was going to write:

I’m not sure I follow why you say, “This is why the oft-cited XKCD scheme for generating passwords — string together individual words like “correcthorsebatterystaple” — is no longer good advice. The password crackers are on to this trick.”

I can readily believe that picking four word phrases out of anything written will be picked up by the crackers. But four random words?

However, after trying to figure out the math, (my ignorance is showing), I don’t think entropy has anything to do with the weakness of an “XKCD” password unless it has mixed case and non-alpha characters added in.

If a simple 5000-word dictionary of all lower case words is used, then there are n!/(n-k)! permutations, where k is the number of words in the phrase. With four random words in a passphrase, there are 6.24×10^14 permutations. At 8 million guesses per second, that’s 903 days to exhaust all the permutations. Best guess would be half that time.

At 656 million guesses per second, it is only about 11 days. That’s ok for general use, but not for strong encryption. And at 300 billion guesses per second, that’s less than one hour. So Bruce is right (surprise!), just four random words without adding entropy is not strong enough.

Once you add a few uppercase and one or two non-alpha into the random four words, the permutations sky rocket. I’m not sure how to calculate them, but I think we’re back to entropy calculations and on the order of decades for brute-force guessing.

I would appreciate it if someone with stronger math could verify that.
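
The arithmetic for the all-lowercase case can be checked with a short script. This is a sketch using the figures quoted in this thread: a 5000-word dictionary, four distinct words, and the three guess rates mentioned in earlier comments. The rate labels are descriptive names chosen here.

```python
import math

SECONDS_PER_DAY = 86_400

# Ordered selections of 4 distinct words from 5000: n! / (n - k)!
keyspace = math.perm(5000, 4)

RATES = {
    "8 million/s": 8e6,
    "656 million/s": 656e6,
    "300 billion/s": 300e9,
}

for label, guesses_per_second in RATES.items():
    seconds = keyspace / guesses_per_second
    print(f"{label}: {seconds / SECONDS_PER_DAY:.2f} days ({seconds / 60:.0f} minutes)")
```

These are times to exhaust every permutation; the expected time to find one passphrase is about half of each figure.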

Joe March 4, 2014 6:58 AM

In the above comments, I only see two suggesting that offline password cracking isn’t the problem. The problem appears to be availability of encrypted password files.

Also, how does the offline method know when it has cracked a password – doesn’t the hacker still have to try logging in with every guess (attempts which will be locked out)?

Paul-Kenji Cahier Furuya March 4, 2014 8:52 AM

Alan: you are wrong in assuming a 5k word dictionary though.
It’s closer to a 20k word dictionary on the smallest case.

That’s 1.6e17 possibilities for 4 words. Even at 300e9 per second, that’s 1 week.

This is as likely as cracking a 9 random lower-ascii character password.
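
A rough way to compare a word-based passphrase with a random character string is to convert both to bits. The sketch below assumes a 20,000-word list with repeats allowed, and it tries two readings of "lower-ascii" (26 lowercase letters, or the 95 printable ASCII characters), since the comment does not say which is meant.

```python
import math

words, picks = 20_000, 4
keyspace = words ** picks            # ordered picks, repeats allowed
bits = math.log2(keyspace)

def equivalent_length(bits: float, alphabet_size: int) -> float:
    """Length of a uniformly random string over the alphabet with the same entropy."""
    return bits / math.log2(alphabet_size)

print(f"{keyspace:.2e} passphrases, {bits:.1f} bits")
print(f"days to exhaust at 300e9/s: {keyspace / 300e9 / 86_400:.1f}")
print(f"a-z only: {equivalent_length(bits, 26):.1f} chars")
print(f"95 printable ASCII: {equivalent_length(bits, 95):.1f} chars")
```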

Figureitout March 4, 2014 9:17 AM

Moderator
That TLAs will pwn it if they want to?
–Ok, makes me feel safe. 🙁

CleverBoy March 4, 2014 9:20 AM

Wouldn’t going with the actual sentence be as (if not more) effective and more user friendly at the same time?

In other words – make “Wow, does that couch smell terrible.” your passphrase. It’s easier to type than “Wow…doestcst” because it reflects the way you naturally type the rest of the day. I’ve been coaching my organization to make passwords this way (enforcing length and complexity rules) and it has been well received.

Daniel Taylor March 4, 2014 9:22 AM

The compressed phrase is only necessary when dealing with systems where you can’t use the unmodified phrase.

“This would make a decent password on most modern systems. Catch phrase!”

would be better than “Twmadpoms.Cp!” and provides more opportunities to introduce random entropy in a way that’s easy for the user to remember.

Grey March 4, 2014 11:24 AM

I’ve never heard a rational explanation — or an explanation of any kind — for the assertion that one should ‘change your password regularly’ (assuming of course the extant password is a strong one). Thanks for noting that in Rule #2.

I am listening, however, so if someone has a good reason for doing it, please reply to this comment.

Anura March 4, 2014 11:39 AM

@Grey

The main reason to change passwords is so that if someone does steal your password, and you don’t know about it, then it limits the amount of time they can use it. Whether it is effective is another question, as it can lead to someone choosing weaker passwords. Then again, if you have the idiot who goes with “Password1” for a sensitive system then I’d imagine it could only get better from there (like Password2 or Password3 or even Password4!).

Somebody March 4, 2014 12:47 PM

On changing passwords

Never changing a password is trying to keep secrets from the future.

Once a hashed password has leaked Eve can store it and revisit it in 5 or 10 or 20 years, with the benefit of an additional 5 or 10 or 20 years of Moore’s law, math and psychology.

Many passwords are ephemeral but there will be a few that can still do damage after many years if not changed. The encryption key for a password safe is a particular weak spot, since a break is harmful if any of the passwords it protects is still valid. You need to change not only the master password, but every password that was ever stored in the password safe that could still cause harm.

I won’t argue if you want to call changing (or invalidating) passwords every two to five years “occasionally changing passwords” instead of “regularly changing passwords”.

Brian M. March 4, 2014 1:28 PM

What all of this password problem represents is the security of the host system storing the password or its hash.

Adobe got hacked and somebody copied off their weakly guarded plain text passwords. It doesn’t matter how clever your password is if it’s stored poorly on the host!!

Another thing that hasn’t been mentioned here are rainbow tables. Look at CrackStation, put something into a hash, and try it there. Is that password safe when there are rainbow tables for strings of at least 16 characters? “tlpWENT2m” may not be in anybody’s language dictionary, but the hash for it is already there.

Long ago I used something similar to Bruce’s scheme. You know what makes it a pain? The long phrases necessary to construct a password that isn’t already in the rainbow tables.

The only thing that really makes any sense is multi-factor authentication by multiple devices. You want to log in? A message is sent separately to multiple phone numbers, i.e., land line, cell phone, and pager. Put those messages into a key fob, and its output into the application. But that’s just too much of a pain in the hind end to ask people to put up with it.

I replicate passwords on message boards where I don’t care if my password is nabbed. I use individual passwords for financial sites, etc.

But as far as a perfect password scheme goes, all schemes are worthless when the host stores it all in plain text.

Anura March 4, 2014 2:17 PM

@Brian M.

You are conflating multiple issues. Password storage is a problem, but you have to ignore that when coming up with a password. Yes, rainbow tables are effective, yes plaintext password storage is horrible, but choosing a strong password is good as a habit and it can protect you even if a relatively weak scheme is used. A good, never reused, original, preferably nonsense, passphrase is likely to protect you from sites using unsalted MD5 hashes. A site using salted bcrypt with a high cost is just a bonus.

Also, Adobe’s passwords were not plaintext, they were encrypted with reversible encryption in ECB mode, so while far from ideal, people with good passwords were safe (assuming they didn’t pick a password hint that could give it away, but that’s another matter entirely).

Brian M. March 4, 2014 2:54 PM

@Anura:

I truly wish there was a way to know how a site stores passwords before using the site. One of my credit card companies allows a maximum of eight characters for the password. Eight! And Cisco allows a maximum of 15 characters, and they must be alphanumeric.

Yes, I know Adobe didn’t store passwords in plaintext, but they were weakly stored. Kaspersky: 10 Worst Password Ideas (As Seen In The Adobe Hack). Fancy a crossword puzzle made out of those passwords?

Coming up with a “good” password only means staying a little bit ahead of the current generation of password crackers. Quantum computers cracking a password is irrelevant when ganged graphics cards are doing it now.

Anura March 4, 2014 3:23 PM

@Brian M

Even if you guess a trillion passwords per second to check against a single-hash, a five word diceware password will take about 5.5 months on average, assuming they are focusing on your password specifically AND know you are using a five word diceware password AND know what dictionary you are using. Compare this to your pet’s name and birth year (probably one of 250 names, and 60 years, giving you a massive 14 bits of security (high estimate)).

Password storage methods can significantly slow down password crackers. PBKDF2 with 1024 iterations adds the equivalent of 11 bits of security (HMAC calls the hash twice, log_2(1024*2) = 11), making it 2048 times as long, or 938 years (not adjusting for Moore’s law). We can do much better than that, I’d say 262,144 (what can I say, I like powers of 2) is more than tolerable of a delay for most systems while adding the equivalent of an extra 8 bits of security or 240k years to crack.
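
The stretching arithmetic can be sketched in a few lines of Python. This assumes the figures used in the comment: five Diceware words (7776 per word), a trillion hash evaluations per second, and two hash calls per PBKDF2 iteration. The results land close to the rounded figures quoted, with small differences from rounding 5.4 months up to 5.5.

```python
import math

SECONDS_PER_YEAR = 365.25 * 86_400

def average_crack_years(keyspace: int, hashes_per_second: float, hashes_per_guess: int = 1) -> float:
    """Expected time (half the keyspace) when each guess costs hashes_per_guess hash calls."""
    return keyspace / 2 * hashes_per_guess / hashes_per_second / SECONDS_PER_YEAR

keyspace = 7776 ** 5      # five Diceware words
rate = 1e12               # hash evaluations per second, as assumed in the comment

for iterations in (0, 1024, 262_144):
    # 0 means a single unstretched hash; otherwise two hash calls per iteration
    cost = 2 * iterations if iterations else 1
    years = average_crack_years(keyspace, rate, cost)
    print(f"iterations={iterations}: +{math.log2(cost):.0f} bits, about {years:,.1f} years")
```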

Use a function that is even somewhat memory bound (bcrypt applies), and then the GPU’s advantage dwindles. When/if I finish my paper (and either get it published or give up), I’ll post my algorithm here, hopefully with benchmarks comparing it with single md5, sha1, sha256, sha384, sha512, PBKDF2, bcrypt, scrypt (I have to learn CUDA first).

So yes, we do need to improve password storage, but that doesn’t mean that you don’t gain a significant advantage from a strong password.

Clive Robinson March 4, 2014 5:26 PM

@ Somebody, Brian M, Anura,

Perhaps the second question [1] to ask about passwords is,

    Do they need to be kept on the server?

To which the answer is no…

However we still do and there is a problem you don’t hear talked about much and that’s “storage migration”.

Normally when you have a database, even though encrypted, you know the contents of the records, thus upgrading the database is –oversimply– a matter of reading the records from the old DB and writing them to the new DB.

Not so with a password DB –or it shouldn’t be– because the encryption –in theory– should be “one way”, so whilst you can easily update the physical storage you can’t update the DB system to a more information secure system.

The solution to this problem is somewhat awkward and can be done in three basic ways,

1, Effectively destroy the old account and force the user to start from scratch with a new account.

2, Lock the accounts and make them use the secret question or talk to tech support method of getting sent a new random password.

3, Grab the plaintext password the next time the user logs in using the old DB and write it to the new DB (you can do this transparently or by invoking a modified “change password” feature).

For large online systems option 3 is usually the way to go… BUT it has a hidden issue, how long do you keep the old DB up and running, a month?, a year?, indefinitely?

It’s because of this issue some people have chosen to “encrypt” rather than “one way” hash their password DBs and it’s bitten them due to the way they have done it. That is they have used a “single key” or symmetric key system which has the advantage of allowing the use of crypto accelerators but the downside of having “the key” around all the time. However there is no reason if going the “encrypt” rather than “one way” way to not use PubKey encryption with the private key locked away in a physical safe somewhere.

But the “how long” problem has a secondary issue of having the same plaintext password stored on two “one way” systems the old “weak” one and the new –hopefully– stronger one. And this is where the “KeyMat” / EOL destruction problem raises its head to sink its teeth into your softer parts.

Humans tend to make mistakes one of which is dealing with “old stuff” as some joke “It’s what lofts, sheds and garages are for”. The backups from the old DB will no longer get updated on the “backup cycle” but almost certainly will get kept… There is thus a very real danger that as they approach EOL they will get downgraded from secure storage to “in a box somewhere” storage and even chucked out for “re-cycling” or some such. At which point it might, like those second hand HDs security researchers like to publicise, “get found” by someone who then uses it for some illicit purpose.

[1] The first being “Why are we still using the XXX things?” (Where XXX can be replaced with your favourite “frustration expression” 😉

Anura March 4, 2014 5:54 PM

@Clive Robinson

Option 4:

NewOneWayFunction = BetterOneWayFunction(ExtraSalt, OldOneWayFunction(OldSalt,Pass))

Note that this method reduces collision resistance by an insignificant amount.
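
A minimal Python sketch of this wrapping idea follows. The two concrete functions are assumptions chosen only for illustration (a single salted SHA-1 as the old scheme, PBKDF2-HMAC-SHA256 as the better one), and the iteration count is arbitrary. The point is that the upgrade needs only the stored old hash, never the password.

```python
import hashlib
import hmac
import os

def old_one_way(old_salt: bytes, password: str) -> bytes:
    """Stand-in for the legacy scheme (assumed here: a single salted SHA-1)."""
    return hashlib.sha1(old_salt + password.encode("utf-8")).digest()

def better_one_way(extra_salt: bytes, data: bytes) -> bytes:
    """Stand-in for the stronger scheme (assumed here: PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", data, extra_salt, 100_000)

def wrap_legacy_hash(old_hash: bytes):
    """Upgrade a stored legacy hash in place, without knowing the password."""
    extra_salt = os.urandom(16)
    return extra_salt, better_one_way(extra_salt, old_hash)

def verify(password: str, old_salt: bytes, extra_salt: bytes, stored: bytes) -> bool:
    """Recompute the inner legacy hash, wrap it, and compare in constant time."""
    candidate = better_one_way(extra_salt, old_one_way(old_salt, password))
    return hmac.compare_digest(candidate, stored)

# Usage: the database held (old_salt, legacy); after the upgrade it holds
# (old_salt, extra_salt, stored) and the legacy digest can be discarded.
old_salt = os.urandom(8)
legacy = old_one_way(old_salt, "correct horse")
extra_salt, stored = wrap_legacy_hash(legacy)
print(verify("correct horse", old_salt, extra_salt, stored))
```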

Clive Robinson March 4, 2014 7:02 PM

@ Anura,

What you propose for option 4 is reasonable for a single upgrade… but, what do you do for the fifth or even tenth upgrade?

At some point you have to ditch the old otherwise the new becomes unmaintainable.

Which raises another software industry problem that again does not get talked about as much as it should –especially in industrial control– and that’s “Planned obsolescence”, but we’ll save that topic for another day.

Anura March 4, 2014 7:32 PM

@Clive Robinson

Option 4.5, combine 4, 3, and 2. When you upgrade, set everyone who has a current account to option 4, when they next login, use just the new hash. If they haven’t logged in during the last version, set their hash to a random value to make them go through the reset process if they ever try to log back in, and also to mess with anyone who steals the password database.
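
The combined approach might look roughly like the Python sketch below. Everything named here is hypothetical: the record layout, the choice of SHA-1 as the legacy hash and PBKDF2-HMAC-SHA256 as the new one, and the iteration count. It only illustrates the flow of wrap at upgrade time, re-hash on next login, and randomise the hash of accounts that never came back.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass

ITERATIONS = 100_000  # illustrative cost, not a recommendation

def legacy_hash(salt: bytes, password: str) -> bytes:
    return hashlib.sha1(salt + password.encode("utf-8")).digest()

def strong_hash(salt: bytes, data: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", data, salt, ITERATIONS)

@dataclass
class Record:
    scheme: str             # "wrapped" (option 4) or "new"
    salt: bytes             # salt for strong_hash
    digest: bytes
    old_salt: bytes = b""   # only needed while scheme == "wrapped"

def upgrade_stored(old_salt: bytes, old_digest: bytes) -> Record:
    """At upgrade time: wrap the legacy digest; no password needed."""
    salt = os.urandom(16)
    return Record("wrapped", salt, strong_hash(salt, old_digest), old_salt)

def login(rec: Record, password: str) -> bool:
    """Verify, and on success move a wrapped record to the new hash alone."""
    pw = password.encode("utf-8")
    inner = legacy_hash(rec.old_salt, password) if rec.scheme == "wrapped" else pw
    if not hmac.compare_digest(strong_hash(rec.salt, inner), rec.digest):
        return False
    if rec.scheme == "wrapped":
        rec.salt = os.urandom(16)
        rec.digest = strong_hash(rec.salt, pw)
        rec.scheme, rec.old_salt = "new", b""
    return True

def expire(rec: Record) -> None:
    """Account not seen during the last version: random digest forces a reset."""
    rec.digest = os.urandom(len(rec.digest))
```

With this layout, at most one layer of legacy hashing is ever live, which addresses the "fifth or tenth upgrade" concern for accounts that keep logging in.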

Kevin W. Wall March 4, 2014 8:36 PM

Bruce, your “Schneier scheme” is somewhat of a superset of simply picking a sufficiently long sentence and selecting mnemonic phrases by combining the first character of each word in your chosen sentence (and maybe mapping a few characters to numbers or symbols, as required by the specific password policy). Ross Anderson and his colleagues Jianxin Yan, Alan Blackwell, and Alasdair Grant wrote a paper about that method way back in 2000. (See their technical report “The memorability and security of passwords and some empirical results” for details.) They did some empirical measurements that showed that such passwords were as hard to crack as randomly constructed ones.

Buck March 4, 2014 11:04 PM

@Clive

Thank you so much for that!
Kinda sounds like one of those lessons that may have been most oft learned through experience…
You may have inadvertently ended up saving me significant headaches in the near-future (and not a moment too soon ;-)!
It’s now seeming pretty silly to me – the thought of “future proofing” encrypted hashes – by jumping up to say 1024, 2048, or even 4096-bits…
I know you’ve been beatin’ down Moore’s law lately, but what about its successor? Corporate/Nation State level resources feed into massive parallelization??

If I’m gonna have to rehash all of my users’ passwords when my encryption scheme has been publicly broken, I’m gonna have a major security problem while recapturing legacy users’ secrets :-\

So rather than using a quick hash, it seems as though the longer the better… Assuming of course it’s not eventually proven more easily crippled than some of its quicker peers…

I also hear many talk about reapplying hash functions… Seems like without a significant salt, one would effectively be reducing their possible alphabet..? Are there any hash functions specifically designed for multiple iterations?

Buck March 4, 2014 11:08 PM

I suppose that’s probably why two-factor authentication is all the rage right now 😛 Sure beats a hell of a lot of phone calls from angry customers!

Nevermind the fake cellphone base stations and securid breach behind the curtain…

National Insecurity Agency March 4, 2014 11:13 PM

When you make containers manually with cryptsetup in Linux or when using KeePass jack up the iterations. Test them to see what your system can handle. The more there are, the harder to brute force.

On mine:
sudo cryptsetup --cipher aes-xts-plain64 --key-size 512 --hash sha512 --iter-time 5000 luksFormat /dev/loop1 (creates 145,000 iterations)

This takes about a full second to open on my system. I have KeePass set with 100,000 iterations.
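
To get a feel for what a given machine can handle, a small timing probe helps. The Python sketch below is not how cryptsetup or KeePass benchmark internally; it just times the standard library's PBKDF2-HMAC-SHA512 on a fixed probe count and scales linearly to a target delay. The function name and probe size are made up for the example.

```python
import hashlib
import time

def iterations_for(target_seconds: float, probe: int = 50_000) -> int:
    """Estimate how many PBKDF2-HMAC-SHA512 iterations fit in target_seconds here."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha512", b"benchmark passphrase", b"benchmark salt", probe)
    elapsed = time.perf_counter() - start
    # PBKDF2 cost is linear in the iteration count, so scale the probe result
    return max(1, int(probe * target_seconds / elapsed))

print(iterations_for(1.0))
```

The number varies between runs and machines, so treat it as a rough guide rather than a setting to copy.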

azrielle March 5, 2014 6:50 AM

Assuming you can use symbols in your password, is there a way to insert alternate international characters using +0225(embedded) ” á “, for example, in the password?
Or ¿, etc.

azrielle March 5, 2014 6:53 AM

it was supposed to read [alt] + [Fn] 0225 (embedded) on a laptop, or [alt] 0225 (number keypad). The ¿ is 0191.

John March 5, 2014 9:14 AM

Not sure if it actually helps, but I also use KeePass to generate a) logins made of random characters; and b) Gibberish answers to the “security questions.” I hate the security questions in particular since they’re asking for more personal information (high school, grandparents) to “protect” the info they should already have secured better. They’re like an admission they’re not doing their primary task well.

Larry March 5, 2014 2:23 PM

I think your calculation of entropy is not correct.
correct horse battery staple has a higher entropy than 44 because an attacker does not know if you’re using uppercase, lowercase, numbers or whatever.

So, combining XKCD’s and Bruce’s algorithms with Jim’s multiple language idea will produce nice passphrases:
cor$ect_PFERD~bat5erry.stapel

Larry March 5, 2014 3:02 PM

Another weakness of the password system I’ve long thought about is that the password is being ‘submitted’ as a field.
Just imagine, if every keystroke were part of the password, I could create a password like
123[backspace]4…

The same is valid for smartphone passwords, either they’re ‘1234’ on iPhones or some kind of spirals which is far too easy to oversee. 1-2-3-4-[backspace]-5-#-[backspace]-[backspace]-6 should be much better

Anura March 5, 2014 4:30 PM

@Larry

With upper case, lower case, numbers, and basic special characters you have 85 printable characters (including space, but not tab) on a standard US keyboard, which is approximately 6.409 bits per character of entropy for a random password, so with 20 random characters you have the equivalent of a 128-bit key.

If you included backspace, it would only add .017 bits of entropy per character, which isn’t much; for a 20 character random password, it adds about a third of a bit of entropy. Plus, it would be very annoying if you accidentally type the wrong key.
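
These per-character figures are just base-2 logarithms of the alphabet size. A short Python sketch, taking the 85-character alphabet from the comment as given:

```python
import math

def bits_per_symbol(alphabet_size: int) -> float:
    """Entropy of one uniformly random symbol from the alphabet."""
    return math.log2(alphabet_size)

base = bits_per_symbol(85)            # alphabet size assumed in the comment
with_backspace = bits_per_symbol(86)  # one extra "symbol"
length = 20

print(f"{base:.3f} bits/char, {base * length:.1f} bits for {length} chars")
print(f"backspace adds {with_backspace - base:.3f} bits/char, "
      f"{(with_backspace - base) * length:.2f} bits over {length} chars")
```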

The thing is, a lot of applications don’t even allow all special characters, or 20 characters, making lack of a backspace the least of your problem.

Larry March 6, 2014 10:54 AM

@Anura

I fear I didn’t express myself very clearly. What I had in mind is not the backspace as another character of the password, but the pattern of the password-entry.

Say your final password is 123. but you enter it like 124 – oh no sh*t delete the 4 and write a 3.

Or measure the time between the entry of characters: ‘1’ er, wait, oh yes , ‘2’, then 4, oh no … which makes quite a unique pattern for each user and will be more difficult to observe by someone looking over your shoulder.

But I agree, applications don’t allow or recognize such things for now. They see only the final ‘123’ that you ‘enter’.

AC March 7, 2014 3:42 AM

I wonder why the most of discussion is about web site passwords. Security in those cases is less relevant since:

  1. The provider has all your private data anyway in plaintext, and it can leak, sell and give it to advertisers or intelligence agencies
  2. No matter how secure your password is, someone can hack the system and get/release all the information (such as credit card numbers)
  3. Number of login attempts is usually limited, therefore brute force approach isn’t practical in most cases

A much more important and interesting use case is choosing a password for personal private information, such as a PGP key, password safe, or Truecrypt container. In this case the attacker can easily launch offline brute force attacks. 50-60 bits of entropy is trivial to break even for hobbyists (64-bit RC5 was broken by Distributed.net in 2002). 100 bits is the bare minimum for decent security and >=128 bits is recommended. Better be safe than sorry.

XKCD example is bad one because it doesn’t really use random words and therefore gives a false sense of security. For example a “horse” is a common animal. How many common animals we have? cat, dog, mouse, rat, chicken, pig, horse, cow, fox, bear, wolf, etc. definitely not 11-15 bits of entropy. Same applies to other common words such as “correct” and “battery”. This is also the reason why most of password strength calculators can’t be trusted, they don’t take into account the popularity of words, prefixes and suffixes

If you really want to follow XKCD scheme you should select some words randomly and then use them (NO exceptions, NO cherry picking), but then memorizing the passphrase will be much more difficult. For example: “reflectivecrisplyblackishwhollyprayershora” or “volitionmisspentunsettlingdenimexaggeratorveld”. These are taken from list of 40k words, therefore they have only 6*15 = 90 bits of entropy.
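
Here is a minimal Python sketch of "select randomly, no cherry picking". It assumes words are drawn uniformly and independently with the operating system's CSPRNG, and it uses a made-up placeholder list in place of a real 40k-word file. The entropy figure only holds if the first result is kept as-is.

```python
import math
import secrets

def passphrase(wordlist, count):
    """Pick count words uniformly at random, with replacement, using the OS CSPRNG."""
    return "".join(secrets.choice(wordlist) for _ in range(count))

def entropy_bits(list_size: int, count: int) -> float:
    """Entropy when the attacker knows the list and the word count."""
    return count * math.log2(list_size)

# Hypothetical stand-in list; a real one would be loaded from a word file.
demo_words = [f"word{i}" for i in range(40_000)]
print(passphrase(demo_words, 6))
print(f"{entropy_bits(len(demo_words), 6):.1f} bits")
```

Re-rolling until the phrase "looks memorable" quietly shrinks the effective list, which is the cherry picking warned about above.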

Harald K March 7, 2014 9:08 AM

Alan Kaminsky, there’s no doubt that when Randall said random, he meant random. He estimated 2^11 options for each word, that’s a word selected randomly from a list of 2048 words.

Schneier’s scheme, however, DOES bring personal decisions, and our personal ability to produce randomness into play. For one thing, initial letters are not randomly distributed. For another, a cracker can easily make big lists of initialisms from e.g all bible verses, all lines in the constitution, all sentences in Lord of the Rings and all sentences in Frank Herbert’s Dune etc. Were your phrase in one of those? You’re out of luck.

A targeting attacker can do even better, he can pick texts the target is likely to have lying on his desk. So, for Bruce Schneier, you could compile initialism lists from his books, or cryptographic papers.

With Randall’s scheme, you know exactly how much security you have in the form of password unpredictability – for exactly the same reason as diceware. (Indeed, it IS diceware). With Schneier’s approach, cross your fingers and hope you’re as unpredictable as you think you are.

Scott "SFITCS" Ferguson March 7, 2014 9:21 PM

@Harald K
With Randall’s scheme, you know exactly how much security you have in the form of password unpredictability – for exactly the same reason as diceware. (Indeed, it IS diceware).

Cryptography is not a synonym for mathematics.
https://news.ycombinator.com/item?id=6916860
http://www.thoughtcrime.org/blog/telegram-crypto-challenge/

No – you don’t “know” how much security you have. You only imagine you do because you imagine the parameters of an attack.

Consider that in a large number of instances you can safely predict that a long password/passphrase is going to be composed of words (dictionary attack) because the user has made the mistake of “needing” a memorable password/passphrase. With a little information about the target those entropy numbers become irrelevant.
This is because they followed poor advice (see Bruce’s writings on “trust”), and failed to use a password manager. It also indicates a high probability they reuse passwords and also other common failings (fail to change passwords, central points of weakness like “email my recovery password here where I have lower security”).

Security is hard. Good security is extremely hard (OpSec is not intuitive). You only need one low-hanging fruit when all your passwords are linked to lower your highest level of security password (see email password recovery).

Randall makes some good points (though I suspect his entropy calculation is out by a factor of 2). A long mixture of random words is harder to brute force than a shorter string of random characters – so what? That’s not comparing apples to apples – the brute force difficulty of attacking equal length strings of random characters to random words is equal – but not when using tailored dictionary attacks.

Apropos of little – Microsoft’s advice is to use a sentence (vulnerable to a different sort of attack that’s much shorter than a simple dictionary attack).

A failing with both Randall’s and Microsoft’s scheme is “monkey see monkey do” i.e. battery horse staple (I’m going from memory) and My dog sandy are now often used as passwords – and they both form the basis of tailored attacks for low-hanging fruit. e.g. My$pet$name
We are not the unique thinkers we believe we are.

A problem caused by the bias associated with the “awe of large numbers” is failing to account for the context. Many passwords will be used in scenarios where entropy doesn’t come into play – limited password attempts. Protection against low-hanging fruit attacks should be a primary consideration. Random characters defeat pattern seeking attacks. Using a password manager removes the liability inherent in the need for memorable passwords.

Entropy is a measure of exhaustion. 50% of brute force attacks will succeed at less than 50% of entropy.

If you have 10 passwords then the weakest one is your measure of security – and an attacker will be happy with considerably less than 100% success rate (there’s more than one way to access your data).

I didn’t check Randall’s math to see whether he allowed for the same word, or variations of it. A good low-hanging fruit attack would – but would try repetitions or variations last (or use a molecular sieve approach e.g. first try most likely, second try most unlikely etc)

As for seeing Upper case as reducing difficulty compared to all lower-case… huh?

Did you consider that Randall’s example may have used the names of things associated with his desk?

Nime March 8, 2014 8:38 AM

It’s called paranoia. 8 chars random/notword string is nearly impossible to crack online. Make it 10 chars and you are safe. Make it 20 for offline passwords.

Steve March 10, 2014 2:20 PM

@AC

XKCD example is bad one because it doesn’t really use random words and therefore gives a false sense of security. For example a “horse” is a common animal. How many common animals we have? cat, dog, mouse, rat, chicken, pig, horse, cow, fox, bear, wolf, etc. definitely not 11-15 bits of entropy. Same applies to other common words such as “correct” and “battery”. This is also the reason why most of password strength calculators can’t be trusted, they don’t take into account the popularity of words, prefixes and suffixes

I’m seeing a lot of comments like this. A lot of people are completely missing the point of the XKCD comic. Randall didn’t pick “correct horse battery staple” off the top of his head. Each word was randomly picked by a computer from a list of relatively common English words and then he came up with the funny thought bubble in order to memorize it after it was picked.

Also its strength of 44 bits of entropy is assuming that the attacker knows exactly how it was picked. You could hand them the list of words you chose from and tell them you picked 4 words, and it would represent 44 bits of entropy. If the attacker knows less than that about how you chose your password, then it will be even harder. The 44 bits of entropy is a lower bound, it’s the conservative estimate of how good your password is.

For everyone who has their own pet method for picking passwords (“append the site name to the password”, “add some word and pass it through md5|sha1”), just think of that description as being part of your password… Imagine you could compile a list of all those little pet permutations that people make to their passwords to make them “more secure”. How many methods are there total? 100? 1000? A cracker could just go about his normal attack and also pass each “regular” password through each of the “pet obscuration methods”. It adds only a few bits of entropy. The XKCD method, which is basically the same as diceware, says “I’m going to go ahead and give up those few extra bits of description-entropy in exchange for actual hard security of the password itself.”

Clive Robinson March 10, 2014 4:08 PM

@ Steve,

There is a problem with the XKCD method as described by most people here and that is the assumption that each word in the list has equal probability of being picked by the attacker.

Let’s look at it this way, if four people randomly select a two thousand word list from the same six thousand word dictionary the overall probability is not going to be uniform, some words are going to appear more frequently than others. Due to what is called the “pigeon hole effect” you have eight thousand “pigeons” trying to roost in only six thousand pigeon holes which means that as a minimum two thousand and one pigeons will be sharing a pigeon hole.

Thus if your word list is randomly picked then some will be of the higher frequency words and some of the lower frequency.

You then have to consider not just what the probability is for your “urn pick” -v- the attackers weighted word pick based on the overall probability they are aware of, but the urn picks for all the users on the systems you use (remember the attacker only needs to get one match to gain access).

Peter March 11, 2014 12:36 PM

I generally make a pattern on the keyboard which has no real meaning and throw in a few shifted keys and symbols. Then I don’t have to remember a password, just a pattern.
I usually write it down anyway.

Steve March 13, 2014 1:05 PM

@Clive

This is one of the weirdest things I’ve ever read.

There is a problem … the assumption that each word in the list has equal probability of being picked by the attacker

The point is, if I pick my words randomly then I don’t need to care how the attacker picks theirs. In fact, if I pick my words truly randomly, and the attacker picks words with some pattern, then I’ll actually be safer. The attacker will spend a lot of time trying their patterns, when in fact my choices have been spread evenly across all possible combinations.

if four people randomly select a two thousand word list from the same six thousand word dictionary

I don’t know why each person is picking two thousand words, the password example is to pick 4 words. Also, I want to be clear here, the people aren’t picking the words, a random algorithm is.

the overall probability is not going to be uniform

That’s not true, the probability of a word appearing in anyone’s list is exactly uniform. And the probability that any specific word appears 2, 3, or 4 times is exactly the same for every word.

some words are going to appear more frequently than others

Yes, but that doesn’t mean the probability of each one appearing is different. You may be confusing the actual outcomes with probability. That would be akin to saying if I flip a coin 10 times and it comes up heads 6 times, then the actual probability of heads is 60%.

Thus if your word list is randomly picked then some will be of the higher frequency words and some of the lower frequency.

Yes, but that doesn’t really matter. There’s no way to predict which words will be picked more often. It’s not like if the attacker performed this experiment lots of times and it turned out in his test that ‘horse’ came up more frequently, then ‘horse’ would be a good word to try in everyone’s password.

remember the attacker only needs to get one match to gain access

I hope you’re not designing a security system where to gain access to my account, you only need to guess one of 2000 passwords I chose for my account.

Please read about diceware and especially the faq. I don’t know what else to say. But I’ll just leave this here. I chose 4 random words from the diceware list, and the sha1 hash of them concatenated is:

c840a7b4c41bae91c50c138babd96af1ba5a9973

The exact way I generated this was on a Mac with:

gsort -R diceware.wordlist | head -n 4 | cut -f2 | tr -d '\n' | tee password | shasum

I’m running gsort (so I get the gnu version from homebrew instead of the system sort which is missing -R) with a random sort, selecting the first 4 words, stripping off the number in the first column, removing the newlines to concatenate the words together, saving it to the file ‘password’, and running sha1 on it.

So, not only am I telling you exactly the word list I’m using, exactly how many words I chose, and exactly how I’m putting them together… But, I’m also telling you that I did this on a computer, which the diceware faq warns against because it could weaken the random selection. And I’m also using less than their recommended number of words.

So if this general method is not a good one, and all the cards are stacked in your favor, it should be pretty easy to find out my password. Just go ahead and use your pigeon hole principle to crack it.

From my analysis, there are 7776^4 (3.66*10^15) different passwords I could have generated. At Bruce’s 8 million password tries a second, it would take 5290 days to try all combinations. And all I have to do is add a single word to increase the time by another 7776 fold.
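
For readers without GNU sort, roughly the same steps can be sketched in Python. This assumes the word list has one "number<TAB>word" entry per line, as the `cut -f2` step implies. The five sample lines are a tiny stand-in, not the real 7776-entry file, and the helper names are invented for the example.

```python
import hashlib
import secrets

def parse_wordlist(lines):
    """Keep the second tab-separated field of each 'number<TAB>word' line."""
    return [line.split("\t")[1].strip() for line in lines if "\t" in line]

def make_password(words, count=4):
    """Concatenate count distinct words chosen at random (like sort -R | head)."""
    return "".join(secrets.SystemRandom().sample(words, count))

# Tiny hypothetical excerpt in the assumed format.
sample_lines = ["11111\ta", "11112\ta&p", "11113\ta's", "11114\taa", "11115\taaa"]
words = parse_wordlist(sample_lines)
password = make_password(words)
print(password, hashlib.sha1(password.encode("utf-8")).hexdigest())

# Keyspace as counted in the comment; picking distinct words is marginally smaller.
keyspace = 7776 ** 4
print(f"{keyspace:.3g} combinations, {keyspace / 8e6 / 86_400:.0f} days at 8 million/s")
```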

Clive Robinson March 14, 2014 7:14 AM

@ Steve,

The XKCD process as outlined here,

1, Select a public dictionary ie Pocket Oxford English Dictionary (POED)
2, Build a private word list (PWL) from POED
3, Build password (PW) by selecting words from the PWL

The specifics for 4words and 44bits gives a PWL size of 11bits or 2048 words. The selection process for the four words is not well specified here; some assume “human random” selection, others some True Random physical process such as dice. What is not discussed is “word ordering” (with four words that gives 24 possibilities for the same random selection which if allowed would reduce the entropy by just under 5bits under certain assumptions).

Q1 :- Is that your understanding of the password building process discussed here?

This process or one similar (diceware) is assumed by many ICT pundits to be the best for all users, and they recommend it as such. Thus, if followed by users, all passwords would be selected from a subset of the POED or equivalent.

Q2 :- Do you accept that the use of one (or possibly two) public dictionaries like the POED would be the result for large demographics?

Now to the other side of the problem: password “attacks” are very rarely pure “brute force”; they are modelled on the various ways humans (are thought to) select their “memorable” passwords. Importantly, they are not based on attacking one password in isolation but on attacking many to find one that will give access, or an “easy” percentage, and thus the attacks run effectively in parallel for hundreds if not millions of users. The way this is done is largely, and importantly, based on having found and analysed many previous valid passwords, often released by crackers from large low-value targets where system security has not been a premium consideration. These plaintext passwords are used to build word lists, usually ordered by frequency of use, which are then combined with the patterns recognised from analysis of the plaintext to synthesize probable guesses.

Q3 :- Would you agree with this based on the information given in this thread and the sources mentioned?

Now I don’t have an electronic published dictionary handy, but I do have a printed one to hand. An examination shows around six hundred pages with between fifteen and thirty words a page, but on average only about a third of the words per page fall into the common usage / six letter / easily spelled / easily remembered category suitable for building a PWL. Because they are in “common usage”, these words will appear in most similar published dictionaries like the POED. Which is why I gave six thousand words as the pool from which a PWL of around two thousand words would be selected.

Q4 :- Do you understand where I got my numbers from now?

Importantly whilst the four words selected from the PWL are equiprobable and the words selected for the PWL from the POED are equiprobable this only holds true for the single instance of a PWL.

As I indicated, when more than one PWL is in use the distribution of the four words in passwords is no longer equiprobable. There are various reasons for this, but the “pigeon hole” example is generally the simplest to see. Another is to understand why adding four or more equiprobable independent dice throws together and normalising them changes the distribution from the flat equiprobable to the bell-shaped normal distribution.

The result will be, over a population of people using the POED and their own PWL, that some words will occur more frequently than others, and this will be reflected in the four words selected for the password.

Password attacks are directed at a “population” of password users, not individuals, with the attackers using word lists ordered by the frequency with which the words have appeared in the population in the past. Where the attackers’ aim is just to find any password in the population, the bell curve probability of word usage in the population, not the flat distribution of a single user, aids them in their task.

Q5 :- Do you see the issue now?

Further above I mentioned that users might re-order the four words selected to make them more easily remembered,

Q6 :- Can you see why this would on its own reduce the password entropy?

Q7 :- Can you also see why a user when presented with an unmemorable collection of four words might well push the button repeatedly until they get a set they like?

Q8 :- Can you also see the further entropy reduction effect both of these would have due to the way it would further distort the population probabilities in favour of the attackers?

If not, write yourself a script or three to simulate a population randomly and independently picking words for their PWLs from a dictionary –the size of which has no common factors with the PWL size– and plot the distribution change of the word frequency in the population.
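A minimal sketch of such a simulation follows. It is scaled down so it runs in about a second (a 601-word dictionary and 256-word PWLs, sizes with no common factor), it assumes every pick is uniform and independent, and it uses a seeded `random.Random` for repeatability only. The function name and parameters are invented for illustration; it tallies counts and leaves the plotting to the reader.

```python
import random
from collections import Counter


def simulate_population(dictionary_size, pwl_size, users,
                        words_per_password=4, seed=0):
    """Tally how often each dictionary word ends up in someone's password."""
    rng = random.Random(seed)  # seeded for repeatability, not for real passwords
    dictionary = range(dictionary_size)
    counts = Counter()
    for _ in range(users):
        pwl = rng.sample(dictionary, pwl_size)   # this user's private word list
        counts.update(rng.choices(pwl, k=words_per_password))
    return counts


USERS = 5_000
counts = simulate_population(dictionary_size=601, pwl_size=256, users=USERS)
expected = USERS * 4 / 601
print("expected uses per word:", round(expected, 2))
print("most used:", counts.most_common(3))
print("least used:", counts.most_common()[-3:])
```

With uniform, independent picks the expected count is the same for every word, so the tally is best read against that baseline when judging whether any word stands out.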

Chris March 16, 2014 5:12 AM

Everyone has at least 50 passwords.

This article and all its suggestions are worthless, because you can never come up with any scheme that can safely store that much required entropy in your brain.

Steve March 18, 2014 3:17 PM

@Clive

The XKCD process as outlined here,

  1. Select a public dictionary ie Pocket Oxford English Dictionary (POED)
  2. Build a private word list (PWL) from POED
  3. Build password (PW) by selecting words from the PWL

The specifics for 4 words and 44 bits give a PWL size of 11 bits, or 2048 words. The selection process for the four words is not well specified here: some assume “human random” selection, others some true random physical process such as dice. What is not discussed is “word ordering” (with four words that gives 24 possibilities for the same random selection, which if allowed would reduce the entropy by just under 5 bits under certain assumptions).

Q1 :- Is that your understanding of the password building process discussed here?

Absolutely not. The XKCD process is like lightweight diceware. I agree that it’s “select 4 random words from a 2048-word list” (diceware uses a 7776-word list). But the list doesn’t have to be a personal list; it can be completely public (like it is in the diceware example). The word picking is truly random, and word order within the final password definitely matters.

The strength in these systems is there even when everyone uses the exact same PWL (as you put it).

Q2 :- Do you accept that the use of one (or possibly two) public dictionaries like the POED would be the result for large demographics?

I don’t understand this question.

Now to the other side of the problem: password “attacks” are very rarely pure “brute force”; they are modelled on the various ways humans (are thought to) select their “memorable” passwords.

Which is exactly why XKCD and diceware don’t generate passwords in human selecting ways. So that attack vector is irrelevant. In fact, if my attacker is spending their time trying common passwords like ‘p@ssw0rd’, or common dates/colors/pet names, then I’m even safer with diceware.

Q3 :- Would you agree with this based on the information given in this thread and the sources mentioned?

No.

Now I don’t have an electronic published dictionary handy

Yes you do: http://world.std.com/~reinhold/diceware.wordlist.asc

Use this as your, and everyone’s ‘PWL’.

Q4 :- Do you understand where I got my numbers from now?

No.

Importantly whilst the four words selected from the PWL are equiprobable and the words selected for the PWL from the POED are equiprobable this only holds true for the single instance of a PWL.
As I indicated, when more than one PWL is in use the distribution of the four words in passwords is no longer equiprobable

Not only do I disagree with your general approach (first selecting 2048 words from a 6000-word source, then selecting 4 words from those 2048; I’d say, just select your 4 words from the original 6000), but I also don’t agree with your conclusions about your approach.

Let’s use your approach and use the worst case scenario for our password choosers. We start with a 6000 word dictionary. Person A picks 2048 words for their PWL. Person B independently picks 2048 words as well. Despite insane odds, they select the exact same 2048 words. Their PWLs have 100% overlap, not just a few words with slightly higher than average expectancy. Now they each select their 4 words for their passwords, but this time the words are actually independently selected, and aren’t necessarily exactly the same. This scenario is basically the same as the XKCD example. If they had selected their 4 words from separate lists (and the attacker wasn’t aware of the exact lists), then the passwords would be even stronger. The XKCD strength is a lower bound, it assumes the attacker knows your exact PWL and how many words you used.

there are various reasons for this, but the “pigeon hole” example is generally the simplest to see.

The pigeon hole principle doesn’t favor specific pigeons. If you and I independently pick 100 words from a list of 150, then we’re guaranteed to have at least 50 common words in our lists. But this knowledge doesn’t help an attacker. There’s no way for him to know which words are more likely to appear in our lists unless we publish our lists. The diceware/XKCD approach throws out the idea of keeping the source lists secret, or of having everyone choose a separate personal word list.
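The guaranteed overlap follows from 100 + 100 - 150 = 50, and a quick sketch can confirm it empirically. It assumes both lists are uniform random samples without repetition; `overlap` is a made-up helper, and the seeds exist only to make the run repeatable.

```python
import random


def overlap(list_size, picks, seed):
    """Number of words two independent pickers have in common."""
    rng = random.Random(seed)
    a = set(rng.sample(range(list_size), picks))
    b = set(rng.sample(range(list_size), picks))
    return len(a & b)


guaranteed = max(0, 100 + 100 - 150)   # pigeonhole lower bound
observed = [overlap(150, 100, seed) for seed in range(1000)]
print("guaranteed minimum:", guaranteed)
print("smallest seen:", min(observed))
print("average seen:", sum(observed) / len(observed))
```

The bound says how many words are shared, not which ones; by symmetry every word in the source list is equally likely to be among them.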

If not, write yourself a script or three to simulate a population randomly and independently picking words for their PWLs from a dictionary –the size of which has no common factors with the PWL size– and plot the distribution change of the word frequency in the population.

I don’t think you’re ever going to get it. Or you’re just staying willfully ignorant. Please write your own script that takes your approach for several individuals, then hash their final password selections and save them to a file. Throw away their PWLs. Then, using your supposed knowledge of the more likely occurring words, try to crack the passwords quicker than what a normal distribution would suggest.

Hah April 15, 2014 9:00 PM

The XKCD comic’s title text says “To anyone who understands information theory and security and is in an infuriating argument with someone who does not (possibly involving mixed case), I sincerely apologize.”

I’m surprised that Bruce Schneier is one of the people that doesn’t understand information theory.

Wael April 17, 2014 2:53 PM

@Bruce Schneier,

Don’t bother updating your password regularly. Sites that require 90-day — or whatever — password upgrades do more harm than good. Unless you think your password might be compromised, don’t change it.

In light of the latest Heartbleed revelation, do you still uphold this posture? What if there are other “Heartbleed” type issues that remain unknown to us?

outofthebox May 2, 2014 2:16 AM

simple way to increase entropy with xkcd – spell one or more of the words in the string backwards, easy to remember…

outofthebox May 2, 2014 2:46 AM

make a mental note to self – 2nd word is always backwards, the three word string – troutmaskreplica = troutksamreplica the password cracker would have to try each word in the string forwards and backwards, but the user will only have to remember three words.

Martin Seeger May 3, 2014 11:05 AM

I rather disagree with the assessment.

For the complexity to matter, the service provider has to have blundered in one of the worst possible ways (by losing the data) but at the same time chosen a halfway decent password hashing method.

Is this case really worth the effort to choose a really complex password? In the case the blunder has happened, you need to change your password anyway.

Using less complex passwords raises the probability that the user uses a different password per site, which does more to improve the security than the most complex password of all times. It mitigates the worst risk of a “weak” password.

By requiring the user to choose complex passwords, we are trying to shift the responsibility/blame to the user. It does not belong there.

An XKCD-like password is safe enough for 99+% of all use. With about 40 bits of entropy it is quite good even from the complexity side. It does not matter where the entropy comes from as long as it is there.

As security professionals we are always looking too much at the purely technical side and ignoring basic human psychology. The basic weakness is that evolution did not design us to memorize hundreds of 100-bit-entropy passwords. And we should not try to fix the mind but keep the responsibility where it belongs: at the place where the (hashed) passwords are stored.

The user has two responsibilities:

  1. Select a distinct password per site and memorize it
  2. Choose passwords that a third person will not associate with him/her

The rest has to be taken care of by the people running the system.

Martin

P.S. I elaborate in more detail here.

crypto-ninja May 25, 2014 9:23 AM

People just do not listen to good advice, as always
As Bruce said at the end of this post, the best thing is to use a password manager like Password Safe or Keepass (my favorite).
So you have to remember only ONE good password for that 🙂

There you can save in a secure way your random gibberish passwords like
&s_[!TnZhc_w.a”-lbFPts7F~DCir!vKi,_^h.V6&F;7Q`6o/dmrpB’~!)9~WP

Some services have a maximum password length, then just shorten the generated password to the maximum length allowed on that website or so
For each account another unique password

I have over 200 passwords saved, each is unique. When one of the webservices is breached, I only have to change that password, my other accounts are not affected. This happens even to BIG services, like Ebay right now !!
The keepass-database is encrypted and I back it up also on external media.

After all that media coverage about password-breaches and so, people should get rid of their “lazy-dog-password-style” which leads to such “smart” passwords like “letmein123” or “cr@zym0mB4” (which is not much better!!)

John May 27, 2014 12:48 AM

“He can try guesses as fast as his computer will process them — and he can parallelize the attack — and gets immediate confirmation if he guesses correctly.” – Please explain me this sentence to like a 5-year-old.

Wael May 27, 2014 1:46 AM

@ John,

1- Attacker got a hold of a password file, copied it on a local computer
2- Attacker tries to decrypt the file, and since it’s a local file away from server security protection such as anti-dictionary attack mechanisms – it’s called an off-line attack; offline from the authenticating server’s perspective.
3- Guess passwords as fast as his computer can means: he has a program that runs as fast as the computer can. The program is guessing passwords, not the attacker himself.
4- Parallelize the attack means the attacker can run multiple threads, or multiple processes on the same computer, or use several local or remote computers (botnets) to speed up the guessing process (with appropriate coordination, for example divide and conquer)
5- Gets immediate confirmation if he guesses right means: the attacker has a fast way to verify the guess is correct without trying the password on the authenticating server.
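The five points above can be sketched as a toy guessing loop. This is an illustration under stated assumptions, not any real cracking tool: it assumes the leaked file stores unsalted SHA-1 hashes, and the roots and suffixes are the examples from the article. The names `sha1_hex` and `offline_guess` are invented here.

```python
import hashlib


def sha1_hex(password):
    """Hash a password the way the hypothetical leaked file did."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def offline_guess(stolen_hash, candidates):
    """Return the first candidate whose hash matches, or None."""
    for guess in candidates:
        if sha1_hex(guess) == stolen_hash:   # confirmed locally, no server involved
            return guess
    return None


# Hypothetical leaked entry: the hash of a password the attacker does not know.
stolen = sha1_hex("letmein1")

roots = ["password", "letmein", "temp", "123456"]
suffixes = ["", "1", "4u", "69", "abc", "!"]
candidates = (root + suffix for root in roots for suffix in suffixes)
print(offline_guess(stolen, candidates))
```

Parallelizing amounts to splitting `candidates` into chunks and giving each chunk to a separate process or machine, since every guess is checked independently.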

Clive Robinson May 27, 2014 2:38 AM

@John, Wael,

The “assumption” is it is the plaintext password that is found by the offline search. The reality is it’s the first plaintext found that works, which may not be the same thing if the password hash is not properly designed…

For instance, assume a simple system with a printable-character password that gets DES encrypted, with the last 16 bits of the ciphertext stored as the check value. It’s easy to see that for what is a 64-bit input there are going to be rather a lot of inputs that produce the same 16-bit output, and at least some of them are going to be printable plaintext.
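The effect is easy to demonstrate with a stand-in. The sketch below swaps the DES step for SHA-256 (an assumption made only because it ships with Python) and keeps just the last 16 bits as the check value; `check_value` and `find_alternative` are hypothetical names.

```python
import hashlib
from itertools import count


def check_value(password):
    """16-bit check value: last two bytes of SHA-256 (stand-in for the DES step)."""
    return hashlib.sha256(password.encode("ascii")).digest()[-2:]


def find_alternative(real_password):
    """Search printable guesses until one shares the real password's check value."""
    target = check_value(real_password)
    for n in count():
        guess = f"guess{n}"
        if guess != real_password and check_value(guess) == target:
            return guess


real = "correcthorse"
other = find_alternative(real)
print(other, check_value(other) == check_value(real))
```

With only 65,536 possible check values, a match typically turns up after tens of thousands of guesses, and the system would accept it even though it is not the real password.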

John May 28, 2014 12:01 AM

@Wael, thanks.

Slightly related:

I wonder what you guys think of Mailinator “Alternate Inbox Names”. How easy that might be to reverse to the original?

http://mailinator.com/faq.html

What are “Alternate Inbox Names” ?

“There are 2 ways to get email into any given inbox. When you check an inbox, listed at the top is the Alternate Inbox name. Emailing that alternate name is the same as emailing the regular name of the inbox. For example, the alternate name for “joe” is “M8R-yrtvm01” (all alternate names start with “M8R-“).

Thus, you can email joe@mailinator.com OR M8R-yrtvm01@mailinator.com – either way, the email will arrive in the “joe” inbox (and nothing into the M8R-yrtvm01 inbox). What’s more, there is no way to guess an alternate name. If you give out the alternate, only YOU will be able to check the emails because only you know the original inbox name. (Note: your email is still in a public inbox, just possibly harder to locate)”

No offline guesses here.

Wael May 28, 2014 12:40 AM

@John,

I wonder what you guys think of Mailinator

I don’t know what this is about. I read the FAQ, as I am sure you have. Think of it as a way to send a non confidential email to someone. Seems a gimmick to me, they ask for a signup to use alternate domain names, and you can sign up with Gmail? I also like their diagram here: http://mailinator.com/auth.jsp 🙂 You can forward your 100% publicly visible emails to your private or your “very” private gmail account! I bet you @Clive Robinson, whom you forgot to thank by the way ;), will have a fit over that one! It’s in complete conflict with his rule of thumb: “Clock from most secure to less secure — June 1, 2012″… As for guessing? It’s a different problem, I would think. Resembles steganography more than cryptography and authentication. Maybe I’ll mess with it later and see if there is a good use case for it. Wait! You can use it to spread rumors, I guess 😉

Alternate domains means you have a choice of other domain names than Mailinator.com, some are not so free. I say try the free ones, see how it works, and let us know…

PS: Out with it, John… You have vested interest in this?

John May 29, 2014 6:19 AM

@Clive Robinson
Thank you too, unfortunately your explanation was not on my level (“to like a 5-year-old”) but thanks, anyway

@Wael
Oh yeah, of course I have an interest in Mailinator! As a user. The sign-up option is actually a new feature, I don’t use that. I use the service for the original purpose it was intended for: signing up for forums, newsletters (of course, not the whole newsletter series but only the freebie offer, which usually comes right after signing up) and similar stuff. Not for online banking and such. Only one way: receiving mail.

The creator of Mailinator, Paul Tyma is friends with Robert X. Cringely; smart guy:
http://www.cringely.com/2011/05/10/what-the-heck-is-a-clickochet/
http://mailinator.blogspot.com/2008/07/bob-cringely-on-talkinator.html

John May 29, 2014 6:25 AM

The original email addresses for Mailinator were more than 10 random characters, they just changed the naming structure with the redesign to what it looks like now. So, from the old one you could more easily associate of a hard to guess random public email address.

A similar concept was the (now closed) Instawallet, a semi-public bitcoin wallet service. The bitcoin address was simply in the URL: when you visited instawallet.com, it automatically created a wallet for you with a custom URL. Behind HTTPS, of course. Simply put, your URL was your password. (Then probably the creators stole some of the money, but that’s another story.)

Miloske July 1, 2014 1:48 AM

How about using some historic ciphers to generate secure passphrases? “SECRETPASSWORD” is insecure, but what about “FRPERGCNFFJBEQ”?

In worst case scenario this is only 26 times better than the “plaintext” password, but it looks fairly random.

Vigenere and some other more advanced ciphers would certainly be better. An attacker would not only have to guess the passphrase, but would also have to guess which cipher was used and, crucially, which key was used. Correct me if I’m wrong, but I don’t think there are practical attacks that can do all this, even against ROT13, at least not yet.

Aegeus July 15, 2014 1:10 PM

@Miloske: Even leaving aside the security value (26 times as many passwords is 4-5 bits of entropy, which isn’t that great), how would you memorize it? Either you remember the plaintext and you need your computer to apply the cipher each time (in which case you may as well use KeePass), or you have to memorize the garbled passphrase (in which case you may as well use a completely random one).
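For reference, the ROT13 example and the “4-5 bits” figure can be reproduced with the standard library. The sketch assumes the attacker already has the plaintext phrase in a guess list and only needs to try each of the 26 Caesar shifts as well.

```python
import codecs
import math

plain = "SECRETPASSWORD"
shifted = codecs.encode(plain, "rot_13")
print(shifted)  # FRPERGCNFFJBEQ

# Trying every Caesar shift multiplies the attacker's work by at most 26.
extra_bits = math.log2(26)
print(round(extra_bits, 2))
```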

Nicolas August 1, 2014 5:25 PM

@Steve

You make some odd statements.

The point is, if I pick my words randomly then I don’t need to care how the attacker picks theirs.

This is wrong. The only thing that ever matters concerning the security of your password is how it holds up against the way the attacker guesses. If you randomly choose 4 words, and all four of them are “password”, then the attacker will likely crack it quickly anyway, despite it being a good password according to your random dictionary word scheme. Admittedly, such a password is unlikely to be generated and you’d probably not use that password anyway, but it goes to show that your logic of “I don’t need to care how the attacker guesses” is absolutely wrong. Theoretical password strength against one attack does not protect you from a different attack.

In fact, if I pick my words truly randomly, and the attacker picks words with some pattern, then I’ll actually be safer.

Too general of a statement. You’re likely going to be as safe as you theoretically can be if, and only if his method doesn’t find your password by accident.
Say you choose a dictionary of great size and choose 4 random words from it. The attacker starts out with a small dictionary of common words and starts guessing. As the attacker goes on, they include more obscure words, and so on. The only case in which you are as safe as you think you are is when you have been lucky enough for at least one of your picks to be outside the attacker’s smaller search spaces, because then they’ll need to resort to the really big dictionary.
Truly random picks don’t mean your password will be safer against smarter bruteforce attacks, unless those attacks will cost the attacker significantly more time to execute. In a case of a “common words first” optimisation, the attacker basically gets it for free, because he doesn’t need to execute more guesses – he just makes the likely guesses (“more likely” for people who did not choose truly random words, or used smaller dictionaries, or just used different dictionaries) first.

Steve August 4, 2014 4:50 PM

@Nicolas

I’m sorry, you’re just wrong. I know for someone who reads xkcd, I should just give up when someone else is wrong on the internet, but here goes:

The only thing that ever matters concerning the security of your password is how it holds up against the way the attacker guesses. If you randomly choose 4 words, and all four of them are “password”, then the attacker will likely crack it quickly anyway, despite it being a good password according to your random dictionary word scheme.

The chance of my scheme picking ‘passwordpasswordpasswordpassword’ is 1/(7776^4) == 2.73*10^-16. Perhaps you don’t understand just how small that is. Your argument is equivalent to saying, “Yeah but what if the attacker just happens to try your password first”. That’s not how you measure password strength. That argument basically defeats all passwords no matter how long, no matter how many ‘special’ characters you allow, no matter how random.

You’re likely going to be as safe as you theoretically can be if, and only if his method doesn’t find your password by accident.

While I agree with this statement, it’s basically agreeing with a tautology, it’s meaningless. Even if your password generation method was a truly random GUID, then I’ll just say, “Yeah, but my attack is to just guess your GUID first.”

The only way to rate password strength is by evaluating the way it’s generated.

KSE August 7, 2014 1:22 PM

It occurs to me that there’s a possible weakness in Bruce’s method, which is that English sentences tend to have somewhat predictable structures, and some initial letters are far more common than others – so if I know you’re using that method, then I also know there’s an excellent chance your password begins with T, A, or I, and that certain two-letter strings like “ti” (“there is”, “this is”) or “wa” (“we are”, “who are”) are likely, whereas “xq” is extremely unlikely.

Not that I’ve done the math, but I’d guess that if you’re actually following Randall’s advice and picking multiple unrelated words, not a sensible sentence, you may be giving the attacker a much harder time.

Typically I advise people to go with the XKCD method, mixed-case, with spaces when the system in question supports it, and also misspell a word or inject a foreign or made-up word, or add a random punctuation mark or a number somewhere, anything to inject a tiny bit more unpredictability into the system. But maybe the best advice is to come up with your own system or variation on a system, since any approach that gets too well-known becomes a magnet for pw crackers.

Clayton August 10, 2014 3:30 AM

+1 Steve, you’re spot on re. XKCD.

If you’re optimizing for memorability, then optimize for memorability, not string-length. Capitalization, numbers, special-characters, etc. are all impediments to memorability, even if they can be used to create slightly more entropy density per character typed.

I have created a diceware password book which can be used to construct strong, memorable, XKCD-style passwords. The book consists of 1,296 concrete nouns (words like “house” “rock” “cat” “sled” etc.) 1,296 = 6^4, so you can roll a die four times to choose a single word at random and then repeat to string together as many of these words as you like.

Assuming there are no “bugs” in the book, you get 10.34 bits of entropy per word. I have not applied a rigorous substring search to formally verify that no word in the book can be generated from two other words (but I have eliminated any I found through visual search), so I would recommend that you build in a 10% safety factor (or, better yet, formally verify it, correct any mistakes you find, and post the updated book in this thread!) If you need 64 bits of entropy in your password, then string together 7 words, like so (follow the instructions in the book):

tub bucket herb city walnut panther coach

This password has exactly as much entropy as the following random string of numbers drawn from {1..6}:

6346146633252143656344242136

I would certainly like to see the password cracker that can “efficiently” guess a password from {1..6}^28. The fact that the number is encoded in concrete nouns chosen from the password book is irrelevant.
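The equivalence between four die rolls and one word can be sketched as below. The base-6 index mapping is an assumption made for illustration, since the book’s actual layout is not given here, so the indices printed are not claimed to correspond to the example words above.

```python
import math

WORDS_IN_BOOK = 6 ** 4   # 1,296 entries, one per four-roll sequence


def rolls_to_index(rolls):
    """Map four die rolls (digits 1-6) to a word index from 0 to 1295."""
    index = 0
    for digit in rolls:
        index = index * 6 + (int(digit) - 1)
    return index


rolls = "6346146633252143656344242136"
groups = [rolls[i:i + 4] for i in range(0, len(rolls), 4)]
indices = [rolls_to_index(group) for group in groups]
bits_per_word = math.log2(WORDS_IN_BOOK)

print(indices)
print(round(bits_per_word, 2), round(len(groups) * bits_per_word, 1))
```

Seven words come to a little over 72 bits, which leaves room for the suggested 10% safety factor while staying above 64 bits.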

I recommend discarding all capitalization, numbers, and special characters, except as required by braindead password policies. To satisfy those, I recommend the use of a boilerplate such as “.1” or whatever to make the policy enforcer shut up and go away. This way, you will only need at most one retry when you remember the password but forget whether you had to comport with a password policy or not.

For non-memorized passwords (something you must write down and physically secure), you are not optimizing for memorability. In this case, you can use your favored random-generator of choice, but choose an encoding that eliminates visual duplicates such as lowercase-L, uppercase-I, zero, uppercase-O and so on. You also should think about portability since one problem created by password policies is that they make it impossible for a user to use a single generation method (when, for example, one policy requires special-characters, while another prohibits them).

And to all the biometrics snake-oil salesmen out there that want to eliminate the use of passwords: You can have my passwords when you pry them from my cold, dead brain!

John Holmes August 16, 2014 3:55 AM

Why create memorable passwords? Mine are just patterns. Give me a pen and I can’t help you. Give me a keyboard and I can type it. But I don’t know or need to know what it is.

Evan August 18, 2014 5:05 PM

The main problem is simply that getting to a sufficient level of entropy with the XKCD scheme will often require more characters than most web password forms allow. “Correct horse battery staple”, even without spaces, is still 25 characters; most sites I’ve seen limit passwords to 20 characters, and some still to 12. So with the XKCD system, you’re limited to somewhere between three four-letter words and four five-letter words. That drastically limits the amount of entropy in your system, especially since each component of the password reduces the number of characters available for the rest, meaning the selection of components of the password is not actually independent.

Clayton August 25, 2014 7:34 PM

@Evan: That’s a problem that has not yet been completely solved, but using a master password to unlock a password management program like PasswordSafe is a good way to work around this issue. Set the password policy for sites with absurd password length limits to get as much entropy in the password as possible, i.e. numbers, special characters, mixed-case, etc. Then, just let the computer remember them for you, while you conserve your brain’s energy to remember the strong memorable master password. It’s not foolproof but at least it can be done securely, unlike biometric tokens or the new rage, “social media logins”.

Willem September 4, 2014 2:22 AM

Here’s what I think would be the best way to use the XKCD strategy:

When a user creates an account, he receives a four-word passphrase from the system. He doesn’t get to choose it himself. The system could have a 1024-word dictionary for a guaranteed 40 bits of entropy (if the random number generator is good enough).

Here’s a challenge to password-cracker experts: Create a list of passwords each made up of four random words from a 1024-word dictionary. Preferably using a real source of entropy.

Try to crack those passwords. According to Bruce it should be easy, according to Randall, it should be very hard.

Dazed and Confused September 4, 2014 12:40 PM

Most of the banks and other financial sites I access do not tell you their password limits, for either length or allowed characters. Or maybe they make some vague statement like passwords must be “at least 6 characters long” but don’t tell you the maximum length. Sometimes if your password is too long they silently truncate it without telling you, and then you don’t even know your own password. Sometimes I call and spend a half hour in a phone tree, and then fumble around from one clueless twit to another trying to find whom to ask, and then get told random garbage which turns out to be wrong. What good is all the password advice in this thread when you don’t even know where to start?

The most useful advance in passwords would be some federal regulation to force these irresponsible corporations to adopt a clear password standard, or at least a rule requiring them to make a clear written statement (right on the same screen where you register the password in the first place) of the complete requirements and limits for that password.

Clayton September 4, 2014 1:09 PM

@dazedconfused: “useful … federal regulation”

I saw one of these once… they’re rare, though, like honest lawyers, ethical politicians and Bigfoot.

rober September 9, 2014 11:07 PM

why do people properly spell a password?

length over complexity is the rule..

however..
do not properly spell anything..

I use songs, if i can entire phrases from songs.

for example.

666isthenumberofthebeast

666isthenumbrofthbeest.

properly spelled words are guaranteed to be hit by what is these days a simple dictionary attack, perhaps even common letter alterations of said password.
for example.
666isth3numbe4ofth3b34ast

stands a decent chance of being hit eventually.. get it wrong, remember it wrong.. and it becomes much more difficult for the automatic tools to guess it, but get it wrong as long as possible. use as many characters as you can.

RG

Ade September 15, 2014 8:39 PM

Passwords have been driving me mad in the past months, I want to be secure but the various non standard restrictions of some sites make it difficult to come up with a one size fits all secure password technique.
It would help if all websites adopted a standard, Eg, min 12 chars, Uppercase Lowercase, Numerals and maybe agreed on 20 Special chars.

Kirk Parker November 14, 2014 3:21 AM

At least for interactive hosts, where the assumption is that a human is on the other end, isn’t a delay-upon-password-fail a complete protection against ever-faster processing speed? If your login process sleeps for 1.0 seconds every time a login fails, what avails the hacker if his network of bots can produce a new password to try every 1ms? Oops, I mean every 1us? Oops, I mean every 1 ns?

SO WHAT, if we only start listening every 1000000000ns?

Peter Liepmann December 24, 2014 8:31 PM

To all the folks who suggested limiting guesses- read the Ars Technica articles. Cracking presumes somebody has gotten their hands on a hashed passwords file.

Re ‘secret questions’ NEVER answer these correctly!! ALWAYS choose shocking nonsense.

EG: Mother’s maiden name is actually “Smith.” Then the answer to “What is your Mother’s maiden name?” is “Obscene,” or “foolish,” or “btfsk.”
Easy enough to remember, and you can put the info in the KeePass ‘text’ box, so different seekrit answers for different sites.
Also, FWIW, KeePass is terrific, especially with the addition of ‘Readable Passphrase Generator’!

Thrawn February 8, 2015 9:18 PM

I didn’t see a reply to Alan’s attempt at the Diceware math, so I’ll try one here:

Calculating permutations isn’t quite right. Permutations would only apply if the same word could never be rolled twice. Wrong maths. The correct calculation is simply (dictionary size) ^ (number of words in phrase). In this case, 6.25×10¹⁴.

As for 8 million guesses per second, Randall specifically stated that he was assuming an online attack limited to 1000 guesses/second. If you want to resist an offline attack, then the latest Diceware recommendation is 6 words, not 4. Diceware uses a 7776-word dictionary, but even your proposed 5000-word dictionary would yield 1.5625×10²² possibilities with 6 words, which at 8 million guesses per second would take about 6.19×10⁷ years to exhaustively search (about 3.1×10⁷ years on average).
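
The arithmetic is easy to check with a few lines of Python. This is only a sketch: the plain 365-day year and the function name are my own assumptions, and the guess rate is the eight-million-per-second figure quoted in the article.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a plain 365-day year


def years_to_exhaust(dictionary_size: int, words: int, guesses_per_second: float) -> float:
    """Years needed to try every passphrase of `words` words from the list."""
    keyspace = dictionary_size ** words  # repeats are allowed, so a plain power
    return keyspace / guesses_per_second / SECONDS_PER_YEAR


for size, words in [(5000, 4), (5000, 6), (7776, 6)]:
    full = years_to_exhaust(size, words, 8_000_000)
    print(f"{size}^{words} = {size ** words:.4e} phrases, "
          f"{full:.3e} years to exhaust, {full / 2:.3e} on average")
```

On average an attacker finds the phrase after searching half the space, which is why the second figure is half the first.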

Steve April 27, 2015 2:58 PM

It’s so sad that people still don’t get this.

To everyone saying, “Just reverse the second word”, or “Just rot13 the password”, or “just spell some of the words wrong”: Your schemes are only adding 1 or 2 bits of entropy to your password when you do this. Take the “reverse the second word” suggestion: All the attacker has to do to catch you (and someone who doesn’t reverse their words) is try the normal guess, and try the guess with the second word reversed. That effectively doubles the passwords the attacker has to try. That’s literally adding 1 bit of entropy to your password. You’re making it twice as hard for the attackers. This is the same as the ROT13 suggestion: try the guess, then try the ROT13 guess: twice as many guesses required to cover that case. However, just adding a single lowercase letter to the end of your password adds almost 5 bits of entropy. It makes your password 26 times as hard for the attacker! And adding one more word from the diceware list adds almost 13 bits, that’s 7776 times as hard!

You might say that you could make it more complicated by only reversing a word sometimes, and not always choosing the second word to reverse. Okay, so that means in addition to the original attempt, the attacker has to try: 1) first word reversed, 2) second word reversed, 3) third word reversed, 4) fourth word reversed. That’s five guesses instead of one, which adds only a little over 2 bits of entropy. It’s less effective than simply adding a single lowercase letter to the end, but it’s way more complicated for you to remember.
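
To put numbers on this, here is a small sketch that counts each tweak by how many variants an attacker has to try and converts that to bits. The labels and the `bits` helper are mine, and it assumes every variant is equally likely, which is the best case for the defender.

```python
import math


def bits(variants: int) -> float:
    """Entropy added when an attacker must try `variants` equally likely options."""
    return math.log2(variants)


tweaks = {
    "always reverse the second word (plain or reversed)": 2,
    "maybe reverse one of four words (plain + 4 variants)": 5,
    "append one random lowercase letter": 26,
    "append one random Diceware word": 7776,
}
for name, variants in tweaks.items():
    print(f"{name}: x{variants} guesses, {bits(variants):.2f} bits")
```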

For everyone who has their own pet scheme for making their password more complicated, just know that if you could think of it, then an attacker could think of it and write an algorithm to apply that scheme to every guess. Instead of trying to rely on your scheme being secret, just throw that idea completely away and go with a scheme that is secure despite being known and generates passwords that are easier to remember.

Alex April 30, 2015 3:43 PM

That’s interesting. What do you think about an absolutely random password? For example, more than 10 characters, with numbers and special characters.

Thrawn April 30, 2015 7:40 PM

@Nicholas: I think I can see one of the reasons you’re disagreeing with Steve.

“Another is to understand the issues behind why adding four or more equiprobable independent dice throws together and normalising them changes the distribution from the flat equiprobable to the bell normal distribution.”

Diceware does not add dice together. If it did, then yes, you quickly get a bell curve (2 dice will most commonly add to 7 and least commonly to 2 or 12, 3 dice will most commonly add to 10 or 11, etc). But it certainly does not. Rather, Diceware rolls – which are ordered – become digits in a 5-digit number, eg 32646. That number is the index into the Diceware list. There is no adding or normalising involved.
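
The difference is easy to see by brute enumeration. A minimal sketch (the variable names are mine): it counts every ordered roll, once as a sum and once as a string of digits.

```python
from collections import Counter
from itertools import product

FACES = range(1, 7)

# Adding dice: how often each total appears over all 6**3 ordered rolls.
sums = Counter(sum(roll) for roll in product(FACES, repeat=3))

# Diceware: five ordered rolls read as the digits of an index such as 32646.
indices = Counter("".join(map(str, roll)) for roll in product(FACES, repeat=5))

print("3-dice totals:", dict(sorted(sums.items())))
print("distinct Diceware indices:", len(indices))
print("times each index occurs:", set(indices.values()))
```

The totals pile up in the middle, while every one of the 6^5 = 7776 indices turns up exactly once, so each word on the list is equally likely.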

Kevin Bone August 28, 2015 6:29 PM

Despite all the great tools and gadgets that are out there, the weakest link in security is still us. As long as humans manage passwords they will be broken.

Here’s an example:

Daughter: Dad, I need to transfer money into your Chase account.
Me: That’s great.
Daughter: I need your Chase login. I forgot mine and I’m locked out now for a while until everything resets.
Me: You can use mine. Just log into my LastPass (LastPass is the password utility I use) account and get it. Remember, I put my LastPass info in a locked document on your Google drive.
Daughter: I don’t have any of that available, just send me your Chase log in.
Me: No. Find the LastPass stuff.

…this goes on for a while and eventually she wears me down. I know it’s stupid to give her my Chase login in the clear, but I’m human, she’s my little girl, and so on.

What usually happens at this point is I send her several text messages, one for each of the characters in my password (have you ever tried to change your user code? almost impossible). This will include a couple of messages with fake information (I will call her on the phone to tell her which ones to ignore), probably a picture (if I send her a picture of the dog she knows the next letter is “n”, the last letter in his name), etc. I absolutely know that anyone out there can siphon this information off. This process ends with me reminding her to tell me when she’s done which she might or might not do. After a few minutes I’m going to call her and ask her if she’s done. If she doesn’t answer the phone call, I know she’s done because she would answer if she still needed access to the account. She prefers to TEXT me which effectively limits the length and complexity of the conversation. TEXTing is almost the only way any of my children talk to me any more.

Finally, I hop back on whatever device is handy (usually my phone) and create a new password to my checking account. LastPass lets me choose how long it will be, whether it can have special characters and spaces, how many digits I want to include. I like 12 characters with a bit of everything. Note that LastPass also has an option to make your password pronounceable, which is a way of saying “easier to hack,” but for most people using a password utility is a giant step up from using the same password for most of their sites.

The point is that humans are and will always be the weakest part of security. When I was consulting I would frequently see passwords on sticky notes on the monitor. Personal passwords were bad enough, but they also had neatly typed out the pass phrases to use procedures which required higher security levels. The sneaky ones stuck it to the bottom of the keyboard. At one site, the step-by-step instructions for printing a special check included the combo to the vault where the check forms were kept, instructions on where to find them in the vault, instructions on how to use the signer, including where to find it (Larry’s desk) and where to get the key to Larry’s desk (under Dan’s keyboard), and notes on the items the security system audited and checked. These instructions were so good that a custodian was able to produce a check that was absolutely perfect.

For anyone who disagrees with me, please read Kevin Mitnick’s book “The Art of Deception”. If you don’t find yourself in those pages somewhere, you are truly unique…and anti-social.

Bruce October 10, 2015 10:20 AM

I’m probably not the first to come up with this, but it occurred to me that one could intersperse two easily recalled names. For instance, I have two dogs named fido and spot. Would combining them into “fsidopot” fool a brute force password cracker?

Ted November 2, 2015 9:54 PM

I disagree about XKCD method, it is the best for the real world, with a few updates and caveats. First, you need to use a completely random word chooser from a dictionary of at least 10000 words, but I use 20000 words AND use 6 words pass phrases AND if you use a computer program make sure you’re using a crypto certified random library or function. I use one in Python. The six word phrases I use, without special characters, would still take thousands of years for the fastest computer farm and yes, using dictionary attacks. I agree, that using completely random passwords of similar length is better but not really usable in the real world by real people. I also have an option to add random special characters as well, but honestly, only necessary if the website/system requires them.
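
For anyone wondering what that looks like, here is a minimal sketch using Python's standard `secrets` module, which draws from the operating system's cryptographic random source. The function names and the tiny demo list are hypothetical; a real list would have thousands of entries.

```python
import math
import secrets


def make_passphrase(wordlist: list[str], words: int = 6, sep: str = " ") -> str:
    """Pick `words` entries independently and uniformly, keeping the drawn order."""
    if len(set(wordlist)) != len(wordlist):
        raise ValueError("word list contains duplicates")
    return sep.join(secrets.choice(wordlist) for _ in range(words))


def entropy_bits(wordlist_size: int, words: int) -> float:
    """Bits of entropy for `words` uniform draws from a list of the given size."""
    return words * math.log2(wordlist_size)


# Hypothetical toy list, for illustration only.
demo_words = ["apple", "brick", "cloud", "delta", "ember", "flint", "grove", "haze"]
print(make_passphrase(demo_words))
print(f"{entropy_bits(20000, 6):.1f} bits for 6 words from a 20000-word list")
```

The words are kept in the order they were drawn, and a word may repeat, so the entropy figure is simply the number of words times log2 of the list size.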

Clive Robinson November 3, 2015 12:36 AM

@ Ted,

I disagree about XKCD method, it is the best for the real world, with a few updates and caveats.

One caveat you did not mention is “don’t reorder the words”.

People assume that the order does not matter, when in fact it does. So they make the mistake of putting them in an order they can make a short sentence with, so that they can more easily remember them.

For those reading along who are not sure why this is, consider numbers instead of words.

If you have a two digit password then the number of passwords is going to be the full range from 00-99.

However if you take the numbers and reorder them by value you reduce the range. Because 10 becomes 01, 20,21 become 02,12, likewise 30,31,32 become 03,13,23 and so on up to 90,91,92,93,94,95,96,97,98 which become 09,19,29,39,49,59,69,79,89.

That is, you reduce the range by forty-five, from one hundred down to fifty-five: ten numbers with a repeated digit (00,11,22…) and forty-five that each now stand for two of the original numbers.

It’s quite easy to write a Python script to see just what percentage of range you would lose for three, four, five etc digit numbers.
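
One way such a script might look, as a sketch only: it sorts the digits of every possible password and counts how many distinct results are left. Sorting by value is the most extreme kind of reordering, so treat the percentages as an upper bound on the loss; the function name is mine.

```python
from itertools import product


def distinct_after_sorting(digits: int) -> int:
    """Count the distinct strings left once each password's digits are sorted."""
    return len({"".join(sorted(p)) for p in product("0123456789", repeat=digits)})


for n in range(2, 6):
    total = 10 ** n
    kept = distinct_after_sorting(n)
    print(f"{n} digits: {kept} of {total} remain, "
          f"{100 * (total - kept) / total:.1f}% lost")
```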

Nico March 7, 2016 2:00 AM

I wonder, why hashing/salting is done the way it is. Wouldn’t breaking leaked PW-DBs be much harder, if hash&salt wasn’t a 1-stop process?

Why is “password” stored as
[salt]password -> hash -> hashedPW
instead of
[salt]p -> hash -> hashedPWa -> hash -> hashedPWs -> hash -> hashedPWs …..
-> hashedPWd -> hash -> endResultOfHashing ?

If timing attacks (while entering – the longer it is, the longer it takes) are of concern, it could simply be zero-padded to a reasonable MAX-PW-length (e.g. 60).

salt|p
hash|a
hash|s
hash|s
hash|w
hash|o
hash|r
hash|d
hash|0

50x

hash|0

Wouldn’t this massively increase the effort of breaking PW-DBs (rainbow tables, etc.)?
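
For concreteness, the chained scheme can be sketched as below. The choice of SHA-256, NUL bytes for the padding, and the function names are my assumptions, not part of the proposal. Note what it buys: an attacker pays 60 hash calls per guess instead of one, which is a fixed and fairly small factor, and a per-user salt already rules out precomputed rainbow tables. Deliberately slow functions such as PBKDF2, shown for comparison via the standard library's `hashlib.pbkdf2_hmac`, apply the same repeat-the-hash idea with a tunable round count.

```python
import hashlib
import os

MAX_LEN = 60  # padded length from the comment above


def chained_hash(password: str, salt: bytes) -> bytes:
    """Fold the password into the hash one byte at a time, zero-padded."""
    raw = password.encode("utf-8")
    if len(raw) > MAX_LEN:
        raise ValueError("password longer than MAX_LEN bytes")
    state = salt
    for byte in raw.ljust(MAX_LEN, b"\0"):  # assumption: pad with NUL bytes
        state = hashlib.sha256(state + bytes([byte])).digest()
    return state


def stretched_hash(password: str, salt: bytes, rounds: int = 600_000) -> bytes:
    """Standard-library key stretching, for comparison."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds)


salt = os.urandom(16)
print(chained_hash("password", salt).hex())
```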

commentor March 22, 2016 9:19 AM

the “Schneier scheme” is not provably secure. It depends on the cleverness of the person choosing the password as compared to the cleverness of the person or group designing the password cracking algorithm.

Amir April 6, 2016 1:09 AM

This article is an example of Schneier posting complete nonsense (as he often does when he is promoting his own methods, or software, Password Safe for example, or encryption, etc. etc).

XKCD password security doesn’t derive from it being unknown to attackers. It’s SECURE even if the attacker knows exact method and dictionary used to generate the password. When Schneier writes something like, “The password crackers are on to this trick.” he is deliberately misleading the readers.

If the XKCD password words are selected randomly from, say, a 30,000 word dictionary, you can calculate the entropy and be sure the password is secure even if the attacker has access to that dictionary and knows the exact method used to generate the password (i.e. an XKCD type password).

Pathetic attempt to mislead people.

Clive Robinson April 6, 2016 5:38 AM

@ Amir,

If the XKCD password words are selected randomly from say 30,000 word dictionary…

But that is not what many humans do, they use somebody else’s short word list and a deficient selection method.

Worse, having got their four words, they then rearrange the words to make them easier to remember, thus vastly reducing the entropy.

It’s knowing this that some password crackers work on, and they do get to crack passwords that way or they would not do it.

So your comment of,

This article is an example of Schneier posting complete nonsense

When Schneier writes something like, “The password crackers are on to this trick.” he is deliberately misleading the readers.

And,

Pathetic attempt to mislead people.

It says rather more about your mentality than it does about other people. Perhaps it’s something you should sit and think about, because maybe it affects other parts of your life negatively…

Amir April 6, 2016 11:34 PM

Clive Robinson: “But that is not what many humans do”

Humans do a lot of stupid things, but that doesn’t change the fact that Schneier is misleading people to promote his own methods, software, encryption, or whatever else he is trying to promote that day.

Picking four random words from even a tiny (5000 word) dictionary would be a perfectly safe password for an online account where brute force is detectable. Such a password would be easy to memorize too, as XKCD correctly points out.

Instead of making that point clear to his reader, the guy deliberately misleads his readers by posting things like “The password crackers are on to this trick” as if the security of such a password is derived from being unknown to attackers. It’s SECURE even if the attacker knows exact method and dictionary used to generate the password.

This is not the first time he has misled people to promote his own stuff (remember his 2009 articles on how related-key attacks on AES 256 mean we should not be using AES 256, even though everyone knows correctly implemented software should never use related keys anyway).

There are still people on the web posting stupid blogs on why we should dump AES based on complete nonsense Schneier has posted for years about AES many times on this site.

Bottom line: Picking random words as your password is perfectly secure and easy to memorize. And even if the words are not completely random, such a password would still be trillions of times more secure than normal passwords that people use (like birth/wedding dates, or 12542sd, etc).

XKCD advice stays valid, despite Schneier’s pathetic attempt to mislead people.

KenC May 30, 2016 8:29 PM

This is disappointing. It seems that Schneier got it wrong on something that is pretty basic. It’s clear to even me, someone who doesn’t claim to be a security expert, that Munroe means cryptographically random words. There’s no “trick” to it that attackers can exploit. Personal information will not help at all. It’s just straightforward combinatorics.

David A. Curry August 5, 2016 9:03 AM

Oh, come on. The “Schneier Scheme”? Are you kidding me?

As someone else already pointed out, Ross Anderson et al. discussed (but did not take credit for inventing) essentially the same scheme eight years before you published “your” scheme. For that matter, I described essentially the same scheme eight years before they did (16 years before you) in my book (“UNIX System Security,” published 1992) as well as two years before that in my SRI paper (“Improving the Security of Your UNIX System,” published 1990). And I sure didn’t invent the scheme either, so it must have existed as prior art sometime earlier than that.

Best Practices August 5, 2016 7:36 PM

Best Practices for Provably Secure Passphrases

  1. Choose words for the passphrase from one list with a known number of entries.
  2. Choose each word randomly. Try the Playing Cards method or Diceware.
  3. Trust the math.

The number of possible unique passphrases is determined by the number of entries in the word list, raised to the Nth power, where N is the number of words in the passphrase. Choose a passphrase with 3.4 x 10^38 (2^128) possible combinations for full 128-bit security.

For 128-bit security, you need one of the following:
* a word list with 921 entries and a 13-word passphrase
* a word list with 1,626 entries and a 12-word passphrase
* a word list with 3,184 entries and an 11-word passphrase
* a word list with 7,132 entries and a 10-word passphrase
* a word list with 19,113 entries and a 9-word passphrase
* a word list with 65,536 entries and an 8-word passphrase
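
The list sizes above can be reproduced with exact integer arithmetic. A minimal sketch, with a helper name of my own choosing: it binary-searches for the smallest list size whose N-word phrases cover 2^128 possibilities, avoiding any floating-point rounding.

```python
def min_list_size(words: int, bits: int = 128) -> int:
    """Smallest word-list size whose `words`-word phrases cover 2**bits options."""
    target = 2 ** bits
    lo, hi = 1, target  # binary search on exact integers, no float rounding
    while lo < hi:
        mid = (lo + hi) // 2
        if mid ** words >= target:
            hi = mid
        else:
            lo = mid + 1
    return lo


for n in range(13, 7, -1):
    print(f"{n}-word passphrase: list of at least {min_list_size(n):,} entries")
```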

Don’t be clever. Cleverness is not provably secure. Obscurity is not provably secure. Only randomness and length are provably secure.

Clive Robinson August 6, 2016 12:28 AM

@ Best Practices,

3. Trust the math.

Err no, because you’ve forgotten the human factor…

If you give a human a list of words and ask them to remember them, they will reorder the words to make some more easily remembered pass phrase.

To see what effect this has, draw up a list of the hundred two-digit numbers from 00-99, then reorder the digits and deduplicate…

Wael August 6, 2016 12:44 AM

@Clive Robinson, @ Best Practices,

Err no, because you’ve forgotten the human factor

Exactly my thoughts, but with the mathematician being the referenced human, not the user. How can we expect a mathematically challenged person to “trust the math”? What we are effectively telling them is “Trust the mathematician”, and mathematicians can be, at times, schmucks, even those with Ph.D.s in mathematics. Remember the Monty Hall problem? 🙂

Trust the math

We’ll see who follows this advice later 😉

RP August 7, 2016 7:01 AM

Your XKCD criticism is completely wrong.

The scheme presented in xkcd is the following:

  1. Use a common-words list, that has at least 2000 words in it
  2. Pick 4 words randomly from this list
  3. Append those 4 words together to get your password

Even if a hacker knows your exact strategy above, and knows every single word in your word list, that still leaves him with 2^44 combinations to go through. That’s the reasoning behind the xkcd scheme.

{} August 8, 2016 3:26 PM

@RP
I agree the xkcd scheme leaves a fraudster 2^44 combinations to go through… BUT my bank truncates passwords to 8 characters — which defeats any known password generation scheme, including Bruce’s. There’s simply no practical way that you can get enough entropy with an 8 character password.
My only option is to assume my online financial accounts have already been compromised, and take whatever measures I can to limit the damage.

{} August 8, 2016 4:21 PM

@{}
Before people rush to correct me: Using a truly random alphanumeric string of 8 characters you can get roughly the same entropy as a 4-word xkcd password. Practical for some purposes if you have a password manager that you can trust and verify (Which may exclude mobile devices? Definitely excludes technically unsophisticated users).

{} August 8, 2016 5:33 PM

@Peter
I’d be willing to bet that all memorizable physical patterns on a QWERTY keyboard are already on the lists of most common passwords.

Scott "SFITCS" Ferguson August 8, 2016 9:12 PM

@{} (not to be confused with a forum flooding sock-puppet)

<snipped> BUT my bank truncates passwords to 8 characters — which defeats any known password generation scheme, including Bruce’s. There’s simply no practical way that you can get enough entropy with an 8 character password.
My only option is to assume my online financial accounts have already been compromised, and take whatever measures I can to limit the damage.

[gently] Did I miss the meeting down the docks where that makes sense?

Real world scenario:-
Bank password is H8Kk_me! which requires a search space size 2.76 x 10^15 to exhaustively brute force.
However, like most banks, the number of failed attempts before lock-out is 3, (upper limit is usually 5, YMMV).
Unless we poison reality by forcing a scenario where people have to choose a single-word “memorable password” the real probability of such a password being “guessed” is 1/94^6 + 1/(94^6 - 1) + 1/(94^6 - 2) = .000000000$Something%.
If your bank doesn’t limit failed login attempts you should take some responsibility, vote with your wallet and bank elsewhere. If, on the other hand, you need to use a memorable single-word password, you should not be trying to extend your expectation of privacy or security outside of your physical domain – but most importantly (which I’m sure you’re not), should not be forcing the lowest common denominator on everyone.

tl;dr? Chances are better that it’s rain (or spittle?), not the sky falling.

{} August 9, 2016 3:26 PM

@Scott “SFITCS” Ferguson
(TLDRv1: look up “hash” in a non-breakfast non-recreational context.)
(TLDRv2: think billions of guesses offline, not 5 guesses online)

There’s already been a lot of discussion on this thread that addresses the point you make, so I’ll try to keep this “short” (yeah right) 🙂

The assumption most of us are making is that an attacker has gained access to the hash table used by the bank to look up passwords, allowing the attacker to make hundreds of billions of guesses, not just 3 to 5. This isn’t speculation; this is based on real attacks that have already happened. And identity fraud for fun and profit has become one of the most common “blue-collar” crimes.

https://en.wikipedia.org/wiki/Hash_table

Salting helps increase the computational cost of making guesses, but stops only relatively unmotivated attackers:
https://en.wikipedia.org/wiki/Salt_(cryptography)

Even technically unsophisticated “script kiddies” can do it:
http://arstechnica.com/security/2013/03/how-i-became-a-password-cracker/

Half the problem would be solved if people were to use strong passwords. In order for that to happen, they have to be allowed to use strong passwords.
Read the other commenters’ comments on this thread for in-depth discussion.

My bank does have backup security mechanisms in place: For example, accounts are automatically frozen if unusual transactions take place from an unexpected country, and the bank is good about allowing customers to set a maximum daily limit on the value of online transactions. Still no excuse for limiting password length.

I’ve been the victim of identity fraud twice before, in separate incidents (neither of them involving bank accounts, thank heavens). Each time they’ve cost me hours of hassle and frustration. There is at least one additional case I know of where my credentials have been stolen, but I don’t know of any identity fraud against me in that case. In all three cases, the theft of client credentials was from large, well-known, respected organizations and companies. For obvious reasons, I’m not giving out details.

“you should not be trying to extend your expectation of privacy or security outside of your physical domain”
I’m not sure what you’re getting at here. Based on anecdotes from acquaintances, carrying one’s entire wealth around as gold bullion has a mixed track record. We have no choice but to be dependent on the currently existing financial system, where money consists of up and down magnetizations on nickel disks at an internet-connected remote site. Any economy is based on trust, and that trust is undermined if institutions can’t even get credentialling right.

ciao,
{}

Beatrice Block August 15, 2016 7:06 PM

Forgive me, BUT I have found that the more complex the password, the more likely the person will write it down and stick it in an accessible place. Complex passwords are somewhat effective against online snoops, but not for everyday use. People just don’t have the time or motivation to do complex passwords.

Why doesn’t the industry move already to biologic authentication, like your thumb print or your iris? Neither online thief nor real world thief can authenticate as you.

Citizenfour August 16, 2016 4:02 PM

Beatrice Block: There has been biometric authentication for a long time. Have you even searched the internet? I don’t think so.

ianf August 16, 2016 5:03 PM

ADMINISTRIVIA @ “Citizenfour”

Unless you are Ed Snowden, whose moniker it was in Laura Poitras’ film of the same title, could you please choose some other nick to go by here? If you intended it as a tribute to Ed, at least change it to Citizenfive, or CitizenNo4 or something, but change it. Thank you.

Tim August 19, 2016 5:49 PM

I’d suggest, rather than concentrating on raw entropy, concatenating whatever elements are necessary to frustrate any existing cracking strategies. For example:

7çs&5 more nuzgrs (FOR LENGTH)

Brute-forcing it? You’ve got 30 characters to chew through.
Is ç not in your brute-force charset? Fail. If it is, brute-forcing ‘nuzgrs’ got more expensive.
Checking only natural language constructs? 7çs&5 gets you.
Checking for a specific salt-plus-words pattern? I’m betting ‘nuzgrs’ isn’t in your dictionary, and the parentheses probably trip you up.

As an example of how this sort of password messes with crackers, the “zxcvbn” tool linked upthread ends up brute-forcing the word “FOR” in the same chunk as “nuzgrs”, costing it several orders of magnitude on top of an already impractical attack.

Is it memorable? I’d need to take it in chunks – “7çs&” “5 more nuzgrs” “(FOR LENGTH)” but each chunk is either memorable or short.

Clive Robinson August 19, 2016 7:20 PM

@ Tim,

Is it memorable? I’d need to take it in chunks – but each chunk is either memorable or short.

Which is the way people used to remember telephone numbers (back when phones were not smart 😉

As I’ve noted before, the failing of passwords is the ~6lb of squishy fat between the user’s ears.

Amit September 21, 2016 9:32 PM

This blog is correct in some ways, but creating sentences for passwords is already a process to remember, and having different passwords for multiple services may confuse some people. The name for each website should give you an easy indication on what password to use

Dark helmet March 10, 2017 12:43 PM

For less secure, but often logged into accounts, I just transpose 2 characters in each word as well as add punctuation and numbers (at least 4 digits) between each word. The actual punctuation and digits can be the same.

Michael Power March 15, 2017 1:24 PM

For the past decade more complex passwords have been all the rage: not just alpha but also numeric, upper, special. However, these “improvements” are often roughly equivalent to requiring a character or two more.

For example at least 8 characters that require alpha, number, upper case gives you 2.18 * 10 ^ 14 possible combinations
Whereas at least 10 character passwords that are only alpha gives you 1.41 * 10 ^ 14 possible combinations.
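
These figures are easy to reproduce. A quick sketch, with my own helper name, that counts strings of exactly the stated length over the stated alphabet (it does not enforce "at least one of each kind", which matches how the numbers above were arrived at):

```python
import math


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of strings of exactly `length` symbols over the alphabet."""
    return alphabet_size ** length


cases = {
    "8 chars, lower + upper + digits (62 symbols)": keyspace(62, 8),
    "9 chars, lowercase only (26 symbols)": keyspace(26, 9),
    "10 chars, lowercase only (26 symbols)": keyspace(26, 10),
}
for label, count in cases.items():
    print(f"{label}: {count:.3e} combinations, {math.log2(count):.1f} bits")
```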

In short, if making your passwords longer is easier than more complex, that would probably give you similar security. I just generate random passwords using alpha + numeric excluding similar characters. When a web site requires upper, I just raise the last letter. When a web site requires a special I just raise the last number.

Eric March 15, 2017 3:22 PM

Regarding XKCD, I’m not sure that how many words are in the person’s vocabulary is as much the issue as how many words are in the attacker’s dictionary that is used to attack the password. Unless, of course, the attacker knows the target’s vocabulary.

That said, most people’s vocabulary is not all that extensive and so the attacker could probably cover a great many people’s vocabulary with a dictionary of maybe 10,000 words.

On my official e-mail site, which I hardly use at all, the requirements on the password include both a minimum and a maximum length plus a requirement that it contain both letters and numbers plus a ban on any embedded words. The result is that my password on the site is not terribly long so that I can remember it.

My normal passwords are on the order of five or more words, at least one of which was chosen specifically because it was not in my regular vocabulary prior to creating the password. For example, I used to use zenzizenzizenzic in one passphrase. (Note that saying so just made any attacker’s effort to break my password easier, since they no longer need to include zenzizenzizenzic in their dictionary.)

If the e-mail provider let me choose passwords like I like, they would be far stronger.

In any event, the e-mail provider I use most has a three step log in procedure:
1) enter the password for the account,
2) enter a one time six digit passcode (using FreeOTP), and
3) enter the password to decrypt my e-mail.

As far as guessing passwords, several years ago I bought some equipment from a company hundreds of miles away that had gone out of business. Most of the equipment was new and had never been configured so it had the original default password, but one had been configured. So I was sitting at my laptop at a remote site wondering how easy it would be to guess their password. My first guess was their company name and it worked.

Clive Robinson March 15, 2017 6:22 PM

@ Eric,

I’m not sure that the issue about how many words are in the person’s vocabulary is as much the issue as in how many words are in the attacker’s dictionary that are used to attack the password.

You are not thinking it through far enough.

A user’s dictionary has to be limited to words they can spell. Now it just so happens that most people have a vocabulary the same as their social peers. Further, as the vocabulary increases it includes most if not all the words of the lesser vocabularies, as you would expect.

Thus an attacker would order the words used in an attack starting with the limited vocabulary words and work upwards. Thus they get the low vocabulary passphrases first.

For certain types of attack, all the attacker needs is to get into just one account on a system. Thus the low vocabulary passphrase is likely to be the weakest link.

Eric March 16, 2017 6:42 AM

@ Clive Robinson

There can still be large differences in vocabulary.

For example, a farmer and rancher outside of Amarillo will tend to have a somewhat different vocabulary than a housewife in Dallas, an attorney in Austin, or an engineer in Houston.

As far as getting onto the system, we are assuming that they have already accessed the system to get the password files. If they are trying to break into a computer by guessing passwords, they are going to be far more limited in guesses.

Can you imagine just checking all possible six-character passwords to log into an account using ssh? From the logs of our servers, the most times I’ve seen anyone connect is about 1,000 before they gave up. I have them set to drop the connection after six tries, so that would give them a maximum of 6,000 passwords to try. That wouldn’t even be enough to try all possible 2 character passwords.

If they tried to brute force all six-character passwords, 95^6=735,091,890,625, over ssh, I think I’d notice. In any event, with all but one of my servers listening for ssh only on a single IPv6 address per server, with root logins denied and with logins via passwords denied (except for s/key), I’m not worried.
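
As a quick sanity check on those numbers, here is a small sketch. The helper name is mine, and it assumes 95 printable characters per position.

```python
def online_budget(connections: int, tries_per_connection: int) -> int:
    """Total guesses available before the attacker gives up."""
    return connections * tries_per_connection


budget = online_budget(1_000, 6)
two_char = 95 ** 2
six_char = 95 ** 6

print(f"guess budget: {budget:,}")
print(f"2-character passwords: {two_char:,} (budget covers {budget / two_char:.0%})")
print(f"6-character passwords: {six_char:,} "
      f"({six_char // 6:,} connections needed at 6 tries each)")
```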

Dirk Praet March 16, 2017 7:51 AM

@ Eric

In any event, with all but one of my servers listening for ssh only on a single IPv6 address per server, with root logins denied and with logins via passwords denied (except for s/key), I’m not worried.

I generally recommend adding SSHGuard or Fail2Ban to the equation. They automatically block suspicious ssh connection attempts and alert to them.

Clive Robinson March 16, 2017 3:29 PM

@ Eric,

As far as getting onto the system, we are assuming that they have already accessed the system to get the password files

That’s very much a false assumption these days.

There are billions of user details and hashed passwords available on the Internet without you having to hack in to get the password files.

In general humans are not good at remembering things, thus they have a habit of reusing passwords on many systems some weakly protected some strongly protected.

Often enough you can identify users in those published files with other things they have done online, and you can rapidly build up profiles of them including friends, family, where they live, where they work.

Often you can find their work internal username and other significant information from things they post in open sites. The worst offenders by far for this are low end SysAdmins and developers under time constraints. Which are exactly the sort of people you want to target if you are into industrial or national espionage (which is why the NSA is reputed to do this as SOP). Oh and they are also the people more likely to not only share a vocabulary but also use the XKCD and similar systems (badly…).

Eric March 16, 2017 3:31 PM

@ Dirk Praet

I hadn’t thought of those. I see that SSHGuard is one of the OpenBSD ports now. I may try it out.

By the way, the pf packet filter on OpenBSD has the ability to filter out IP addresses that connect too frequently. For example,

block quick from <evil_ssh>
pass in on $ext_if proto tcp to $ssh_list port ssh keep state (max-src-conn-rate 20/60, overload <evil_ssh> flush global)

So if someone tries to connect to your hosts on ssh more than twenty times in sixty seconds, they are added to the evil_ssh table and blocked. Also, the “flush global” will remove the existing entries from the state table for that IP address.
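For readers who don't run pf, the rate rule described above can be approximated in a few lines of Python. This is only a sketch of the idea (a sliding window per source address plus a permanent block list), not how pf implements it; the class and method names are made up for illustration, and the 20 connections per 60 seconds default is taken from the rule in the comment.

```python
from collections import defaultdict, deque


class ConnRateLimiter:
    """Block any source that opens more than max_conns connections per window."""

    def __init__(self, max_conns: int = 20, window: float = 60.0):
        self.max_conns = max_conns
        self.window = window
        self.recent = defaultdict(deque)  # source -> timestamps of recent connections
        self.blocked = set()              # plays the role of the evil_ssh table

    def allow(self, src: str, now: float) -> bool:
        """Return True if a new connection from src at time now is accepted."""
        if src in self.blocked:
            return False
        stamps = self.recent[src]
        # Forget connections that have fallen out of the time window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        stamps.append(now)
        if len(stamps) > self.max_conns:
            # Over the limit: block the source and drop its tracked state.
            self.blocked.add(src)
            del self.recent[src]
            return False
        return True


limiter = ConnRateLimiter()
results = [limiter.allow("203.0.113.7", float(t)) for t in range(21)]
print(results.count(True), results[-1])   # 20 False
```

The address in the usage lines is from a documentation-only range. A source that stays under the limit is never blocked, which is why slow guessers still need a second line of defense.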

I’ve used this on my firewall on many occasions. One nice thing about building a firewall with OpenBSD is that you can build one without it having any IP address at all making it rather difficult to be attacked. If you need to connect to it from a different room, just run an RS-232 cable to it and hook it up to a VT-100 compatible dumb terminal (I use a Dorio from DEC — it’s kind of old but still works fine).

ab praeceptis March 16, 2017 4:19 PM

“SSHGuard”, “filter out”, etc.

I find it amazing how rather than rejecting weird and/or third rate approaches and replacing them by better ones, ever more variations of shady “solutions” come up and even are celebrated and forked.

True, recognizing and filtering out evil connection attempts make sense – on a public interface that isn’t meant to be limited per se. An example would be a web site that is meant for the broad public.

Many services, however, are not meant for general and public use. Most FTP sites are examples and certainly SSH.

Did really nobody notice that “keep the doors wide open and only check late” isn’t a smart approach?
Looking at how PubKey works it should have been noticed by now that it also lends itself well to DOS attacks via exhaustion.

Yet the majority of SSH ports are open to whomever pleases to connect and only at a rather late stage are visitors checked (authorization).
If there is any protection in place it’s usually of the SSHguard or fail2ban type, i.e. an approach that assumes every connection attempt to be legitimate unless it “behaves” obviously and utterly bad. Didn’t it strike anyone to ask what kind of knowledge is required to do that (as opposed to knowing e.g. SSH)?

The problem behind it is a really simple one, a closed user group kind of problem. Certainly the users authorized to use SSH at any given server are known and limited.

So why are we at tls 1.2 (or 1.3, depending how adventurous you are) – and still the real and simple “front door” problem isn’t solved but rather delegated to tools of doubtful quality.

ssl/tls/ssh a generator that just doesn’t stop to produce problems, crap, and strange “solutions” …

Eric April 7, 2017 12:33 PM

Clive:

There are billions of user details and hashed passwords available on the Internet without you having to hack in to get the password files.

Actually you do have to hack in to get the password file. Without it, all you can do is connect from the outside and try a small number of passwords and guess at the accounts. It would be very unusual, I think, for anyone to be able to try all possible passwords that are four characters long if they had to open an ssh connection to the machine and try them by brute force.

With the password file, you can test passwords against each username at incredibly high speeds. That could get you into accounts on that machine.

After that, knowing that user john99 used a password of “cowboy” on one machine, they can try that username and password on other machines to see if he reused the password elsewhere, which is a very common practice.

For what it’s worth, I have more than 200 (maybe 300) internet devices with the same username and password. I initially tried using a different password on each one, but that turned out to be nearly impossible to track by about the 20th device. I do block all incoming ssh, http, and https access to those with internet routable addresses which is most of them.

Eric April 19, 2017 2:47 PM

To try to keep the number of connections down from attackers trying to guess passwords, I began an experiment yesterday on a computer. I downloaded the US zone files from ipdeny.com (limited to blocks of at least 65536 addresses) and started filtering out all ssh attempts from IP addresses not in any of the blocks of those US zone files.

In the 22 hours since then, only two IP addresses have connected to sshd on that computer to guess passwords and well over a thousand attempts have been rejected.

My next step is to log all the tries that are not rejected from non-US countries and see what else they are hitting.

For what it’s worth, just one of my servers saw more than 630,000 log entries from the 166.31.166.0/24 block in two months. There seems to be an enormous amount of password guessing coming from that block, apparently all directed at root.

Even then, 630,000 attempts is more than an order of magnitude too low to test all passwords made up of 4 upper and lower case letters with no numbers or punctuation (52^4 = 7,311,616). It seems more likely that whoever is behind that is trying lists of passwords that they have from other sources.

Russell May 8, 2017 11:42 AM

This is a great article and has increased my awareness and knowledge of how to protect my interests, especially banking.

However, I live in the UK and have some questions about how Password Safe might work for me. Don’t hesitate to advise if these need to be put to a different blog.

The log-in requirements for most of my personal banking and credit card accounts, involve either:

a user name, password and memorable information to access the accounts; or

a password and PIN from which characters have to be selected and entered.

In the case of the former, copying and pasting the password from Password Safe would work as the memorable information provides the second stage security, though this doesn’t appear to be something that can be stored in Password Safe.

In the case of the latter copying and pasting the password would deny me access. The only way I envisage Password Safe would help me is by going to Edit Entry to reveal the password each time I need to access the account. This is a bit tedious. Is there a better way of accessing my stored passwords that I have not yet discovered? (Though the random password generator and storage of the password undoubtedly increases my security.)

Has any other user experienced similar issues and found ways of dealing with these?

Russell May 19, 2017 10:09 AM

Three years late, this is in reply to the very first post to this article, by Ted Lilley on March 3, 2014 8:04 AM, in defense of xkcd comic #936.

Ted, if you’re still around, note that #936 was posted to xkcd.com almost three years before this article was written. It was pretty good advice at the time, things just accelerated a bit in the meanwhile. Even if you think it was not good advice when you posted as many who subsequently posted have argued, note that Bruce said it’s “no longer” good advice, implying he thinks it once was. Your characterization of “debunking”, therefore, was decidedly disingenuous at the time you wrote it, unless you’d follow up with an argument that it was never good advice. Perhaps you forgot to check dates and think it through before posting. It’s a shame your post has been sitting there for three years as the first thing one reads after reading the article.

There. I feel much better now. Sorta. Thank you.

Russell C. May 19, 2017 10:32 AM

This is for the other “Russell”, regarding PasswordSafe in post May 8, 2017 11:42 AM. (I didn’t realize there was another Russell just before my last post about xkcd, or I would’ve added the “C.” I have above!)

In your first case, I store such “memorable information” in the notes for an entry. That does require being able to read that, which PasswordSafe for Windows won’t do unless you open the entry for edit (last version I tried, 3.33). PasswordSafe for Android (and Blackberry) do show you both the notes and the password (if you tap it) in view mode (version 6.8.1).

In the second case, you are correct, you have to open the entry for edit to see the password and PIN if you’re using the Windows version of PasswordSafe. I suppose that’s a pain, but it doesn’t bother me; ymmv.

With these caveats, I have not found a better system, or even one I trust more. I’ve been using it since shortly after it was published, and I use it for all my clients as well (separate databases for each).

Hope that helps.

Anthony Maw August 2, 2017 2:19 AM

In Microsoft Windows and probably other operating systems you can also use characters that are neither alphanumeric nor among the special ASCII characters.

For example, an element in a password can be the copyright symbol © (ALT+0169) or the trademark symbol ™ (ALT+0153).

I wonder how well the password crackers will fare on a string of 12 characters including alpha numerics + special characters and + ALT characters ??

Try it!

Liquid September 19, 2017 3:49 PM

Great comment Anthony!

For example, password “vb454A1™©-PW” has 147-bit security.

[snipped by moderator; smells like product promotion]

Clive Robinson September 19, 2017 4:39 PM

@ Liquid,

Re “s[***]s” white paper.

It appears the author does not understand that a hash or true one way function does not increase the entropy of the input at the output.

[product name deleted by Moderator]

Liquid September 19, 2017 6:20 PM

@ Clive,

Your comment is irrelevant. What’s your point?

Hash entropy is clear. The perfect one-way function does not exist! Do you understand the principle of PBKDF2?

Are you a mathematician or cryptographer? Or the NSA agent?

Clive Robinson September 20, 2017 3:47 AM

@ Liquid

Your comment is irrelevant. What’s your point?

That tells me three things.

But first do you understand the difference between entropy and work factor and why it’s important to the likes of one way functions?

Secondly why are you not using your own name… Or should I ask what your connection is to “s[***]s” and its white paper?

[product name deleted by Moderator]

Wael September 20, 2017 6:07 AM

@Clive Robinson,

I couldn’t find any names on the link either.

@Liquid,

Steganography, for hiding information inside the picture so thats it is impossible to detect this

How can you assert that? Pen testing?

Moderator September 20, 2017 10:42 AM

@Liquid, you are rude, and you appear to be promoting a commercial product while concealing your association with it. I have therefore deleted references and links to that product in your post and posts responding to it. Please take your attitude and your crypto-sales-pitch elsewhere.

BatteryStaple October 29, 2017 7:12 AM

Only two things are provably secure: randomness and length. Randomly choose a long passphrase from a long wordlist, e.g. 10 words from a list of 10,000 words. That’s secure.

Clive Robinson October 29, 2017 8:21 AM

@ BatteryStaple,

Only two things are provably secure: randomness and length.

Neither of which the human brain was designed for. Which gives rise to a problem, that humans will use a system that gives both, but then they will mung the results to something they can remember as easily as they can…

In this respect humans are their own worst enemy…

To get around this a little bit you can have say five words to form a pass phrase. But use different word lists for nouns, adverbs and connectives, such that the words can form a simple sentence.

The NSA had a system years ago that built faux but pronounceable words for a single password; their lists were vowels and consonants, and word format templates.

Eric December 20, 2017 11:45 AM

Clive:

Generated pronounceable passwords should be considered to be far less secure than they would appear.

Presumably, there is a list of syllables that it chooses from. If an attacker suspected that the passwords came from such a generator, it could be relatively straightforward to try every combination of syllables from the list up to whatever length is desired.

For example, if your password was promodcloon it would appear to be 3,670,344,486,987,776 possible passwords (26^11) to try in a brute force attack but any attacker having reason to believe that it was generated from a password generator using a list of 200 well known syllables to create a three syllable password would only have to try 8,000,000 passwords to break it.

In contrast, three words randomly chosen from a dictionary of 20,000 words would require 1,000,000 times as many combinations to try.

Even better, do plays on words as part of your combination. For example, instead of using “quacksalver” as one of your words, use something like “quacksalverology”.
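The comparison in this comment is easy to check. The sketch below uses the figures given there (11 lowercase letters, 200 syllables taken three at a time, 20,000 dictionary words taken three at a time); the helper name is only illustrative.

```python
def combos(pool_size: int, picks: int) -> int:
    """Ordered selections with repetition: pool_size ** picks."""
    return pool_size ** picks

naive_letters = combos(26, 11)     # what a blind brute force would assume
syllable_space = combos(200, 3)    # what an informed attacker has to search
word_space = combos(20_000, 3)     # three random dictionary words

print(f"{naive_letters:,}")          # 3,670,344,486,987,776
print(f"{syllable_space:,}")         # 8,000,000
print(word_space // syllable_space)  # 1000000
```

The point carries over to any generator: the search space is set by how the password was produced, not by how long the result looks.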

justinacolmena March 25, 2018 5:35 PM

    $ head -c12 /dev/urandom | base64

This is my favorite technique. It encodes 12 bytes of randomness to base64 as a 16-character password. The allowed characters in base64 are

    10 0-9
    26 A-Z
    26 a-z
   + 2 +/
=======
    64

A full 96 bits of entropy, but even that is difficult to remember, unless you write it down or store it in a password manager. https://pwsafe.org/
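On systems without /dev/urandom or a base64 utility, roughly the same thing can be done with the Python standard library. This is a sketch of an equivalent, not the commenter's own tool; the function name is made up. Twelve random bytes carry 96 bits, and base64 encodes them as exactly 16 characters with no padding.

```python
import base64
import secrets


def random_password(n_bytes: int = 12) -> str:
    """Base64-encode n_bytes from the operating system's secure random source."""
    return base64.b64encode(secrets.token_bytes(n_bytes)).decode("ascii")


pw = random_password()
print(pw)
print(len(pw), "characters,", 12 * 8, "bits")   # 16 characters, 96 bits
```

Byte counts that are a multiple of three avoid trailing "=" padding in the output.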

PS June 11, 2018 6:40 AM

A while ago I’ve created a mac app that generates passwords. It’s called quickpassword and one of the options is “dictionary style”. I have used a dictionary file which creates random combinations, and it’s surprising to see how many words it uses that you know but never use.
The other option is to create a large random string, but I hardly ever use that myself because of the tedious work when you need to re-enter it.

Ludovic F. Rembert April 21, 2019 7:24 AM

I typically advise clients and their employees to use the exact model suggested by commenter ‘justinacolmena’ above. Of course, 96 bits are impossible to remember for the average Joe and Jane. I tend to side with Bruce that open source tools are preferable to proprietary counterparts. However, in the wake of the 2015 KeeFarce debacle (https://www.tomsguide.com/us/hacker-tool-keepass,news-21782.html) I have moved all clients over to a customized version of PWS. To date, the tool hasn’t been “hacked” afaik. Contrast this to the popular commercial alternatives, a few of which have been hacked like KeePass.

Ray B Morris November 14, 2020 9:32 PM

Bruce didn’t do the math in this one, and it shows. I’m sure if he’d taken a few minutes to do the math he wouldn’t have posted it. The math is that the XKCD method is 8,192 times stronger than the Schneier method.

Schneier says “password crackers are on to this scheme”.
It doesn’t MATTER if bad guys know that some people choose four random words. They also know that password managers will generate 10 completely random characters. Attackers know about AES and scrypt too. That doesn’t mean you shouldn’t use AES and shouldn’t use random characters.

By saying attackers “are on to this scheme” he is actually arguing for security by obscurity. He knows that we should ALWAYS assume that the opponent knows the scheme. Assuming the opponent doesn’t know the scheme is security by obscurity (bad). The entire field of security is about coming up with schemes that work REGARDLESS of whether it’s widely known or not.

Schneier has made no argument that the XKCD scheme is ineffective. He’s only said it’s well known. Yeah AES and SHA384 are well known too. Widely known as being extremely secure.

Assume a word list of 100,000 words. That’s 16 bits of entropy per word. So four words, without any additional parts to the passphrase, is 64 bits of entropy. (You can easily add a few more bits with misspelling, dropping letters, and the like).

That is to say, if I know that you’ve chosen four random words, there are 18 quintillion 446 quadrillion 744 trillion 73 billion 709 million 551 thousand 615 combinations I need to try.

If I’m “onto your method”, I suspect that the password you’ve chosen is one of the 18,446,744,073,709,551,615 word combinations. Okay, what am I going to do with that? Try all 18 quintillion?

In the article, Schneier suggests 9 random LETTERS.
Each letter is just under 6 bits of entropy. Using nine letters, you get 51 bits of entropy. That is, Schneier’s method is 8,000 times WEAKER, even assuming that the attacker knows you used the XKCD method.

Selecting letters and numbers completely at random is 6 bits per character. Adding punctuation at the end is three bits. So 10 completely random letters and numbers, plus a punctuation mark, is 63 bits.

Four random words is stronger than 10 random letters and numbers, plus a punctuation mark. And four words can be easily remembered.
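The arithmetic in this comment uses rounded per-item figures. As a rough check, the sketch below computes the unrounded values, assuming every item is picked uniformly at random and that "letters" means the 52 upper and lower case letters (an assumption, since the comment gives only "just under 6 bits"). The function name is illustrative.

```python
import math


def entropy_bits(pool_size: int, picks: int) -> float:
    """Entropy in bits of `picks` independent uniform choices from `pool_size` options."""
    return picks * math.log2(pool_size)


four_words = entropy_bits(100_000, 4)   # four words from a 100,000-word list
nine_letters = entropy_bits(52, 9)      # nine mixed-case letters (assumed alphabet)

print(round(four_words, 1))     # 66.4
print(round(nine_letters, 1))   # 51.3
print(2 ** (64 - 51))           # 8192, the ratio from the comment's rounded figures
```

With the unrounded values the gap between the two schemes is somewhat wider than 8,192, because a 100,000-word list is worth about 16.6 bits per word rather than 16.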

Tom S November 18, 2021 10:04 PM

My biggest beef with a lot of websites, especially financial institutions, is that they limit password length. NIST SP 800-63B Section 5 states that sites should allow at least 64 characters.

Although the banks that I use don’t allow for a full 64 characters, the online brokers that I use do. I do not use the PasswordSafe generated passwords. I run MacOS and use the following to generate my passwords

head -c 256 /dev/random | openssl sha3-512 -binary | base64 | tr -dc A-Za-z0-9 | cut -b1-length

where I set length to the maximum that the site allows. If it is 64, then I am using “cut -b1-64”. I figure that breaking this is equivalent to finding a collision on SHA384. Breaking a 43 character password would be the same as finding a collision on SHA256. Neither are very likely in my lifetime. Each character yields approximately 5.95 bits of entropy due to “+” and “/” being stripped. Mathematically, you get a lot more bang for your buck by increasing the number of characters in your password than by increasing the set of characters that can be used.

I then store the generated password in PasswordSafe because there ain’t no way I’d ever remember any of my passwords. Btw… if a site requires symbols, then I simply replace the last character with a “!” and accept the loss of 6 bits of entropy.
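A similar alphanumeric password can be produced without the hash step by sampling characters directly. The sketch below is a different technique from the shell pipeline in this comment (uniform sampling with Python's secrets module instead of hashing and filtering), and the function names are illustrative; it does reproduce the 5.95 bits per character figure.

```python
import math
import secrets
import string

ALPHABET = string.ascii_letters + string.digits   # 62 symbols, no "+" or "/"


def alnum_password(length: int) -> str:
    """Pick `length` characters uniformly at random from the 62-symbol alphabet."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def entropy_bits(length: int) -> float:
    """Entropy in bits of a uniformly random alphanumeric password."""
    return length * math.log2(len(ALPHABET))


print(alnum_password(64))
print(round(entropy_bits(64), 1))   # 381.1
print(round(entropy_bits(43), 1))   # 256.0
```

The 43 and 64 character lengths land close to 256 and 384 bits, which lines up with the SHA256 and SHA384 comparisons made in the comment.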

Marcel December 26, 2022 7:51 PM

Why on earth would I use the Schneier method if I can just use the sentence that I derive the password from as a passphrase? So a sentence such as “I take the bus to work every other Monday” cannot be cracked like ever. At least as long as we still believe in math over hearsay. 200.000 ^ 9 is a lot.

Even if there is a linguistic trick to only choose valid grammar structures I hope this will not reduce down to less than 200.000^5 which is still around 3 * 10^26 which we call in my language “genug”.

John Gonder January 15, 2023 7:15 PM

The problem with this is when people do things on phones and pads where it takes going back and forth among 3 keyboards to enter complex passwords. They will certainly choose something they can actually type reliably on that device. In this case I still like /LONG/ memorable [to you] unique passphrases – long being good.
Add other things if you like but, e.g. bank: gavEmeaStupidPenAndLostMypaycheck
Of course I always tell friends to /not/ bank on their phone.

Clive Robinson October 2, 2023 6:47 AM

@ idk,

“Legit what’s wrong with the xkcd method”

Mainly three things,

1, Low entropy density
2, Humans mung the output
3, Systems won’t use long strings correctly.

The XKCD method has an “alphabet” of say a thousand words made from the common human spoken English words.

If you consider the entropy on a character by character basis you will see the entropy is very small, and smaller still on a bit by bit basis of say ASCII.

Worse, humans will reduce the entropy further. Two simple ways they will do this are,

1, Rearrange the word order they are given.
2, Keep generating a new set of words till they find a set they like.

And some will do both.

But you also have to consider how the system the password is used on stores the passwords…

Put simply the amount of storage you give each user does not matter if you have less than a hundred users, but what if you have a hundred million users? Then storage space becomes critical, and long passwords either are not allowed or end up being compressed in some way.

As a rough rule of thumb compression is very hard to do in a lossless way when you are looking to get a short binary array as a result. The likes of hashing as used for documents that are to be signed, are not particularly fast and in the past software developers have taken short cuts and used inappropriate algorithms.

Such systems develop “favoured pet status” so a real dog of an algorithm gets used over and over and has a life span you would not expect…

Such algorithms have all sorts of problems and the entropy can vanish like early mist on a summer morning.

So at the end of the day, the problem with authentication by “passwords” of any construction is “humans”, how we end up solving this may not be any better, after all think about the equivalent of password changing when a chunk of that password is a bio-metric… I’d suggest not using your thumb print, as its removal would make drinking coffee in the morning and eating a sandwich for lunch difficult if not messy…

Winter October 2, 2023 7:32 AM

@legit

Legit what’s wrong with the xkcd method

As @Clive lists.

To simplify the argument.

The xkcd method lets you choose a number of words, eg, 4, from a list of, say, 1000 words. Although you might end up with a long password in number of characters, it is still only a list of 4 symbols. Each symbol corresponds to just 10 bits of entropy. So, a 4 word password has 40 bits of entropy. That is equivalent to the strongest 7 character password you can create. No one would advise you to use a 7 character password.

If you want a moderately “secure” password with 90 bits of entropy, equivalent to the strongest 15 character password, you would have to randomly select at least 9 words. Note the word randomly, that is by throwing dice or such thing.
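The word counts above follow from treating each word as about 10 bits. A minimal sketch of the calculation, together with a generator that picks words with a cryptographic random source in place of dice, is below. The function names and the tiny word list are placeholders for illustration only; a real list would have a thousand entries or more.

```python
import math
import secrets


def passphrase_bits(list_size: int, n_words: int) -> float:
    """Entropy in bits of n_words chosen uniformly at random from the list."""
    return n_words * math.log2(list_size)


def words_needed(target_bits: float, list_size: int) -> int:
    """Smallest number of random words that reaches target_bits."""
    return math.ceil(target_bits / math.log2(list_size))


def make_passphrase(wordlist: list[str], n_words: int) -> str:
    """Choose words independently and uniformly; never rearrange or re-roll."""
    return " ".join(secrets.choice(wordlist) for _ in range(n_words))


print(round(passphrase_bits(1000, 4), 1))   # 39.9
print(round(passphrase_bits(1000, 9), 1))   # 89.7
print(words_needed(90, 1000))               # 10

demo_list = ["correct", "horse", "battery", "staple"]   # placeholder list only
print(make_passphrase(demo_list, 4))
```

Because a 1000-word list is worth slightly under 10 bits per word, nine words come in just short of 90 bits and a strict target needs a tenth; with a 1024-word list nine would suffice.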

