The Security of Multi-Word Passphrases
We found about 8,000 phrases using a 20,000 phrase dictionary. Using a very rough estimate for the total number of phrases and some probability calculations, this produced an estimate that passphrase distribution provides only about 20 bits of security against an attacker trying to compromise 1% of available accounts. This is far better than passwords, which are usually under 10 bits by this same metric, but not high enough to make online guessing impractical without proper rate-limiting. Curiously, it’s close to estimates made using Kuo et al.’s published numbers on mnemonic phrases. It also shows that significant numbers of people will blatantly ignore security advice about choosing nonsense phrases and choose things like “Manchester United” or “Harry Potter.”
This led us to ask, if in the worst case users chose multi-word passphrases with a distribution identical to English speech, how secure would this be? Using the large Google n-gram corpus we can answer this question for phrases of up to 5 words. The results are discouraging: by our metrics, even 5-word phrases would be highly insecure against offline attacks, with fewer than 30 bits of work compromising over half of users. The returns appear to rapidly diminish as more words are required. This has potentially serious implications for applications like PGP private keys, which are often encrypted using a passphrase. Users are clearly more random in “passphrase English” than in actual English, but unless it’s dramatically more random the underlying natural language simply isn’t random enough.