LLMs Generate Predictable Passwords

LLMs are bad at generating passwords:

There are strong, easily noticeable patterns among these 50 passwords:

  • All of the passwords start with a letter, usually uppercase G, almost always followed by the digit 7.
  • Character choices are highly uneven: for example, L, 9, m, 2, $, and # appeared in all 50 passwords, but 5 and @ appeared in only one password each, and most letters of the alphabet never appeared at all.
  • There are no repeating characters within any password. Probabilistically, this would be very unlikely if the passwords were truly random, but Claude preferred to avoid repeating characters, possibly because it “looks like it’s less random”.
  • Claude avoided the symbol *. This could be because Claude’s output format is Markdown, where * has a special meaning.
  • Even entire passwords repeat: in the above 50 attempts, there are actually only 30 unique passwords. The most common password was G7$kL9#mQ2&xP4!w, which repeated 18 times, giving this specific password a 36% probability in our test set, far higher than the expected probability of 2⁻¹⁰⁰ if this were truly a 100-bit password.
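The “no repeating characters” observation can be sanity-checked directly. This sketch assumes the generator is supposed to draw uniformly from the 94 printable ASCII characters, with all passwords 16 characters long like the sample above (both are my assumptions, not stated in the article):

```python
import math

ALPHABET_SIZE = 94   # printable ASCII -- an assumption about the intended charset
LENGTH = 16          # length of the sample password G7$kL9#mQ2&xP4!w
SAMPLES = 50

# P(all characters distinct) for one uniformly random password:
# (94/94) * (93/94) * ... * (79/94)
p_no_repeat = math.prod((ALPHABET_SIZE - i) / ALPHABET_SIZE for i in range(LENGTH))
print(f"one password repeat-free: {p_no_repeat:.3f}")   # ≈ 0.258

# If the 50 passwords were truly random, the chance that *every* one of
# them is repeat-free is vanishingly small (≈ 4e-30).
print(f"all {SAMPLES} repeat-free: {p_no_repeat ** SAMPLES:.1e}")
```

So a single repeat-free password is unremarkable (about a 1-in-4 event), but 50 out of 50 is essentially impossible by chance.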

This result is not surprising. Password generation seems precisely the thing that LLMs shouldn’t be good at. But if AI agents are doing things autonomously, they will be creating accounts. So this is a problem.

Actually, the whole process of authenticating an autonomous agent has all sorts of deep problems.

News article.

Slashdot story

Posted on February 26, 2026 at 7:07 AM

Comments

Matthias Urlichs February 26, 2026 8:26 AM

Heh. That’s not just an LLM problem. Humans do that too: we all know that the correct way to create a password is to fire up “pwgen”, or ask your password manager or whatever, no exceptions — but when we’re in the flow and need a quick password-ish string, we still resort to hitting a not-quite-random bunch of keys. Or just type “$ekriT1248”.

The real issue is that the distance between institutional memory (the LLM knows how a password should be generated if you ask it!) and short-term objectives is too large. Fixing this requires access to a tool — followed by training, to break the pattern of not using it. In fact, the frontier labs should probably just fix training input: replace all literal password-ish strings with instructions to do an MCP call.

Vesselin Bontchev February 26, 2026 8:31 AM

Programs designed to generate statistically likely words happen to generate statistically likely passwords. News at eleven.

a clown February 26, 2026 9:16 AM

Laziness and ignorance have their consequences.
There are places where you can order food, drinks, medications, etc. etc. so you do not have to get out of your car. These places were invented in the USA so people could have the convenience. This has only added to the OBESITY epidemic in the USA and many other countries that thought “if Americans are doing it – it must be good” (LOL!).
Add the High Fructose Corn Syrup and much of other GARBAGE that is legally being fed to all Americans, and many many other things that are actually cancer generating substances but hey, who the fck cares. If it makes a buck – Bring it on!

In the world of cybersecurity, there’s a price to be paid for taking shortcuts (laziness is sometimes also called “time saving measures” or “efficiency” or “productivity” or blah blah blah). The key thing here is knowing When and Where to resort to an App to do something for you that will be better, and more secure, than if you’d done it yourself the old, “slow” manual way.

Patrick Gill February 26, 2026 9:25 AM

LLMs used to be bad at arithmetic too. How long before a good LLM will know to defer to /dev/urandom when it needs entropy to make a password? This seems like a fixable problem.
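For what it’s worth, deferring to the OS entropy source is already a few lines in most languages. A minimal Python sketch (the 94-character alphabet and the rejection sampling to avoid modulo bias are my own illustrative choices; `os.urandom` reads from the OS CSPRNG, which is /dev/urandom on Unix):

```python
import os
import string

ALPHABET = string.ascii_letters + string.digits + string.punctuation  # 94 characters

def urandom_password(length=16):
    """Map CSPRNG bytes onto the alphabet, rejecting bytes that would
    introduce modulo bias (256 is not a multiple of 94)."""
    limit = 256 - (256 % len(ALPHABET))  # accept only bytes in [0, 188)
    out = []
    while len(out) < length:
        for b in os.urandom(length - len(out)):
            if b < limit:
                out.append(ALPHABET[b % len(ALPHABET)])
    return "".join(out)

# 16 printable characters: log2(94) * 16 ≈ 105 bits of entropy
print(urandom_password())
```

The rejection step matters: naively taking `byte % 94` would make the first 68 characters of the alphabet slightly more likely than the rest.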

Clive Robinson February 26, 2026 9:43 AM

@ Bruce, ALL,

Predictable is not Random and Random is essential to AI function

We used to call current AI systems “Stochastic Parrots”, implying a uniform random selection probability for phrases.

If an LLM system can not “do random” for passwords, then it calls into question the “random selection of phrases”. Which calls into question the rest of the LLM’s usage.

Which in turn calls into question the use of LLMs at all, and the other uses for which LLMs have been suggested…

jm February 26, 2026 10:00 AM

But if AI agents are doing things autonomously, they will be creating accounts. So this is a problem.

If that were the extent of the problem, the solution would be simple: delegate to a tool that uses a properly seeded CSPRNG to generate passwords when needed (as Patrick suggests).

The real problem is that any credential that is exposed to the model’s context becomes vulnerable to subsequent extraction via prompt injection. And even if you isolate the credential in tool configuration, a prompt-injected agent is still a Confused Deputy.

Rontea February 26, 2026 10:38 AM

Large language models, by design, optimize for pattern recognition and human-like output—not for entropy. When tasked with generating passwords, they produce predictable sequences and systematically avoid certain characters, creating a security liability for autonomous agents. This isn’t just about weak passwords; it’s a symptom of a deeper problem: authenticating non-human actors in a system designed for human credentials. Until we rethink how these agents establish trust, we’re layering brittle automation onto brittle security assumptions.

Clive Robinson February 26, 2026 11:37 AM

@ ALL,

A part of the quote from the article says,

“There are no repeating characters within any password. Probabilistically, this would be very unlikely if the passwords were truly random”

is actually not technically true.

There are two degrees of freedom in a random sequence, value and order position.

We normally think about the value being random, but actually we are more used to the order / position being random.

That is, think of a pack of cards: there are 52 unique cards, each with a different “value”, and there are no repeats. When we “shuffle the pack”, if we do it properly, then the order the cards are in is “random”, but there can not be any repeating values.

This statement from the authors of the article gives me pause to think about what else they have written that may be wrong… because they really should know this, but either they do not, or they have chosen not to mention it…

There are quite a few “Card Shuffling” algorithms; perhaps the most well known, for various reasons, is RC4, along with our host @Bruce’s “Solitaire”.
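Clive’s card-shuffle distinction — randomness in position rather than in value — is exactly what a Fisher–Yates shuffle produces. A short sketch, using Python’s `secrets` module as the CSPRNG:

```python
import secrets

def fisher_yates(deck):
    """Return a uniformly random ordering of deck. Every value still
    appears exactly once -- the randomness is in position, not value."""
    deck = list(deck)
    for i in range(len(deck) - 1, 0, -1):
        j = secrets.randbelow(i + 1)   # unbiased index in [0, i]
        deck[i], deck[j] = deck[j], deck[i]
    return deck

shuffled = fisher_yates(range(52))
assert sorted(shuffled) == list(range(52))  # no repeats, nothing missing
print(shuffled[:5])
```

A properly shuffled deck is “random” with no repeated values by construction — which is why “no repeats” alone proves nothing either way about a generator; what matters is which of the two degrees of freedom it claims to randomize.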

Matt February 26, 2026 1:39 PM

I normally avoid LLMs like the plague, but occasionally I’ll test something in ChatGPT (usually trying to break it or test injection attacks). Yesterday I was able to get it to generate a response of “random words”, and it continued outputting for almost 7 minutes straight, totalling around 25,000 words. Way more than its ostensible token limit.

Even funnier is that it kept generating the same set of 50 or so words over and over, thousands of times in a row. So the notion that it can’t generate random passwords is quite believable.
