Schneier on Security
A blog covering security and security technology.
May 31, 2006
From a list of 100,000 passwords for a German dating site, we learn that 123456 works 1.4% of the time and that 2.5% of all passwords begin with 1234.
Posted on May 31, 2006 at 2:17 PM
• 57 Comments
Apart from the '1234's, the other common passwords listed almost all appear to be (modulo my rather limited German), unsurprisingly, swearwords, personal names and/or proper nouns.
I'm slightly intrigued by the inclusion of 'Frankfurt', though - is the dating site connected with that city, or is it the Germanic equivalent of the Marx Brothers' observation that "The password is always 'swordfish' "?
I don't think the site is connected to that city, but Frankfurt and the surrounding cities are quite large. So perhaps some people just picked the password because they live there.
I wonder how many '123456' accounts are created by advertising bots.
One recalls that "12345" was the combination to Planet Druidia's air shield and President Skroob's luggage in the movie Spaceballs.
Funny -- on the sites I run, "password" is used for about 1/2% of all accounts...
(And no, we don't store raw passwords; we store hashes, but the hash of "password" doesn't change. Good argument for something like time-dependent salting or...)
uhmm, maybe I should change my password... :-)
I am running a German-language (non-technical) forum with phpBB, which also stores passwords as MD5. For fun I looked at how often those passwords were used by our users (several thousand):
frankfurt (just a single user, rounds to 0%)
I also did a quick group by and indeed 123456 is (by a margin) the most common password.
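A rough sketch of the kind of "group by" count described above, assuming unsalted MD5 hashes are all that is stored. The sample data and the candidate list are made up for illustration; they are not the forum's real figures.

```python
import hashlib
from collections import Counter


def md5_hex(password: str) -> str:
    """Unsalted MD5 of a password, as hex (the storage scheme discussed above)."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Hypothetical stand-in for the password column of a user table.
stored_hashes = [
    md5_hex(p)
    for p in ["123456", "hallo", "123456", "frankfurt", "123456", "hallo"]
]

# Equivalent of: SELECT hash, COUNT(*) ... GROUP BY hash
counts = Counter(stored_hashes)


def share_of_accounts(candidate: str, counts: Counter, total: int) -> float:
    """Fraction of accounts whose stored hash equals the candidate's hash."""
    return counts.get(md5_hex(candidate), 0) / total


# Hypothetical list of guesses to test against the stored hashes.
candidates = ["123456", "hallo", "frankfurt", "password"]
shares = {c: share_of_accounts(c, counts, len(stored_hashes)) for c in candidates}
```

Because the hashes are unsalted, equal passwords give equal hashes, which is the only reason a plain count like this works.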
It seems to me we shouldn't overlook the possibility that users of dating sites might be somewhat less scrupulous in their password selection than they are when setting up an account with an online banking site.
There are many sites where you can decrypt md5 hashes, and they should know that md5 is so weak. I always use blowfish 448bit encryption for passwords stored in a database.
About the common passwords: The way to avoid this problem is to build a password check that shows the strength of the password, and to accept only combinations of different letters and numbers. But in this case (md5) that is futile.
In my opinion one should never let a user pick a password. Better to construct one with random chars, and encrypt them properly.
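The strength check suggested in this comment could be sketched roughly as follows. The thresholds, the short list of common passwords, and the function name are assumptions chosen for illustration, not a recommended policy.

```python
# Hypothetical blocklist; a real one would be much longer.
COMMON_PASSWORDS = {"123456", "12345", "password", "ficken", "frankfurt", "qwerty"}


def is_acceptable(password: str, min_length: int = 8, min_distinct: int = 5) -> bool:
    """Accept only passwords that mix letters and digits and are not trivially common."""
    if password.lower() in COMMON_PASSWORDS:
        return False
    if len(password) < min_length:
        return False
    has_letter = any(ch.isalpha() for ch in password)
    has_digit = any(ch.isdigit() for ch in password)
    # Require several different characters, so "aaaaaaa1" does not pass.
    enough_distinct = len(set(password)) >= min_distinct
    return has_letter and has_digit and enough_distinct
```

A check like this only filters the worst choices; it says nothing about how the password is stored afterwards.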
Don't forget the French. I have been told that the most common password in France is "bonjour".
It would be interesting to know what the more common passwords are in American English and British English.
Amusing but not necessarily meaningful. No reason to choose a strong password for an online dating site.
How is Blowfish stronger than md5 as a password hash? Building rainbow tables (list of hashsums for common passwords/dictionary entries) should be about the same on decent hardware. And MD5 isn't "broken" (hash => password) yet, afaik.
The only measure that helps in my opinion is something like salting that creates different hashes for the same password.
By "md5 decrypt" I meant comparing. Sure, it can't be decrypted, but that's not what I meant in my post. Heh...
Well, identity theft can be a serious threat if all information about you is being logged, including bank accounts and permission to withdraw money from them, if that permission is granted in the form of a "status" in a database field. And what about chat history, which is commonly logged: can it be a tool for blackmailing people? And the password they use may also be the one for their email account, and maybe PayPal.
Information is power, however little of it you have.
md5 "so weak"? For a dating site??
If you assign random passwords, you ensure that they will be forgotten instantly, thereby increasing your support costs, or written down instantly, thereby increasing your support costs.
It gets confusing when I make that mistake *sigh*. Comparing them to online tables is what I meant, but making MD5 collisions is also possible, though it takes more time. That is what I meant by it not being secure enough, and I consider it too weak. For me it is the same as storing the password verbatim in a table.
As standard I always use 448-bit Blowfish encryption wherever "real human information" is being stored. The protection of information is one of my highest priorities in designing code, to ensure a strong level of safety for the users and for myself, no matter how insignificant the end user's application may be. It may be paranoia, but that's the way I work.
I guess my question would be why is it so easy for these guys to decrypt the password?
I never store the password in any system, only a hash that is well salted. Again, I do not use MD5, but even MD5, well salted with an individual salt for each user (usually based on user name, email, and the unique GUID generated for each user), would mean that even if two users had the same password, their hashes would be nowhere near the same. I guess I just have issues with some admins having the ability to decrypt users' passwords, if they are even encrypted. Doesn't anyone else have a problem with this? Yes, I could write a test for common passwords like 12345, since I know how to salt them and what the key and IV are, so I could go through every user and run a test against their hash. But to be able to tell you exactly what their password is without running brute force? I look at this and see a bigger issue than there being common passwords: the issue is that they can tell.
Why are these systems storing passwords as plain hashes? I sincerely hope nobody hacks into those systems. It's so much easier to figure out a password when you can do so for all accounts at once rather than one at a time.
While it is most likely that this German dating site simply stores their passwords in the clear to simplify code and support, it is possible that they do in fact store their passwords securely using hashes and secure salting techniques.
To generate this data one would only need to have a system that tracks expired passwords in cleartext, or something that tracks frequency of use of these passwords as they are stored.
Securing this tracking data is then the problem or if cleartext password storage is used, then you just need to trust the admins. None of these are big leaps for most large organizations when non-monetary applications are involved.
Yeah, that's the biggest fear I always have. And it comes down to the weak link, which is the admin who has root.
If you have a favorite equation from chemistry, mathematics, or physics, using it for a password is a good way to get the top row of your keyboard involved without sacrificing memorability.
You may not be able to retrieve the original password (MD5 Hash -> Password), but you can retrieve a password equivalent.
In other words, if you can find a string whose hash collides with some unknown string's hash, your string can substitute for the other string wherever only the hash of the string is used (ie password verification).
In most MD5 password systems, you don't need the original password. All you need is a password string whose hash is the same as the user's password's hash. Voila, you can log in as them.
Of course, it's better to retrieve the original password, because the likelihood that the user used that password for multiple purposes is pretty high.
I know, that's why I'm further ahead than when I use MD5. So for now it is more secure, and that is my point.
> I guess my question would be why is it so easy for these guys to decrypt the password?
I don't read German, but just eyeballing the results I'm guessing that they don't decrypt the password at all, they're just running dictionary attacks against their userbase.
Password data of Flirtlife.de compromised
On Sunday evening, a list with login data for approximately 100,000 users of the Internet dating website Flirtlife.de was published on the security mailing list Full Disclosure. By its own account, the operator reacted on Monday morning by assigning new passwords to the accounts and informing users of the measure by e-mail. To reactivate an account, the lovebirds must now set a new password. In their own interest this should of course not be the old password again, since that must now be regarded as known.
Flirtlife managing director Matthias Kopolt told heise Security that the list probably reflects an old state of the password database, since about half of the user names no longer exist. At most about a quarter of the current 200,000 accounts would therefore be affected by the publication. How the account data became public is still a mystery to Kopolt. Flirtlife stores passwords exclusively as so-called MD5 hashes, so they cannot be read out in plain text. He assumes that the attackers did not obtain the data through a vulnerability in the Flirtlife site. The in-house investigation is, however, still under way.
Meanwhile, a look at the password list reveals interesting insights. Parts of it are interspersed with HTML- or XML-like structures. BODY tags, which appear at the beginning and end of every web page, occur several times, suggesting that the list could be the result of several collection runs. Obvious transposed letters in account names point to the logging of repeated, unsuccessful login attempts. The attackers may also have tried to log in to as many accounts as possible with a password from a list of particularly common passwords.
Ignoring typos in account names as well as differences in upper and lower case, the following distribution of the most frequent passwords results for the published list, with a total of approximately 100,000 records:
The strong clustering around the number sequence 1234 is remarkable: approximately 2.5 per cent of all passwords begin with it, and a blind login attempt with "123456" succeeds in nearly 1.4 per cent of cases. A password tied too closely to the site's theme, or a typical first name or pet name, could likewise prove disastrous. Also popular: brand names, sports clubs and years. Nevertheless, approximately 40 per cent of the passwords appear only once, and most of those are unpredictable combinations of letters, numbers and special characters. Flirtlife users should take these as an example when choosing their new password.
@Carlo Graziani -
While using a favored equation is far better than 123456, anyone who knows a bit about you could look up the 10 or so most favored equations for your field, and try them.
I think that the optimal way to create a password is to make a randomly generated one.
Some additional information from the article for all non-German speakers:
A file with the 100,000 passwords appeared on a security mailing list.
The article states that the file with all the passwords shows some traces of HTML or XML (several body tags etc.), so the passwords have probably been collected in several attacks and just quickly copied and pasted into a single file.
Furthermore, it contains many passwords/usernames with typos, so it seems like the files have not been stolen from the database, but been harvested by somehow logging log-in attempts.
Whether and how the passwords were stored hashed thus doesn't seem of much relevance in this case.
Oops, Jungsonn did a faster and more thorough job. :-)
MD5 is old news.
You should always use MD11.
'cause, well, it goes to 11!
At one place I worked a very high percentage of the passwords were "changeme". Most of those were in marketing. Over 45% of the userbase fell to crack on old 1998 hardware in under an hour. This included almost all of the IS management, HR, and engineering management. Only HR fixed their passwords quickly. I was later told to ignore the results for "liability reasons". (Liability is the first copout of management.) People use bad passwords normally, but so do people who should know better.
What I found interesting is the fact that the file had multiple entries of the same account/password combination: a total of 343062 entries but only 111773 unique entries. I would guess the guy had some kind of trojan horse (in the original sense) that logs all valid user/password combinations on login. So every time someone logs in, an entry is added to the log. Just a thought for the guys who have to audit the system.
It is of course true that any non-high-entropy password could be guessed, in principle, particularly with knowledge of the individual.
However, as usual, security calculus involves a balance of threat level against loss of convenience.
If the pass(word|phrase) is protecting information of great value (Trump's bank records, illicit love letters, spy names, missile launch codes, etc.) then there exists an incentive for determined, able, and well-equipped opponents to spend unbounded amounts of time attacking that authenticator, so it better be high-entropy.
If, on the other hand, there is no reason for anyone to believe that there is sky's-the-limit-value data to be obtained, then it is reasonable to assume that the password won't have to resist a high-level personalized attack, since potential attackers are presumably conducting cost-benefit analyses of their own. Resistance to standard dictionary attacks is all that is required of a password protecting such data. The benefit of not having to remember 50-digit alphanumeric soup is well worth the small security risk thus incurred.
Incidentally, the attractive feature of equations is that unlike dictionary words, there is no single correct way to "spell" them. The variable names can be arbitrary, and many operators and relations can be represented in multiple ways in ASCII. And, there are a *lot* of them, if you think about it, many of them deeply obscure. A dictionary attack would probably not be a high-percentage shot.
Hmm i keep reading:
"What does it matter, it is a dating site" kind of tone. Let me say that phishing attacks are based solely on that argument. If they can steal $5 from 1 million people with a few mouse clicks and a good phishing scheme, they end up near the same figure as when they hit the overprotected vault of a bank.
And no one misses $5, and it is too little for the police to start an investigation.
I can imagine that when one signs up to a dating site, all personal information could be logged: email and password (which may be used in cracking their email or PayPal account, and to reach further and deeper), and maybe enough to steal their complete online identity, to start other accounts on different websites with their information, for which they then get billed. IMHO one must never underestimate such acts, or systems not built to protect your data well enough.
You should also note that the second most used password was 'ficken', which is German for 'fucking' (dating site should be seen as a euphemism, I guess).
So the password is very close to the 'purpose' the people attribute to a site.
Overall this proves once more that the responsibility to choose and protect a password cannot be placed in the hands of average people. Complex passwords, on the other hand, tend to be written down. And I won't even begin to mention the implications of social engineering.
There is no real solution to this problem (two-factor authentication is difficult to do for websites). I think we just have to deal with the fact that passwords are nothing that will stop someone in the end.
Maybe a dating site is a bad example, _but_ it shows how much thought the average person spends on passwords and the implication of weak passwords: zero.
yes that's the outcome of all this, and i'm pretty amazed about the simplicity of password choice. And yes: In the end one can do little to stop one with real intentions of accessing the system.
At least there are a few counter measures one can make.
The list i've come up with:
1. Create passwords, chosen by a script.
2. Use stronger encryption, or salted MD5
4. Build a timeout function.
5. Build a script which allows a maximum try, say about 10 times, and then the account will be locked.
6. Never, ever send password through sendmail or other mailscript.
7. Signup through an ssl connection
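As a rough illustration of the lockout item in the list above, here is a minimal in-memory sketch. The class name and its methods are hypothetical; a real site would persist the counters and would probably combine this with the timeout item rather than locking accounts forever.

```python
class LoginThrottle:
    """Lock an account after a fixed number of consecutive failed logins."""

    def __init__(self, max_attempts: int = 10):
        self.max_attempts = max_attempts
        self.failures: dict[str, int] = {}  # username -> consecutive failures

    def is_locked(self, username: str) -> bool:
        return self.failures.get(username, 0) >= self.max_attempts

    def record_failure(self, username: str) -> None:
        self.failures[username] = self.failures.get(username, 0) + 1

    def record_success(self, username: str) -> None:
        # A successful login resets the counter for that account.
        self.failures.pop(username, None)


throttle = LoginThrottle(max_attempts=10)
for _ in range(10):
    throttle.record_failure("alice")
locked_now = throttle.is_locked("alice")
```

Note that a per-account limit does little against an attacker who tries one common password such as "123456" once against every account, which is the pattern the article describes.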
I see a lot of people surprised at weak security at this site, but I'm not surprised at all. Several times recently I've signed up to a new site, only to have them send a "confirmation" email back to me **including my password in clear text**!!!
When this happens, I generally send a nastygram back to the site, asking that they stop this practice, and to remove me from the user database, because I don't want my info on a system maintained by people who obviously have no clue about security.
I'm not tech-savvy enough to understand much of what is said above. Are there any big problems with using a program like Password Safe? Thanks... ob3
Good points. I tend not to think of equations as particularly secure simply because my personal memory for equations is far better than it is for words.
I guess my only other note is that I'm assuming an attacker with knowledge of the person in question, whereas you appear to be assuming otherwise.
just because blowfish doesn't have rainbow tables doesn't mean anything! it will have soon!
>> No it won't. The new blowfish key schedule based hashing algorithm has a 128 bit salt. Storing a single bit of the hashes for even ONE password is unthinkable.
I see two problems with these percentages:
- The first, which has already been talked about, is that the person who got the PW list might have used a dictionary or something and thus might have missed some common passwords (maybe unlikely).
- The second is that my own experience shows me that a large number of PWs are a function of the username. I still haven't found any real statistical data about that. Anyway, using the username as the password works like 10% of the time if the site doesn't forbid it.
Btw, I also *think* that I managed to devise a few rules of thumb to find which users have a better probability to have a weaker password based on their username.
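A site that wants to forbid passwords that are "a function of the username" might start with a check along these lines. The particular transformations tested (same string, reversed, digits added at either end) are assumptions for illustration; they are not the commenter's rules of thumb, which are not given.

```python
def derived_from_username(username: str, password: str) -> bool:
    """True if the password is the username, its reverse, or the username plus digits."""
    user = username.lower()
    pw = password.lower()
    if not user:
        return False
    # Drop digits from both ends, so "alice123" and "1alice" reduce to "alice".
    pw_without_digits = pw.strip("0123456789")
    return pw == user or pw == user[::-1] or pw_without_digits == user
```

For example, with a username of "alice" this flags "Alice", "alice123" and "ecila", but not an unrelated password.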
>>No it won't. The new blowfish key schedule based hashing algorithm has a 128 bit salt. Storing a single bit of the hashes for even ONE password is unthinkable.
Is this true?
if it is i did not know, but happy about it for sure and building pretty safe:)
There is a posting on modernlifeisrubbish.co.uk which shows popular passwords in britain. The posting is dated 5/26/06.
monkey and qwerty are pretty funny
Re: the list
Where does one find that list of 100K passwords? Or, did I miss something?
re: oB3 - When this happens, I generally send a nastygram
Or, you could just choose to use a disposable username and password and assume that they will leak out somewhere, somehow.
Of course, this begs the question about how careful the site is. Some sites track the IP of each user.
Then, you use a proxy server to isolate yourself.
Of course, some sites "foot print" your Web browser as part of the identification and authentication and tracking process.
Then, you use an outbound proxy that fixes all of this...
And so on.
Re: URL for the German page
ooops - Jungsonn posted it already (post: "Translated Version")
Wait, wait, waaaaaaaaaitaminute there!
Correct, I placed it as an option: to enable client-side encryption from the form to the server-side processing script, where other encryption can take place. It only builds a relative level of security through obscurity while posting. It is obvious that it should not be a standalone app. ;)
I divide sites into those for which passwords are an annoyance for which I cannot imagine a security purpose, such as newspaper sites, and the like, which require logins and passwords; and into those which have an apparent security purpose, like my financial institutions, and so on.
For sites that have no information that I want to keep confidential, I use whatever word or phrase is convenient.
But for other sites that really do have information I want to keep confidential, I use random passwords.
I have a bunch of dice that I throw repeatedly, and I record the points. I also have a 6x6 tableau consisting of A-Z and the ten digits, not necessarily in order. I use the points on the dice as x,y coordinates, and when I get enough for a long enough password, I look up the coordinates in my tableau. That's my password, and it is different for each important web site, so that if my bank betrays it, my other sites are not compromised.
If the web site requires mixed caps and lower case, I toss a coin for each character I generated. Heads, the character is a cap; tails it is lower case.
For years folks have looked at me in horror when I suggest this system. It's too hard to remember. But it turns out, it isn't. Practice a few times, and the random password is committed to one's unconscious memory. Even if I don't know what the next character is, my fingers do.
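The dice-and-tableau procedure described above can be simulated in a few lines. This is only a sketch: it swaps the physical dice and coin for Python's `secrets` module, and the function names and the default length of 12 are assumptions.

```python
import secrets
import string

SYMBOLS = string.ascii_uppercase + string.digits  # 26 letters + 10 digits = 36


def make_tableau() -> list[list[str]]:
    """Build a 6x6 grid holding A-Z and 0-9 in a random order."""
    symbols = list(SYMBOLS)
    secrets.SystemRandom().shuffle(symbols)
    return [symbols[row * 6:(row + 1) * 6] for row in range(6)]


def roll_die() -> int:
    """One die roll, returned as an index 0..5 instead of points 1..6."""
    return secrets.randbelow(6)


def generate_password(tableau: list[list[str]], length: int = 12,
                      mixed_case: bool = False) -> str:
    """Two rolls pick a cell; an optional coin toss decides upper or lower case."""
    chars = []
    for _ in range(length):
        ch = tableau[roll_die()][roll_die()]
        if mixed_case and secrets.randbelow(2) == 1:  # "tails": lower case
            ch = ch.lower()
        chars.append(ch)
    return "".join(chars)


tableau = make_tableau()
password = generate_password(tableau, length=12)
```

Each character drawn this way is one of 36 equally likely symbols, roughly 5.17 bits, so 12 characters give about 62 bits before the optional case toss.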
Carlo Graziani writes very sensibly.
What is the asset that is being secured by the password?
If it's an asset of no value to you, why take effort to secure it? If the benefit to the attacker is zilch, why would he bother? And so there is low a priori probability of attack.
So what information does this German dating agency require that is a "valuable asset"? If all one types in is part of the advertisement of one's own personal attractiveness, you want it publicised anyway. Does the password secure knowledge of credit/debit card payment? Do users believe there is a serious probability of someone changing their advert to make them look silly or less attractive?
Like John "I divide sites into those for which passwords are an annoyance for which I cannot imagine a security purpose, ..."
Unless the information on poor passwords is related to the value of the asset protected, perhaps it's that the users know more about cost-benefit analysis of security than does the author of the article.
> perhaps it's that the users know more about cost-benefit analysis of security than does the author of the article
But if the password will be published, it will be linked to your name and photograph, so it might nevertheless be wise not to pick "ficken", "fuck", or "hitler".
BTW: to see the complete password list, just search full disclosure for "flirtlife.de"
It appears like the "1234..." passwords are more prevalent as computer/LAN passwords in the working place. People feel that a password for their office computer is unnecessary, and hence the result.
This is an interesting discussion, but I think it is missing the idea that you should select security protocols to fit the situation. I find it annoying when sites select a password for me. It is usually difficult to remember.
Often this is totally unnecessary. How secure does a password I use to post on a forum really need to be? The likelihood of someone attempting to hack the password is low. If someone does gain access the damage they can do is limited. To me, the loss of convenience is not worth the increased security in this case.
On the other hand it would be in the case of a login to a bank account. It is more likely to be targeted and would cause more damage if cracked.
Schneier.com is a personal website. Opinions expressed are not necessarily those of BT.