Measuring Entropy and its Applications to Encryption

There have been a bunch of articles about an information theory paper with vaguely sensational headlines like “Encryption is less secure than we thought” and “Research shakes crypto foundations.” It’s actually not that bad.

Basically, the researchers argue that the traditional measurement of Shannon entropy isn’t the right model to use for cryptography, and that minimum entropy is. This difference may make some ciphertexts easier to decrypt, but not in ways that have practical implications in the general case. It’s the same thinking that leads us to guess passwords from a dictionary rather than randomly—because we know that humans both created the passwords and have to remember them.

This isn’t news—lots of cryptography papers make use of minimum entropy instead of Shannon entropy already—and it’s hard to see what the contribution of this paper is. Note that the paper was presented at an information theory conference, and not a cryptography conference. My guess is that there wasn’t enough crypto expertise on the program committee to reject the paper.

So don’t worry; cryptographic algorithms aren’t going to come crumbling down anytime soon. Well, they might—but not because of this result.

Slashdot thread.

Tags: academic papers, cryptanalysis, cryptography, encryption

Posted on August 21, 2013 at 7:01 AM • 34 Comments

Comments

Clive Robinson • August 21, 2013 9:06 AM

Yes the results do apply in some cases such as short lengths of “human” usable plain text. And this has been known since before WWII (and used incorrectly to continue the use of “poem code” hand ciphers long long after their “best before date, much to the cost of SOE agents lives).

As Bruce notes it’s current main use is to attack “human memorable” passwords and passphrases.

However some stream ciphers are known to have “slow start” issues where initial output statistics show charecteristics that can aid an attacker especialy when part of the “key” or “plaintext” is known or easily guessable to an attacker (which has happened with both network encryption and random access storage encryption).

Kevin O'Bryant • August 21, 2013 9:49 AM

FTA: “This assumed uniformity has many desirable properties, including maximum obfustication and difficulty for the inquisitor.”

Maybe we just don’t understand what “obfustication” means!

AC2 • August 21, 2013 10:00 AM

I did wonder what the big deal was about this, nice that Bruce clarified!

Dave Aronson • August 21, 2013 10:15 AM

@Kevin, it’s presumably derived from this:

http://www.merriam-webster.com/dictionary/obfusticated

Scott "SFITCS" Ferguson • August 21, 2013 10:20 AM

@Kevin O’Bryant
Maybe we just don’t understand what “obfustication” means!

Confusing don’t you think?

🙂 If only there was a way to look words up…

Peter A. • August 21, 2013 10:47 AM

@Kevin & Dave Aronson:

The MW wants me to open a premium account to look the word up, but Urban dictionary is more friendly:

http://www.urbandictionary.com/define.php?term=obfusticate

I just wonder if the two sources agree on the meaning…

Martin Walsh • August 21, 2013 11:00 AM

Crypto without Information Theory? That is really, really dumb. All this does it strengthen the image of crypto “experts” as people who make stuff like “the bits go in here and then they come out here and then these things go around and then …” because that’s all you have. A box of gears. No wonder encryption has dragged all of crypto into this stupid world of “oh, you can never be sure, no one can ever really, really know for sure.” So, you design another box of gears and then have to wait twenty years to see if anyone finds anything wrong with it, because you don’t understand Information Theory. Dumb. Thanks for pointing that out.

Thecaseforpeace • August 21, 2013 11:09 AM

I’m far more worried about the NSA’s password guessing capability rather than their mathematical attack on the algorithms. I think passwords will always be the weakest link so long as human memory stored passwords are used.

It seems likely that the NSA has made an art of attacking password and passphrase entropy.

HJohn • August 21, 2013 11:41 AM

When I taught a related class some years back, I illustrated encryption by comparing it to locks.

If your key is under the mat, the strength of your lock doesn’t matter. They don’t have to pick it.

Similarly, if your passcode is on easily guessed or on a sticky under your mouse pad, the strength of the encryption doesn’t matter. They don’t have to crack it.

Just yesterday, I downloaded a PDF of my 2001 master’s thesis that is still on my graduate university’s website. I had copy and print locked down to make it harder (not impossible) to copy. I could not remember the password. I had an idea…. I used a PDF printer to append a blank page to it, and all the functions were available in the new document.

Did I crack the password? Did it beat the encryption? No, I simply bypassed it based on a flawed implementation. Similar concept.

Best,
HJohn, CIA, CISA

Cinderella • August 21, 2013 12:05 PM

Thanks for clarifying for us noncryptographers.

melvin goldstein • August 21, 2013 1:46 PM

Numbers are the Supreme Court of science. However Godel proved that we may not prove everything. Science needs numbers. There must be Science and Physics Foibles!!

Gweihir • August 21, 2013 4:17 PM

I second the guess about the missing paper reviewer competence. Often researchers of dubious quality resubmit their papers to conferences only somewhat related after having been rejected by conferences where the reviewers actually understand the subject matter. Unfortunately, this strategy is often successful.

Over the years, I did reject a number of security and crypto papers submitted to a P2P conference, obviously submitted in the hopes no reviewer with the required expertise would be available. On the other hand, I could recommend very, very few papers in these subject areas for acceptance at this conference. Almost all were submitted in the hopes of not being recognized for what they were, namely bad, shoddy, insightless and sometimes plain wrong or fraudulent “research” papers.

Dirk Praet • August 21, 2013 7:34 PM

Thanks for clarifying, Bruce. Seems I’m not impacted too much since I stopped using “human memorable” passwords and passphrases a long time ago.

Aspie • August 21, 2013 7:41 PM

So even if I choose a password like “yN76vx4HVf” that’s not going to help?

I can remember these kinds of passwords and have about 30 in my head for various accounts, all generated using a simple algorithm. I can’t (and never do) write them down because I’ve learned to touch-type them in so I probably couldn’t be hit around the head to reveal any of them unless there was a keyboard in front of me.

Now, I’m probably confused (it is quite late) but is the point here that by not using Shannon entropy then passwords based on/containing dictionary words weaken the subsequent encryption rounds or the digest produced?

Gweihir • August 21, 2013 8:41 PM

@Aspie: The way I read this is that as long as your input (or password) is random and uniformly distributed, Shannon Entropy does apply.

After all, a dictionary attack against a random password is pretty useless. But people that routinely attack passwords (law enforcement, criminal hackers,…) have been very successful with dictionary attacks against passwords, as most people cannot handle random passwords. This may have sparked this piece of “research”.

I think the paper argues that if text is non-uniform, it becomes easier to attack in a ciphertext-only scenario (no surprise there….). If so, then this “research” would state the obvious as this has long been known and is pretty intuitive.

One typically used countermeasure is to compress before encryption so as to make what is encrypted closer to uniformly distributed. (Turns out that is not always a good idea. If the attacker can influence part of what gets compressed, he can make guesses about the rest of the data from the compressed size. See the “BREACH” attack against SSL. Also not really that much of a surprise to anybody that knows how data-compression works…)

gonzo • August 21, 2013 8:51 PM

@aspie…. You’ve got about 45 bits of entropy. Not enough.

I’ve come to rely on a combination system involving a 10 character token like yours, then a series of words including a foreign language word and a slang word, with some .

So for example, you might say:

yN76vx4HVf_krUnk_pOpsicle_dEcorazione!@#

That example there is over 200 bits of entropy and is a 40 character password, the max length for some of my applications.

If you remember the token at the front (or back, or middle, hey mix it up), then the rest is easy to come up with by way of mnemonic and convention tricks. For example, you may have a rule that for passwords created in 2013 your “filler” to get to 40 characters is !@#, for 2014 it might be ~~( And, you might have a rule that your words are separated by periods, or underscores, or spaces, or whatever. And that you’ll capitalize the first vowel of each word.

The point is that you’re frustrating brute force by forcing a searcher to try not just upper and lowercase letters and numbers (62 alternatives per password character), but you’re adding symbols, spaces, etc.

And the LENGTH of the password is super important tool.

Also mixing english words with foreign language words and slang words hampers dictionary searches.

Would you do this sort of thing for every password? No. But for highest security, you ought to have a system where you can both easily remember the password, and where an attacker cannot take any shortcuts to it.

Then you get to sit back and worry about the strength of the encryption algorithms!

Michael Moser • August 21, 2013 11:21 PM

I don’t know; William Tutte cracked the Lorenzo machine with his bare hands; all based on the frequency of repetitions within the cipher text.

http://www.historylearningsite.co.uk/william_tutte.htm

I any event, one can send a key to pseudo random number generator in the start of the encrypted message, and use the output of the thus seeded pseudo random number generator and xor it with each byte of the cipher text. This way there will be very few patterns to discern in the cipher text.

Michael Moser • August 21, 2013 11:42 PM

… also by hacking the Lorenzo cipher the allies got the German order of battle for Kursk and D-Day. So this was much more epic than the Enigma. and this all just by looking at regularities in the cipher text.

Kevin an Auditor • August 22, 2013 12:09 AM

As we’ve strayed back onto secure passwords (which Bruce covered in June: http://www.schneier.com/blog/archives/2013/06/ ). When I was too busy to post my own method.

In ancient Greece, orators memorized long speeches by walking their home or garden on a specific route, and assigning each piece of furniture or plant a point or topic to be covered. This can be adopted to generate very long, “memorable” passwords. It need not be one’s home. It could be an office or route to work. Further, the items may be alternately assigned numerical values, transposed (alpha or keyboard,) or summed and assigned a keyboard value. There are a lot^ of possibilities.
Example:
Copier (where A=27) 29+41+42+35+31+44 =222
Table (transposed -CAPS) = SZAKD
File Cabinet = total 372 (keyboard shift) =#&@
move and repeat (except next transposition is lower case +)

Salt to Taste

If you remember the items and the manipulation sequence, you can always work out the password. Eat your paperwork (I literally do).

(Also: Gain the clout in your organization to prevent any @hole from moving the furniture!)

Mike • August 22, 2013 1:30 AM

@gonzo – Small correction/clarification – I’m not sure how you’ve come up with your calculation of an entropy of over 200 bits, but for discussions sake, assuming you’ve calculated it based on the password length, this is an incorrect way to calculate true entropy. If you’re using a system that isn’t “random-character, random-character, random-character” then your entropy is provably/obviously less than the maximum attainable for any given length.

You list plenty of good ways to frustrate dictionary attacks certainly, but if you’re using a system, any system, then part of your security reduces to a security-through-obscurity scenario (ie. that the attacker has no knowledge of your system).

Whether or not this is practically relevant is one thing; it is certainly good advice on how to make a [currently] strong password; just be aware that the use of systems produce vastly weaker passwords on a per-length basis than random (ie. maximal entropy per length).

stine • August 22, 2013 2:28 AM

Can you comment on Steve (grc.com) Gibson’s UHE PRNG

Clive Robinson • August 22, 2013 2:41 AM

@ Aspie,

So even if I choose a password like “yN76vx4HVf” that’s not going to help

Sadley not realy, the entropy is very low and could be as little as 12bits if the keyboard used is “4 row” Qwerty as commonly found on mobile smart phones from the likes of Motorola “rock range”.

A few “4 row” keyboard observations will show why,

1, The keys are centered around the “easy middle” third.
2, 40% of The keys are from the upper alpha row
3, 40% from the lower alpha row
4, Only 20% of keys from the middle alpha row.
5, 0% from the space row.

When the ten key password is looked at as a row pattern you get “uluullumlm” which is UL, double U double L and U postfix M L postfix M. Which whilst easy to remember is also easy to guess. Likewise the use of letter&number shift is symetrical on the ten letter password as “-lnn–nll-” and the two shift blocks are inverses of each other. And yes there are other visable structures in the choice of keys when you look at columns such as the first key is the middle collumn its then right,miss,left,left, double left,right, double right,left,left can be viewed in “easy middle” thumb moves.

It is “human” regularity like this that blows the “unpredictability” out of the window, and makes guessing oh so much easier. The recorded history of the “Ultra Story” of Enigma usage shows “cillies” and other lazy operator behaviour gave very good “tips” as to the settings to try first.

Also you can be sure that if you are using a Smart Phone your network and thus the likes of the NSA know exactly which one so know the exact keyboard map and the likely two thumb human behaviour on it…

Winter • August 22, 2013 3:19 AM

“In ancient Greece, orators memorized long speeches by walking their home or garden on a specific route, and assigning each piece of furniture or plant a point or topic to be covered.”

Read “Moonwalking with Einstein: The Art and Science of Remembering Everything” by Joshua Foer

The “mental athletes” memorize stacks of playing cards etc. That makes for nice passwords. Shuffle a full stack of cards, memorize them and you create a password that is “guaranteed” to be log2(52!)~225.581 bits strong.

Obviously, it will be rather long and it will leak at some time. And it is an awful lot of work.

Layer_8 • August 22, 2013 4:31 AM

Remarque for the password discussion:

For every single application and service you use, you don’t know how secure your passwort is stored or handled. So you need different passwords and depending of the needed security it has to be long and complex enough.

Nobody is perfect, so it may happen that you tried to login with the wrong password. In this case please keep in mind, that the other side now has a working password of another system/app you use, too.

If you keep in mind the many techniques used to create profiles of your behavior (online and offline) and what online services you use, you should change the falsely used password as soon as possible (and keep your fingers crossed that it hasn’t be misused meanwhile).

If I would reveal the password of a concrete person (without access to the its client-system) and if I could access the list of any search request of this person (google, bing, etc.) this could help to limit the set of dictionaries to use for attacks, I guess. Even a person how lies professional reveals its true intentions/needs while using google & co because it’s counterproductive to lie on a search engine.

Clive Robinson • August 22, 2013 5:12 AM

@ Stine,

With regards Gibson Research’s “Ultra High Entropy” PRNG ( https://www.grc.com/otg/uheprng.htm ).

It is claimed that it passess the George Marsaglia randomness Diehard tests ( http://en.m.wikipedia.org/wiki/Diehard_tests ) which is the minimum it should do to be considered as a CS-PRNG candidate, but is not of it’s self sufficient as the algorithm needs to be examined as well.

I’ve not looked at the code but it’s claimed to be based on “Latin Squares” which are a simple construct that has been around for thousands of years ( http://en.m.wikipedia.org/wiki/Latin_square ). A simple everyday example of a Latin square being a compleated Sudoku grid. Latin squares do unfortunatly have some deficiencies which might show up in the generator output whilst still passing randomness tests.

The 2^1535 size sounds large but compare that to ARC4 of 2^1700, and other stream ciphers and it’s actually not that large. It needs to be remembered that a stream cipher only has one output sequence unlike a block cipher that can be used as a stream cipher so comparing a maximum of 2^1700 state size of ARC4 to the 2^128 bit width of a block cipher is not comparing apples with apples.

If you have a block cipher of 128 bits width in plain text and key what you get when you use it as a stream cipher is for each 128 bit key, a bit stream of length 128 x 2^128 bits in length (2^135) or around 2^263 in total for a single block. As we know from 3DES and other multi-dimensional matricies of block ciphers they can with care produce state sizes that far far exceed their basic block bit width states.

So UHE might have the makings of a good PRNG or it might not depending on a lot of further public analysis. Which in all probability it won’t get, but then that is the fate of by far the majority of ciphers ever designed. The most likely source of it being analysed further is as a “degree/Masters project” or if it has “interesting” weaknesses as somebodies PhD thesis, or if realy bad as set “course work”.

Gweihir • August 22, 2013 3:44 PM

@Clive Robinson & Stine: For crypto use you also have to make sure the generator does not leak its state by its output. For example a simple PRNG using x_n = f(x_n-1) and using x_n both as output and as next state, the complete state leaks on one output of x_n. Even advanced generators like the Mersenne Twister (about 2,5bK state, making the Gibson thing look rather pathetic) leak their complete state in a not too long output sequence. Once you have the state, all future output becomes predictable.

A proper CPRNG must never leak its state. This can be accomplished with small state generators (e.g. the 128 bit state used in the OpenSSL CPRNG) and with large state generators, like Yarrow and others. The main advantage of a large state generator is that it is far more resilient against bad initialization, i.e. initialization with low-entropy bits, which makes it easier to use correctly. A large state can also help to reduce leakage via timing.

Clive Robinson • August 22, 2013 4:54 PM

@ Gweihir,

Leaking state is an issue that as you say is of importance and can be done in a multitude of ways (many of which I have little doubt have yet to be discovered hence my comments in my last paragraph).

Personaly “Latin Squares” and their ilk make me nervous due to the fact that they significantly suffer from the reducing set issue of “dealing OUT a pack of cards”, which without care will leak sufficient state information to render a generator based on them insecure in short order (ARC4 avoided this issue by only shuffling not dealing and using what in effect was a simple one way output function).

As I said, I’ve not looked at the algorithm so I’m keeping an open mind on it, but based on previous open stream cipher design (see NESSIE with amongst others SNOW) the probability of it making the grade of CS-PRNG is low.

kodlu • August 22, 2013 5:18 PM

The results are not new. They are more of a reinterpretation of guessing entropy and renyi entropy results from pliam, arikan, arikan & merhav, sundaresan, boztas and so on. they apply to sequential key guessing. pliam did publish in traditional crypto venues and yes university publicity departments have declared the results groundbreaking. I dont know what that makes the earlier results some of which are more general and discuss non i.i.d. sources, noisy guessing, distributed guessing etc.

Aspie • August 22, 2013 5:31 PM

@Gweihir @Clive @Dirk @gonzo

I appreciate the insights/clarifications from you all.

Longer passwords are clearly better & although I’ve restricted the alphabet I use I’m not sure how an attacker would know that as @Mike notes. Not that the obscurity is an excuse, of course. Seems I have some reading to do this weekend.

Might start with a book by some chap called Schneier. 😉

@Kevin
Years ago I enjoyed Twain’s fine essay How To Make History Dates Stick. I think this is inspired similarly.

Mike • August 22, 2013 8:08 PM

@aspie

My post was more around the measuring of/definition of entropy, rather than any practical advice. By all means, use a system if you wish, a long password with a complicated system is probably going to be better than a short password with a simple system. Just understand though that password security is first and foremost a function of how many possible passwords exist in your system, rather than having anything to do with the length of the password.

Here’s an example:
An 8 character password consisting of 8 random uppercase/lowercase/digits produces a total of:

(26+26+10)^8 = 218,340,105,584,896 possible passwords.

An 8 character password consisting of a 7 letter word in the English dictionary + 1 digit produces a total of:

(12500*) x 10^1 = 125,000 possible passwords.

(*From my sources, there appear to be approx 12,500 7-letter words in the English dictionary – feel free to replace this figure with your own calculation/source though.)

To give you a sense of scale, if it would take a computer 100 years to crack the first password, it would take the same computer 1.8 seconds to crack the second one.

Even if we use random uppercase/lowercase letters in the word, we still only get a mere 6,125,000 possible passwords (approx 1.5 minutes going off our imaginary timescale)

Practically speaking, if someone were to try to crack a long password, it’s true that their password cracker would need to have knowledge of the system used. That being said, given the vast improvements in cracking time that knowledge of a system gives you, just about every system you can think of has been implemented in a password cracker somewhere. 1337-5p33k, random capitals, backwards words, the first letter of each word of a common phrase, famous places, birthdays of famous people, old cars, etc.

Hence why I would advise against having your security rely upon the notion of someone not knowing your system, and solely on the number of possible passwords your system can produce.

stine • August 23, 2013 6:07 AM

re: @stine

Thanks for your replies. I now know much more and understand much less…and so it goes.

Aspie • August 23, 2013 7:18 PM

@Mike

That makes a lot of sense, thanks.

Perhaps I now look to the quality of digest algorithms, for these are what I assume our carefully typed bits of poetry/nonsense are fed into, to produce a hash swirled enough that the chances of a collision are low and in which the maximum entropy is preserved.

Seems a tallish order.

William Lee • August 26, 2013 11:28 PM

So maybe not ‘entropy’ as such, but what about the recent story that the NSA has been intercepting encrypted video conferences of the UN, and have broken the encryption as far back as 2012?
http://www.aljazeera.com/news/americas/2013/08/2013826225510369341.html

What kind of encryption is the UN using, and what does this mean for other forms of encryption?

Andy • August 28, 2013 2:29 PM

bruce++

Schneier on Security

Measuring Entropy and its Applications to Encryption

Comments

Leave a comment Cancel reply