### Cryptanalysis of SHA-1

On Tuesday, I blogged about a new cryptanalytic result—the first attack faster than brute-force against SHA-1. I wrote about SHA, and the need to replace it, last September. Aside from the details of the new attack, everything I said then still stands. I’ll quote from that article, adding new material where appropriate.

One-way hash functions are a cryptographic construct used in many applications. They are used in conjunction with public-key algorithms for both encryption and digital signatures. They are used in integrity checking. They are used in authentication. They have all sorts of applications in a great many different protocols. Much more than encryption algorithms, one-way hash functions are the workhorses of modern cryptography.

In 1990, Ron Rivest invented the hash function MD4. In 1992, he improved on MD4 and developed another hash function: MD5. In 1993, the National Security Agency published a hash function very similar to MD5, called SHA (Secure Hash Algorithm). Then, in 1995, citing a newly discovered weakness that it refused to elaborate on, the NSA made a change to SHA. The new algorithm was called SHA-1. Today, the most popular hash function is SHA-1, with MD5 still being used in older applications.

One-way hash functions are supposed to have two properties. One, they’re one way. This means that it is easy to take a message and compute the hash value, but it’s impossible to take a hash value and recreate the original message. (By “impossible” I mean “can’t be done in any reasonable amount of time.”) Two, they’re collision free. This means that it is impossible to find two messages that hash to the same hash value. The cryptographic reasoning behind these two properties is subtle, and I invite curious readers to learn more in my book

Applied Cryptography.Breaking a hash function means showing that either—or both—of those properties are not true.

Earlier this week, three Chinese cryptographers showed that SHA-1 is not collision-free. That is, they developed an algorithm for finding collisions faster than brute force.

SHA-1 produces a 160-bit hash. That is, every message hashes down to a 160-bit number. Given that there are an infinite number of messages that hash to each possible value, there are an infinite number of possible collisions. But because the number of possible hashes is so large, the odds of finding one by chance is negligibly small (one in 2^{80}, to be exact). If you hashed 2^{80} random messages, you’d find one pair that hashed to the same value. That’s the “brute force” way of finding collisions, and it depends solely on the length of the hash value. “Breaking” the hash function means being able to find collisions faster than that. And that’s what the Chinese did.

They can find collisions in SHA-1 in 2^{69} calculations, about 2,000 times faster than brute force. Right now, that is just on the far edge of feasibility with current technology. Two comparable massive computations illustrate that point.

In 1999, a group of cryptographers built a DES cracker. It was able to perform 2^{56} DES operations in 56 hours. The machine cost $250K to build, although duplicates could be made in the $50K-$75K range. Extrapolating that machine using Moore’s Law, a similar machine built today could perform 2^{60} calculations in 56 hours, and 2^{69} calculations in three and a quarter years. Or, a machine that cost $25M-$38M could do 2^{69} calculations in the same 56 hours.

On the software side, the main comparable is a 2^{64} keysearch done by distributed.net that finished in 2002. One article put it this way: “Over the course of the competition, some 331,252 users participated by allowing their unused processor cycles to be used for key discovery. After 1,757 days (4.81 years), a participant in Japan discovered the winning key.” Moore’s Law means that today the calculation would have taken one quarter the time—or have required one quarter the number of computers—so today a 2^{69} computation would take eight times as long, or require eight times the computers.

The magnitude of these results depends on who you are. If you’re a cryptographer, this is a huge deal. While not revolutionary, these results are substantial advances in the field. The techniques described by the researchers are likely to have other applications, and we’ll be better able to design secure systems as a result. This is how the science of cryptography advances: we learn how to design new algorithms by breaking other algorithms. Additionally, algorithms from the NSA are considered a sort of alien technology: they come from a superior race with no explanations. Any successful cryptanalysis against an NSA algorithm is an interesting data point in the eternal question of how good they really are in there.

For the average Internet user, this news is not a cause for panic. No one is going to be breaking digital signatures or reading encrypted messages anytime soon. The electronic world is no less secure after these announcements than it was before.

But there’s an old saying inside the NSA: “Attacks always get better; they never get worse.” Just as this week’s attack builds on other papers describing attacks against simplified versions of SHA-1, SHA-0, MD4, and MD5, other researchers will build on this result. The attack against SHA-1 will continue to improve, as others read about it and develop faster tricks, optimizations, etc. And Moore’s Law will continue to march forward, making even the existing attack faster and more affordable.

Jon Callas, PGP’s CTO, put it best: “It’s time to walk, but not run, to the fire exits. You don’t see smoke, but the fire alarms have gone off.” That’s basically what I said last August.

It’s time for us all to migrate away from SHA-1.

Luckily, there are alternatives. The National Institute of Standards and Technology already has standards for longer—and harder to break—hash functions: SHA-224, SHA-256, SHA-384, and SHA-512. They’re already government standards, and can already be used. This is a good stopgap, but I’d like to see more.

I’d like to see NIST orchestrate a worldwide competition for a new hash function, like they did for the new encryption algorithm, AES, to replace DES. NIST should issue a call for algorithms, and conduct a series of analysis rounds, where the community analyzes the various proposals with the intent of establishing a new standard.

Most of the hash functions we have, and all the ones in widespread use, are based on the general principles of MD4. Clearly we’ve learned a lot about hash functions in the past decade, and I think we can start applying that knowledge to create something even more secure.

Hash functions are the least-well-understood cryptographic primitive, and hashing techniques are much less developed than encryption techniques. Regularly there are surprising cryptographic results in hashing. I have a paper, written with John Kelsey, that describes an algorithm to find second preimages with SHA-1 —a technique that generalizes to almost all other hash functions—in 2^{106} calculations: much less than the 2^{160} calculations for brute force. This attack is completely theoretical and not even remotely practical, but it demonstrates that we still have a lot to learn about hashing.

It is clear from rereading what I wrote last September that I expected this to happen, but not nearly this quickly and not nearly this impressively. The Chinese cryptographers deserve a lot of credit for their work, and we need to get to work replacing SHA.