SHA-1 Broken

SHA-1 has been broken. Not a reduced-round version. Not a simplified version. The real thing.

The research team of Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu (mostly from Shandong University in China) have been quietly circulating a paper describing their results:

  • collisions in the full SHA-1 in 2**69 hash operations, much less than the brute-force attack of 2**80 operations based on the hash length.
  • collisions in SHA-0 in 2**39 operations.
  • collisions in 58-round SHA-1 in 2**33 operations.

This attack builds on previous attacks on SHA-0 and SHA-1, and is a major, major cryptanalytic result. It pretty much puts a bullet into SHA-1 as a hash function for digital signatures (although it doesn’t affect applications such as HMAC where collisions aren’t important).

The paper isn’t generally available yet. At this point I can’t tell if the attack is real, but the paper looks good and this is a reputable research team.

More details when I have them.

Update: See here

Posted on February 15, 2005 at 7:15 PM • 136 Comments

Comments

Rafael Sevilla February 15, 2005 8:25 PM

So what hash functions are available that don’t have a substantially similar construction? AFAIK, RIPEMD160 and the SHA256-384-512 series are of the same sort, and the attack could in principle work for them as well. There’s Tiger, which appears quite different, and Whirlpool. Any other suggestions?

This is, it would appear, a collision attack, not a preimage attack, so I guess we have some time to phase out the old hash functions.

Jordan February 15, 2005 9:03 PM

2^69 operations is still an awful lot of operations. What is it that lets us say that 2^69 is “broken” but 2^80 is “not broken”?

Randell Jesup February 15, 2005 9:19 PM

That’s 2^11 less operations. Let’s say breaking this (2^69 ops) takes the NSA a week. If it had been 2^80, it would have taken 2048 weeks, or 39 years. If it would have taken the NSA (or whomever) a year to break SHA-1 before, it could be broken in 4 hours.

My guess would be it would still take a lot longer than a week – but would now be in the realm of possibility, whereas before it would have been in the lifetime(s) range. However, this is totally a wild-assed-guess, based on the assumption that it was expected to take 100+ years before this to crack.

Anthony Martin February 15, 2005 9:25 PM

“…whereas before it would have been in the lifetime(s) range.”

Either way, it’s well within the statute of limitations for whatever crime you’ve committed. 😉

Hal February 15, 2005 9:46 PM

I don’t think any public calculation has successfully solved a problem which required as much as 2^69 work. It will be interesting to see if this motivates people to search for an actual SHA-1 collision. Exhibiting a collision always has more impact than a theoretical break.

Of course, these researchers have yet to publish their techniques. Isn’t it kind of contradictory to the spirit of academic research to keep your methodology secret for so long? It’s been six months now since their MD5 results.

Will February 15, 2005 9:53 PM

Regarding how long it should take to break… Let’s assume that a single CPU can tackle 2^32 ops/sec. (About 4 billion, so assuming each op is one cycle, about 4 GHz… Gross oversimplification, but it makes the math pretty easy.) So, how long would it take to do 2^69 ops?

2^37 seconds of CPU time. About 4000 years.

So, if you have a 4000 node cluster, it ought to take about a year, which would be well within the statute of limitations, for most crimes and jurisdictions… 🙂

Brute forcing, using the same hypothetical cluster, would have taken over 2000 years. So, I guess today’s lesson is that it isn’t completely broken, but it certainly ain’t secure.
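
A minimal Python sketch of this arithmetic; the 2^32-hashes-per-second rate is the idealization from the comment above (one hash per cycle on a 4 GHz CPU), not a measured figure:

SECONDS_PER_YEAR = 60 * 60 * 24 * 365
HASHES_PER_SEC = 2**32            # idealized 4 GHz CPU, one hash per cycle

def years(log2_ops, cpus=1):
    """Years for `cpus` machines to perform 2**log2_ops hash operations."""
    return 2**log2_ops / (cpus * HASHES_PER_SEC) / SECONDS_PER_YEAR

print(years(69))                  # ~4,400 years on a single CPU
print(years(69, cpus=4000))       # ~1.1 years on a 4000-node cluster
print(years(80, cpus=4000))       # ~2,200 years for brute force on the same cluster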

Myself February 15, 2005 10:13 PM

2^69 is still a lot of work, with current processors and electricity prices. But with Moore’s progression and the lessons of history, people who were planning on 2^80 complexity for a bit of futureproofing will be very unhappy with 2^69.

I’m not clear on why anyone would’ve been using 80 bits in the first place. A 20% reduction in 80 bits is a big deal, but a 20% reduction in 256 bits is still way outside what we’d consider practical in the foreseeable future. Bits are cheap, use lots!

Jonathan Conway February 15, 2005 10:18 PM

With the maxim about attacks getting better I’d be worried that with the rate that the SHA family attacks have improved in the last few months we could see even more serious breaks within a year or two.  Not long ago we had reason for deep concern, now we’ve got reason for outright worry.

Fuzzy February 15, 2005 10:20 PM

Jordan is correct – 2^69 is still a large data space to search.
However, as Randell points out, this is a lot better than 2^80.
Assume you had 100,000 CPUs each capable of 4,000,000,000 tests per second.
That works out to 1,475,739 seconds to find a collision or about 17 days.
It is unlikely that such equipment exists, but it gives an idea of a possible worst case.
However, many digital signatures need to be secure much longer than 3 weeks.
Think of a contract for a 30-year mortgage.
The previous brute force mechanism (2^80) might have been secure for up to 95 years and reasonable.

Arash Partow February 15, 2005 10:23 PM

How effective is this attack? For example, can it change “attack at dawn” to “attack at dusk” in a file that has been compressed and then had a SHA-1 MD made?
Because at the end of the day isn’t that the point of MDs?

-A curious cryptographer…

cjr February 15, 2005 10:23 PM

It is not 2^11 fewer operations. It is 2^11 times fewer operations, roughly 1/2048th the work.

The point isn’t so much that it takes less time but that it has a large and now known weakness. It is very likely other weaknesses will come to light, making it useless for secure hashing.

Sean February 15, 2005 10:49 PM

oh my. It’s amazing how human brainpower [and some patience and creativity] can ultimately defeat ANYTHING presented to it.

Good for the Chinese team! [congratulations!]

time to build something better than SHA1

Peter February 15, 2005 11:38 PM

I think it is important to note that (from what I’ve heard, I haven’t seen the paper either…) this collision attack is not very “real world” useful. Their attack is focused on taking a certain number of operations to come up with two hunks of data that result in the same hash.

In my opinion, a “real world” attack would be one which given a blob which has already been hashed, would come up with another blob which results in the same hash. To my knowledge, nobody has any useful attacks in that direction yet, although some would argue based upon this research that it may just be a matter of time.

Then we of course need to get into whether that is really useful either. If I find out that “I agree to purchase 100 units for $500” and “*(\D$Hw&72d98a %93di(hd eLKH%ap$#” result in the same hash, how helpful is that to me? How is a lawyer going to prove to a jury that I may have actually signed the garbage instead of the purchase agreement? So, there is even more work to be done to make it a useful real world attack, wherein you might take the original signed text (modified for your evil purposes), append a null character, and then add garbage until the hashes are equal–and hope the UI was poorly written and just displays up to the first null.

Gavin Weld White February 16, 2005 12:11 AM

If this solution lends itself to distributed computation, and one in a million people online were to participate in such a project, the first publicly generated SHA-1 collision should be produced by the end of 2010.

That is assuming the use of cheap modern desktops, Moore’s law, and linear growth of online population in line with the predictions of Computer Industry Almanac (140 million new users/year, and thus net growth of 140 new participants/year — probably under-estimating growth of participation here).

If participation were higher, say one in a thousand, we’d be cracking them at a rate of one every other month.

Andrew Wade February 16, 2005 12:34 AM

“2^69 operations is still an awful lot of operations. What is it that lets us say that 2^69 is ‘broken’ but 2^80 is ‘not broken’?”

The 2^80 is a brute-force attack. Less than brute force means that it is “broken”, for the reason cjr gave. (“Broken” and defeatable in practice are two different things.) The only exception to this convention I’m aware of is in public-key cryptography. AFAIK, all known public-key algorithms are vulnerable to less than brute-force attack. The key sizes are boosted to compensate, for lack of any alternative.

Anonymous February 16, 2005 1:17 AM

Where people take p = password

p’ = sha1(p)
or
p’ = sha1(p, nonce)

This case is reasonably safe as you’re allowed collisions in the problem space (different users can (and probably do) choose the same password)… as long as p’ is not exposed to the attacker.
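
For concreteness, a minimal sketch of the salted construction being described, using Python’s hashlib (the 16-byte salt size is an illustrative choice, not from the comment):

import hashlib, os

def hash_password(password, salt=None):
    """p' = sha1(nonce, p): the per-user salt keeps identical passwords
    from producing identical hashes and defeats precomputed dictionaries."""
    if salt is None:
        salt = os.urandom(16)        # random per-user nonce
    digest = hashlib.sha1(salt + password.encode()).hexdigest()
    return salt, digest

salt, stored = hash_password("hunter2")
# verification recomputes with the stored salt
assert hash_password("hunter2", salt)[1] == stored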

The problem sets in when you use MD5 or SHA1 for digital signatures:

For example:

md5sum file

This theoretically allows an attacker to change the file and compute the same hash from a different bag of bytes. This eliminates the trust you might have had in the file being made available to you.

One ISP I know “verified” downloads from a nearby mirror using a similar method; it wasn’t until I pointed out that an attacker could change the source by contributing to the application that they reconsidered. Verification of checksums / hashes is not the be-all and end-all, but this break by the researchers makes it more difficult to trust, for verification purposes, the class of hashes which have been shown to be weak.

Andrew

Anonymous February 16, 2005 1:26 AM

Very impressive if it pans out. In some ways, the writing was on the wall with their earlier work, but somehow you still just don’t believe it’s coming. WOW!

Looking forward to a public release after they get any typos out (understandable).

Praveen February 16, 2005 1:51 AM

The design structure of all hash functions in the MDx and SHA family is based on an unbalanced Feistel network operating in a non-linear feedback shift register mode. As we noted last June in our new hash function design paper, CRUSH, this structure is a single point of failure for cryptography.

Regards
Praveen Gauravaram

Phill February 16, 2005 2:00 AM

SHA-1 is broken, but not yet cracked. This is a compression function collision; getting to a full hash function collision has not yet happened.

We have a couple of years (but not much more) to plan a transition to more secure algorithms.

Mike February 16, 2005 2:07 AM

Okay, could someone a bit more well-versed in the Hows and Whys of cryptography step up and explain why a hash algorithm that is “broken”, when used in an HMAC setup, is suddenly “not broken”?

Is it simply because we’re suddenly involving a secret key?

If so, could not these advances mean that obtaining that secret key may be a bit easier than we previously thought, too?

Mike February 16, 2005 2:52 AM

Digital signatures aside, I think this attack would be devastating to the current wave of file-sharing networks. The one I am most familiar with is the ed2k network, especially using the eMule client. eMule has since distanced itself from the original MD4 (yes, you read it right) used for integrity checking in favor of SHA-1. However, if even this has been cracked, there’s now nothing stopping an attacker from substituting random garbage for blocks of legitimate content…and without anyone being the wiser until it’s too late. The blocks would continue to pass virally from node to node with no way to determine whether they’re legit or not. Score +1 for the **AA’s of the world =(

Jan-Eric Duden February 16, 2005 4:51 AM

If SHA-1 is broken now, what are the alternatives? Any suggestions?

Louis Cordier:
It is true that ordinary processors don’t double their raw processing speed every 18 months anymore. However, the trend now is toward multi-core processors. A multi-core processor is perfect for cracking SHA-1, since there are a lot of independent calculations to do.

Tobias Gondrom February 16, 2005 4:55 AM

I agree with my colleagues above: 2^69 is still huge.

What I might add is that we are talking about speculation as long as the paper is not published. Until we see the paper (with all the qualifications that must be fulfilled for the attack to work), I think it is quite dangerous to discuss sheer assumptions. (Although I am very excited to get my hands on this paper, and nervous about the possible consequences.)

Johannes February 16, 2005 5:06 AM

@Mike

“However, if even this has been cracked, there’s now nothing stopping an attacker from substituting random garbage for blocks of legitimate content”

It isn’t that broken. Computing power is still stopping the attackers.

And also in your scenario: if I were the attacker, I would simply tell you that my random bytes had the legitimate SHA-1 value. You still wouldn’t find out I fooled you until the whole file was downloaded.

SHA-1 in p2p networks serves as a way for users to compare two files. You still have to trust the one giving you the SHA-1 value. Integrity of the file is not assured.

Praveen February 16, 2005 5:08 AM

If the claim that it was attacked in 2^69 computations is true (we don’t have proof yet), then we can safely say it is broken, as that is less than the birthday bound on SHA-1. It is time for us to look at new designs. A good start is a new kind of iterative structure, other than the Merkle-Damgård structure. So the question is what kind of abstract structure, or hash function model, can resist these attacks. Once the compression function is attacked, the attack can be extended to other blocks as well with further research. So the hash value and the chaining value should be different, and the chaining value should be larger than the hash value. It will be interesting to see Stefan Lucks’ proposed structure.
Future research should proceed along these lines. We need secure hash functions first, and should give importance to efficiency only once security is achieved. In any case, the performance expected from a hash function depends on the application.

Anonymous February 16, 2005 5:21 AM

Andrew Wade wrote:

“The 2^80 is a brute-force attack. Less than brute force means that it is ‘broken’, for the reason cjr gave. (‘broken’ and defeatable in practice are two different things). The only exception to this convention I’m aware of is in public-key cryptography. AFAIK, all known public-key algorithms are vulnerable to less than brute-force attack. The key sizes are boosted to compensate, for lack of any alternative.”

All hashes are necessarily vulnerable to less than brute-force attack as well, simply because they are hashes. Anytime the hashtext is allowed to be shorter than the corresponding plaintext, collisions must occur, because there are fewer possible combinations. There is no way around this, so as with public-key cryptography, one must compensate by making the hashtext longer.

The perceived usefulness of modern hashes appears to exceed the perceived usefulness of the 8-bit checksum by a magnitude proportional to how many more collisions the 8-bit checksum would have in the given application. Defining “broken” as “requiring less than brute force” therefore renders the term “broken” meaningless since, in the absence of more “secure” hashing algorithms, making the hashtext longer necessarily reduces collisions. However, when the length of the hashtext reaches the length of the plaintext, you essentially have symmetric cryptography with a known key and algorithm (unless of course the algorithm allows collisions when hashtext length equals plaintext length, currently seen as undesirable), so there is a paradox where a longer hashtext is also less secure.

The true usefulness of a hash is proportional to how much processing it takes to find a collision. If we assume that the more processing it takes to calculate a hash in the first place, the more processing it would take to find a collision, the challenge becomes the development of hash algorithms that take more processing power. If available processing power were to cease increasing, the practicality of finding collisions would also cease to increase, and there would be no further need or use for newer hash algorithms that take more processing power. As long as that doesn’t happen, expect each hash algorithm to be replaced periodically with a newer one that takes more steps to calculate. After all is said and done, that is really the only way to stay ahead of the game.

Victor Bogado February 16, 2005 6:50 AM

Maybe we should start encoding meta-data along with the hash. Instead of relying only on the hash to confirm that the message is from whoever signed it, we would encode along with the message its size, type, and whatever other characteristics define the message.

For instance, suppose I sign the message “Hi, I’m Victor”. Along with the hash it would contain the size (14 bytes), type (English text), encoding (7-bit ASCII), and perhaps the range of codes used in the message (from U+0027 – U+0074).

A good hash would give a uniformly distributed random hash for the message, so it is safe to assume that even if we could find a collision, it would be highly improbable that it would satisfy all the meta-data. In some cases it could be provable that this kind of hash is unbreakable, since there is a finite number of messages that satisfy the meta-data (if you could hash all possibilities and verify that there were no collisions, you’re 100% safe).

Simon Steinmeyer February 16, 2005 7:16 AM

History shows that reducing the brute-force work factor of an algorithm is only the beginning of the end: I am sure that the attack will be optimized and improved, so that the work factor will be further reduced. This was shown during several attacks on FEAL, too.

That means we should not trust that the work factor will stay at 2^69 in the near future. If similar hash algorithms also show the SHA-1 weakness, then we will need a new hash algorithm almost immediately.

Mike February 16, 2005 7:32 AM

I’m not a cryptographer but to those who want to know why HMAC use of a hash function is not broken, it’s because, as somebody else suggested, of the key.

With a digital signature all you have to do is find another blob of data which hashes to the same hash. You are free to choose any blob of data.

With HMAC you are not free to choose any other blob of data because a secret key is always added to the data before it is hashed and you don’t know that secret key. So you still need to guess the key or the person verifying the HMAC will get a different hash than you.
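
A minimal sketch of the keyed construction Mike describes, using Python’s standard hmac module (the key and message values are illustrative only):

import hashlib, hmac

key = b"shared secret"                       # unknown to the attacker
msg = b"pay $100 to Bob"

# The key enters the hash computation (on both HMAC's inner and outer
# passes), so a collision search cannot even start without knowing it.
tag = hmac.new(key, msg, hashlib.sha1).hexdigest()

# The verifier recomputes the tag with the same key and compares.
ok = hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha1).hexdigest())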

(On a side-note, how the heck do I get line breaks when I post comments?)

Scott Stanfield February 16, 2005 7:58 AM

Pardon my ignorance, but what good does it do me if I can find a few collisions with a digital signature on a document? Aren’t the collisions going to be a bunch of gibberish that hashes to the same value? How would I use the gibberish to cause trouble? I can see a DOS scenario, where I replace a good message with gibberish, but I can’t see how I could massage a message to say something intelligible but different, like “deposit this in another account,” or “I inhaled,” or whatever.

johan February 16, 2005 8:11 AM

The important point is that there is often someplace in the document that you can change willy-nilly, while retaining semantic meaning.

for CRC-32, all you needed was 4 bytes in a row, and you could completely control the hash of the document. I don’t know how many are needed for SHA, but let’s say that it is on the order of 80 bytes:

  • In a jpeg, you could add a comment
  • In a MS .doc, you could add meta-data
  • in an exe, you might be able to add stuff at the end, outside the instruction stream.
  • In HTML, add stuff in a javascript comment or after the closing html tag

For anything but raw text, it really isn’t that hard to find a large number of contiguous bytes you can modify without changing the semantic meaning.
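
To make the point concrete, a minimal sketch of such a “slack” region in HTML (the page content is hypothetical): every variant renders identically but hashes differently, and a collision search varies exactly this kind of region.

import hashlib, itertools

page = b"<html><body>I agree to pay $100.</body></html>"

def variants(doc):
    for n in itertools.count():
        yield doc + b"<!-- tweak %d -->" % n   # invisible when rendered

for doc in itertools.islice(variants(page), 3):
    print(hashlib.sha1(doc).hexdigest())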

Kevin February 16, 2005 8:13 AM

Scott, you ignore the fact that forcing a collision can be done not only with a gibberish message but also with a message containing a few bytes of gibberish. Consider the case where a cryptosignature is used to keep a machine from running untrusted software. An executable file can contain a few bytes of gibberish without compromising its ability to run (just stick it in an unused constant somewhere), and then be signed as if it came from a trusted source. This is a bad thing indeed.

RXD February 16, 2005 8:16 AM

A few people here are questioning the meaningfulness of this attack, because they think that a collision to a known plaintext “I’m Bill.” would look something like “,#&($@<?}*(&³µG” – which would be basically useless, I would usually think the same.

BUT, remember the MD5 attack… when I first saw it, I was VERY impressed NOT because they DID find a collision, no! But because the collision had only A FEW bytes changed to the original message.

Look here: http://www.x-ways.net/md5collision.html

You can see there are ONLY 24 bits (or even less) changed (which is 2.4% of this 1024bit message).

So this scenario IS a reason to panic. And as soon as they will publish a SHA-1 collision with the same ‘features’ as the MD5 collision, we are in trouble.

grisu February 16, 2005 8:35 AM

hmmm.

SHA-1 has 2^80 unique hash-numbers. fine.
But isn’t the odd to get one 1/2^79, because statistically, I hit one after 2^79 tries after trying all?

please excuse my bad english.


grisu

John Kelsey February 16, 2005 8:42 AM

Concatenating MD5 and SHA1 doesn’t give you as much extra security as you’d think, because of this beautiful result from Joux at last year’s Crypto. Basically, if it takes you 2^{69} work to cause a collision in SHA1 in a general context (from most any starting hash value), the most it can take to find a collision for SHA1 || MD5 is about 2^{75}–you find 64 places in the message where you can insert a colliding value for SHA1, and then do a 2^{64} search to find a collision between those in MD5. (If this isn’t clear, go read the Crypto 2004 paper–the result is not hard to understand at all!)

HMAC is harder to attack because the attacker doesn’t know the internal values of the hash function when she’s choosing her message blocks. To the extent that she needs to know what some bits of the hash chaining value are to choose the next message bit, her attack is blocked. But since Wang & company haven’t published details of their attacks, it’s really not possible to know how big a problem this is.

The eprint archive has a nice paper by Phil Hawkes and Greg Rose trying to reconstruct the Wang attack on MD5, which is probably getting a lot of downloads right now….

–John Kelsey

Piw February 16, 2005 8:46 AM

Please explain one thing to me.
Everyone keeps saying: SHA-1 is broken… it takes 2^69 operations to break it…

I don’t understand.
Every hashing algorithm will have collisions. Every one. Because we have a limited hash space to represent unlimited variants of data. Yes? Yes.
So EVERY algorithm can be broken. They managed to collide in 2^69 tries out of 2^80 possibilities. ENORMOUS LUCK. It’s not something to remember.
Let’s say, after the introduction of SHA-256, I broke it in 20 tries. Luck. Then you say SHA-256 is broken??? How could you use the word broken… I merely managed to collide.
So, concluding: using your words, every hashing function is broken. Only time and luck are important.
I think that it doesn’t matter if someone finds a collision or not. It won’t change anything. Keys must become longer, as computing power grows greater, to keep the theoretical computing time relatively impassably long. And if that time is 2^99999 years, and someone manages to find a collision in 5 days? It changes nothing. He got lucky.

hamish February 16, 2005 9:00 AM

Just to clarify, SHA-1 produces a hash of 160 bits (20 bytes). Collisions can be found with 2**(bits/2) operations by the birthday attack – go look at Google for “hash” and “birthday attack” for an explanation.

160 bit hash => 2**80 steps to find a collision.

SHA-256 has a 256 bit hash (32 bytes) and works with a similar algorithm to SHA-1. So 2**128 steps is brute force. Using that (or SHA-512) would give a period of grace, but the attack may well be applicable to these, so a hash with a completely new basis would be “a good thing” ™.

With the rider that anything new probably needs several years of cryptanalysis before we would trust it …
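
The birthday bound is easy to demonstrate on a truncated hash. A minimal sketch, shrinking SHA-1 to 32 bits so the search finishes in well under a second (~2^16 hashes, versus the full hash’s ~2^80):

import hashlib, itertools

def truncated_collision(nbits=32):
    """Find two inputs whose SHA-1 digests agree in the first nbits bits;
    the birthday paradox makes this succeed after ~2**(nbits/2) tries."""
    seen = {}
    for i in itertools.count():
        msg = b"msg-%d" % i
        tag = hashlib.sha1(msg).digest()[:nbits // 8]
        if tag in seen:
            return seen[tag], msg      # distinct inputs, same truncated hash
        seen[tag] = msg

print(truncated_collision(32))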

Anonymous February 16, 2005 10:19 AM

Bruce,

You didn’t actually read the paper, did you? If you did, you would have noticed the footnote which says that the attack isn’t on “the real thing”.

Stop spreading rumors.

Anon February 16, 2005 10:59 AM

I like the way your titillating announcement of the work of Wang, et al upholds predictions you made late last year, Bruce. How lucky for you!

For reference: http://www.computerworld.com/securitytopics/security/story/0,,95343,00.html

Btw, how obnoxious is it to reference something that no one else can read?

Hey – other anon person, you’re anon – why don’t you post a link to your resource???

disgusted

Joshua Stephanoff February 16, 2005 11:38 AM

Having actually implemented both SHA-1 and MD5 in assembly (while I was in college, in a calculus class), I can say that the length of the actual data is appended to the last null-padded block. So even small changes in the size have a significant impact on the final sum, and cannot be covered up by any blocks of data coming after it (except for man-in-the-middle, but that is useless in most situations). Other meta-data should be used as a signature, where it is included in the original data, outside the data, where it is hashed, and then both hashes are hashed (basically what PGP does).

In the message “Hi, I’m Victor” there are 12 different characters. If only these 12 characters are allowed, there are 12^14 = 1283918464548864 or 1.28e+15 possibilities that could satisfy all the meta-data. The total possibilities for a SHA-1 sum is 2^160 ~= 1.46e+48. Using 1 bit as a flag for each sum would require 2^160 / 8 ~= 1.83e+47 bytes ~= 1.62e+32 PB (1 PB = 1024 TB = 1024*1024 GB, I think) of storage. In the much reduced 12^14 number of possibilities, this would still require a minimum of 12^14 * 20 ~= 2.57e+16 bytes ~= 23,914,845 GB ~= 22.8 PB of storage, if each sum was unique (we cannot use the 1 bit mapping in this reduction).

Using the techniques of this not yet published paper could reduce the storage requirement, but the only messages that could be proved to have a unique hash are those that are shorter in length than the hash. In the case of SHA-1, this is 20 bytes.

Using the vulnerabilities to prove the authenticity of a short message is not yet too practical.

A pretty secure hash method would be something like the following:

fast:
d1= hash1(message);
d2= hash2(message);
d3= hash1((message+d1)+d2);

slow:
d1= hash1(message);
d2= hash2(message);
d3= hash1((d1+message)+d2);

The 2 hash functions MUST be different to a good degree (I believe SHA-1 and MD5 suffice, from my experience). The + operator is equivalent to appending the right operand onto the left, i.e. “a”+”b”=”ab”. d1, d2, and d3 are the message digests. All three digests MUST be distributed, along with what format the digests are in (hexadecimal or base64), along with what hash functions were used, along with the designation “slow” if the slow method is used (“fast” is default).

The fast method could be computed fairly quickly by doing the 2 hashes on each block (making use of the processor cache), except for the final blocks. The slow method should be more secure, as only d1 and d2 could take advantage of the cache effect (d3 would have to be computed from scratch).

Creating a collision on d1 and d2 would be pretty difficult; d3 would be much more so. d1 and d2 MUST be hashes of only the original message, as hash2(message+d1) could make it easier to find a collision (as the effective message would then be different).

I am not a cryptographer, but this scheme seems obviously much harder to crack, for many reasons.

As far as I know, this scheme is original, but is similar to PGP and 3DES. I think this is an obvious possible solution, and as such cannot be patented.

Tell me what you think.
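
A direct transcription of the “slow” variant into Python, with hashlib standing in for the two hash functions (“+” is concatenation, as defined above):

import hashlib

def slow_digests(message):
    """Joshua's 'slow' scheme: d1 and d2 hash only the original message;
    d3 binds them together by rehashing d1 + message + d2."""
    d1 = hashlib.sha1(message).digest()
    d2 = hashlib.md5(message).digest()
    d3 = hashlib.sha1(d1 + message + d2).digest()
    return d1, d2, d3               # all three must be distributed together

One caveat, per John Kelsey’s comment above: Joux’s multicollision result means combining two Merkle-Damgård hashes adds less security than the combined output length suggests.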

Michael Sierchio February 16, 2005 11:47 AM

It’s important to qualify what is meant by “broken” — the ability to find collisions weakens the use of a cryptographic hash in digital signatures.

The work required is about 0.0005 of the brute-force average for finding a collision (a speedup of 2^11, roughly 2000x).

Kryptogramma February 16, 2005 12:23 PM

“rainbow hash tables anyone?” — posted by hendler at February 16, 2005 10:01 AM

Creating rainbow tables for an algorithm has nothing to do with whether it is broken or not.

Rainbow tables use the hashing algorithm the way it is supposed to be used, and create a “dictionary” of cleartext → hash values.

You can generate rainbow tables for any algorithm whose workings you know (preferably with source code too). The thing is having enough disk space to store that information.

Try this: use a rainbow table generator which tells you the estimated key space and disk space, and enter the following parameters.
Charset: full
Hash: SHA1 Min Len: 1 Max Len: 263
Chain count: 57,000,000, No of Tables: 9,999,999,999 (maximum)

With this data, the program I am using says the key space is 1.#INF, disk space: 1.665.497.180.-45 GB, and success probability: -1.#INDO (-1.#J%).
Obviously, this is too large for the program to even calculate the key space.

And that’s for the domain, if you want to calculate the range try:
Sha1, min 1, max 160, charset hex, chain count 40000000, no of tables 9999999999.
The RANGE key space is merely 4.867*10^192.

I hope you get the message.. If you want to do rainbow tables, better have a lot of disks.. But since NIST couldn’t do it to validate the algorithm, then neither can you.

Anonymous February 16, 2005 12:25 PM

How is the 2^69 hash operations assertion to be understood? Is the cost the same no matter what the message input size? Also, can collisions be found for any input message?

Linas Vepstas February 16, 2005 12:35 PM

Re compute power:

The IBM/SONY/Toshiba Cell processor has 1 ppc64 core and 8 special processors (SPE’s) per 30-watt chip, clocked at 4.5 GHz. Each SPE can dispatch two instructions per clock; each SPE has 128 registers that are 128-bits wide and are joined with 4 128-bit busses running at half clock speed. This provides something on the order of 100+ (and maybe more) general purpose Giga integer ops for things like code-breaking.

Conclusion: 1000 Sony PlayStation 3s, appropriately hacked, would draw 30 kW of power (a bit on the high end for a suburban home, but achievable) and could achieve 2^36 ops/sec x 2^16 secs/day x 2^10 consoles == 2^62 ops/day. OK, but each round might take 2^8 (?) ops, so it’s maybe 2^54 rounds/day within reach of a crazy retired .com CEO, from their garage. That’s an awesomely large number…

John February 16, 2005 12:36 PM

If you’re using a supercomputer that does 40 teraflops (40 trillion operations per second), then it would take… thinks… between 12 and 13 years, and about 6 years on average.

Stephen SamueL February 16, 2005 12:57 PM

It’s not inconsistent with the tenets of research not to be publicly trumpeting this work.

First of all, I presume that the current distribution is for the purposes of refereeing for a peer-reviewed publication. They may also be asking for verification of their results — given that it could be extremely embarrassing to have this wrong (in proportion to the notoriety gained by having it right).

This is also somewhat sensitive information, so they may want ‘white hats’ to have a couple of weeks’ knowledge to prepare for the stuff that hits the fan when this becomes public knowledge.

Richard Braakman February 16, 2005 1:34 PM

One way of looking at it is that breaking SHA-1 with 2^69 operations is still more work than brute-forcing MD5 with 2^64 operations.

Dan Kaminsky February 16, 2005 1:53 PM

I wrote the “MD5 To Be Considered Harmful Someday” paper that discussed attacks given only Wang’s test vectors. This is…a bit different.

It’s a 2^69 attack against SHA-1, which has the distinct problem of being 32x the complexity of bruting MD5 (2^5 = 32). We never did see a MD5 brute; we needed Wang’s reduction to a 2^24 to 2^32 for us to eventually end up with vectors.

I don’t expect to ever see SHA-1 collision vectors. We still need to migrate away, but this is akin to Dobbertin’s proof of possibility right now. Respond, don’t panic.

Anonymous February 16, 2005 2:40 PM

What’s the practical implication of this research?

For example, how hard is it to create an X.509 certificate that looks like a valid Microsoft code-signing certificate with the attacker’s public key? I’m assuming it is relatively trivial for the attacker to create a certificate extension containing the appropriate random garbage, but is it really 2^69 operations to select the right garbage?

vzn February 16, 2005 5:44 PM

theory-edge,
mailing list/discussion for cutting edge developments in mathematics & algorithmics, click my initials

open4free February 16, 2005 6:22 PM

MY PERSONAL TOP SECRET TO BREAK CODES: ParaModulation SATisfiability (Para-SAT) and Quantum Computing (Schor-like).

Dictionary Attacks ARE INSUFFICIENT!!!.

Kryptogramma February 16, 2005 8:20 PM

And that’s for the domain, if you want to calculate the range try:
Sha1, min 1, max 160, charset hex, chain count 40000000, no of tables 9999999999.
correction on my previous post:
min here should be 160 as the key length is fixed. Sorry about that.

David Schwartz February 16, 2005 8:53 PM

Actually, this sort of attack has real uses, assuming you have the computing power to do it, which organizations that can produce fast hardware implementations on custom chips can.

If you can produce two SHA-1 strings that collide and are a multiple of the fundamental block length, any two strings that begin with those two will also collide.

This means that I can trivially produce any number of strings, whose last bits I can choose, and they will collide.

Consider a message format where the first two bytes are the length of the first object and then after that object are other objects. I can pick any chunks that collide, pad out to object length, and then follow with objects of my choosing. The two final strings will collide in their hashes, be different, and have a lot of content over which I have control.

You can do the same thing for the end of a message.
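
This is the Merkle-Damgård iteration property. A minimal checker of the claim (the colliding inputs themselves are hypothetical, since no actual SHA-1 collision pair is public at this point):

import hashlib

def suffix_preserves_collision(m1, m2, suffix):
    """If m1 != m2 collide under SHA-1 and both are a whole number of
    64-byte blocks, any common suffix gives another colliding pair,
    because hashing resumes from the same internal state."""
    assert m1 != m2 and len(m1) == len(m2) and len(m1) % 64 == 0
    h = lambda m: hashlib.sha1(m).digest()
    return h(m1) == h(m2) and h(m1 + suffix) == h(m2 + suffix)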

Anonymous February 16, 2005 11:19 PM

Yeah… so… why isn’t the paper generally available yet? Is it unfinished (and thus premature)? Or do the authors just have a penchant for childishness? Perhaps they’d rather sell it copy-by-copy for a low low price of only 500 RMB! Act now!

All I’m seeing so far is “Bruce Sez he saw something that could have meant that maybe SHA-1 is broken.”

Which is great and all, but since there are so many crypto people in the hizzy (and would-be crypto people, too), why doesn’t someone work out how reasonable it is to take this purely based on trust and reputation. And what impact it has on someone’s reputation to say something was broken without being able to point to proof.

Is peer review a dead art? Replaced by cult of personality?

Jeremy February 16, 2005 11:22 PM

I really wish people would stop saying “broken”. Yes, I know that cryptographically it is broken. But practically, at least for now, SHA-1 is still plenty strong. 2^69 attempts is a whole lot harder than finding a non-crypto flaw in a system. As Schneier himself admitted in his most recent book, crypto isn’t the weak link in most systems. Focusing on a (significant) weakness in a crypto algorithm gives the impression that the crypto is what makes the system secure, when in fact even a flawed algorithm like SHA-1 is still the strongest link in the security chain.

Anonymous February 17, 2005 1:04 AM

Even when these same people “broke” MD5, it was still a pretty limited break for most practical purposes. They could, maybe, generate two messages with the same hash, but that was far and away different from being able to generate messages that collide with a given hash.

Is peer review a dead art? Replaced by cult of personality?

Yes to both. And Bruce is up there at the front of the seething masses.

J (again) February 17, 2005 1:13 AM

Just Re: SHA-1/MD5 and all the “hash-bash”: I think it’s important to emphasize that we are NOT talking about finding a collision to an arbitrary (i.e. chosen) plaintext message. Just colliding two random ones, at better than the birthday-paradox rate. So we’re not all doomed just yet, contrary to the girl who stood outside Moscone today with the sign (see http://hisown.com/temp/02160020.JPG and …21.JPG 😉

J

Anonymous2 February 17, 2005 9:24 AM

Is peer review a dead art? Replaced by cult of personality?

Yes to both. And Bruce is up there at the front of the seething masses.

I posted the disgusted comment yesterday – and am so happy to see that I’m not the only one NOT sitting on the “Bruce Sez” bandwagon…

Double-nots aside, it’s a good day!

Brian Candler February 17, 2005 12:08 PM

When this team broke MD5, they published two strings which had the same MD5 hash. It’s trivial to verify – once you’ve got your head around the byte-ordering issues 🙂

So for me the question is: has this team actually created two distinct strings which hash to the same SHA-1 value?

If they have, why not just post them so we can all verify it? But if not, then I don’t think it’s reasonable for anyone to claim point-blank that SHA-1 has been “broken”. “Weakened”, maybe.

Based on Bruce’s reputation, I’d say they’ve probably done it – but it would be helpful if he could clarify this by saying outright that the paper does (or does not) include an actual SHA-1 collision.

Bill Cheng February 17, 2005 2:08 PM

Even though an attacker can replace a digitally signed document with a garbage document having the same hash, it should be easy to convince any judge or jury that if someone can produce a readable document with that hash, such a document is likely the original. The question in my mind is how many tries it would take to create a document that is at least 50% similar to the original document and produces the same hash.

I guess the lesson with digital signature is that any digital signature scheme should allow multiple hashes to be used! It’s probably very very difficult to find 2 messages that produce identical MD5 and SHA-1 hash values. Is it?

Bretty February 17, 2005 7:33 PM

Cryptography has a long history of people saying, “don’t worry, it’s just a scratch.”
This announcement means that it is time to look very seriously at the entire SHA-x family, and its alternatives. It is almost guaranteed that any weakness will be a foundation for more powerful attacks in the future. Expect (2 ^ 69) to have a considerably smaller exponent within 12 months.

Tony Su February 17, 2005 10:12 PM

Exciting news!

For people who have doubts about the news, please read this:

Chinese researchers compromise SHA-1 hashing algorithm

http://www.commsdesign.com/news/showArticle.jhtml?articleID=60401254

……

Shamir and others said they believe the work of the Chinese trio will probably be proven to be correct based on their academic reputations, although details of the paper are still under review.

……

“This break of SHA-1 is stunning,” said Ronald Rivest, a professor at MIT who co-developed the RSA algorithm with Shamir. “Digital signatures have become less secure. This is another reminder that conservatism is needed in the choice of an algorithm,” added Rivest at the panel session.

Rivest noted that one member of the China team, Lisa Yin, was a PhD student who studied under him at MIT. Another member of the team was responsible for cracking the earlier MD5 hashing algorithm.

“I have strong reasons to believe the results [of the paper] are correct,” Rivest said.

……

Tony Su February 17, 2005 10:40 PM

Another article at http://www.theregister.co.uk/2005/02/17/sha1_hashing_broken/ says that it is official that SHA-1 was broken …

If it is based on the same source as here, the conclusion may be premature, but it does contain something extra …

“A collision has been discovered in the full version in 2^69 hash operations, making it just possible to mount a successful brute-force attack with the most powerful machines available today.”

“A collision has been discovered” ?!!! Can anyone confirm that? I sure can wait for 2**xxx days 🙂

Anonymous February 18, 2005 11:38 AM

What I’ve seen is a 58/80 step collision which is said to have taken 2^{33} hashing operations. This is up on the web at http://makeashorterlink.com/?D1605138A

They don’t have the ability to do 2^{69} operations (that’s a lot of work!). But since the previous best attack was estimated at 2^{71} work to break 53/80 steps, this is a pretty nice demonstration.

–John

Adeodatus February 19, 2005 12:43 AM

>>>
Please explain one thing to me.
Everyone keeps saying: SHA-1 is broken… it takes 2^69 operations to break it…

I don’t understand.
Every hashing algorithm will have collisions. Every one. Because we have a limited hash space to represent unlimited variants of data. Yes? Yes.
So EVERY algorithm can be broken. They managed to collide in 2^69 tries out of 2^80 possibilities. ENORMOUS LUCK. It’s not something to remember.
Let’s say, after the introduction of SHA-256, I broke it in 20 tries. Luck. Then you say SHA-256 is broken??? How could you use the word broken… I merely managed to collide.
So, concluding: using your words, every hashing function is broken. Only time and luck are important.
I think that it doesn’t matter if someone finds a collision or not. It won’t change anything. Keys must become longer, as computing power grows greater, to keep the theoretical computing time relatively impassably long. And if that time is 2^99999 years, and someone manages to find a collision in 5 days? It changes nothing. He got lucky.
<<<

But with a collision in hand, it becomes easier to find more collisions.

That, and anyone so astronomically lucky isn’t going to be a computer scientist; they’re going to be living off the lottery.

Mike February 19, 2005 11:53 AM

Great academic exercise. In the real world though, attackers are going to look for much easier ways to break in than using brute force against a hash. Why bother digging a tunnel into someone’s house if he leaves a window open? Those storing and transmitting classified or very sensitive info should be using a cryptologic system anyway and not just a SHA-1 or MD5 hash.

Jaleel February 20, 2005 11:47 AM

Adeodatus, you are right.

The best solution for the time being is to migrate to a higher range.

It doesn’t make sense to migrate to 256-bit from the existing 160-bit and then wait until 256-bit gets cracked (assuming that today’s technology won’t grow rapidly, which is a big joke).

I feel it is better to adopt something higher than 256 (something like 512); the probability of cracking the hash (in whatever way) will be reduced by at least 1%.

Andrew Wade February 21, 2005 12:49 AM

“It doesn’t make sense to migrating to 256bit, from the existing 160 bit and wait till 256bit get cracked (by assuming that today’s technology won’t grow rapidly, which is a big joke).”
Unless I’m missing something (and it’s not the birthday paradox), 128 bit hashes aren’t brute-forceable with today’s technology.
The fastest supercomputers in the world are about 2^48 times faster than the electromechanical Mark-I, arguably the first functional computer.
A 256 bit hash takes 2^64 times as long to brute force as a 128 bit hash, if you have unlimited memory.
In short, 256 bits is plenty.

roger smith February 21, 2005 9:30 AM

F5fPxdTq8eJeuqSVejGmq2bp0hU1rv9UelE23rOyfSJQWPR94NpiPSRjVpWraaNby5wlkxMIu4csKR0=

crack this 🙂

Louis Cordier February 21, 2005 11:38 AM

After another posting on /. about PGP shifting to a stronger algorithm and reading some of the thread’s comments I was wondering…

One-way hash functions take a large set of data (the source) and produce a relatively short string, and there is no way (a very limited chance) of producing the original source data from the hash. All attacks are basically equivalent to “buffer overflow” attacks, where the attacker modifies the source to produce an equivalent hash. Since we believe the hash is authentic to start with, why not add a source-size parameter to the hash? Most of these attacks modify and append data to the source data to make the hashes match. Given the additional (trusted) information about the source’s size, it would basically limit the usefulness of this type of attack. This would then basically be a protocol/conformity constraint on the modified source data, which would make things a bit more difficult for any attacker. Other parameters could also be employed, for example a source-symbol frequency, etc.

foo February 21, 2005 2:06 PM

Q. What do you get when you break a cryptographic hashing algorithm?

A. A new compression algorithm.

HASHIM KAREEM January 2, 2006 10:31 AM

all block ciphers are breaked bye “hashim search algorthim” see {cryptology group}
in yahoo

hashim kareem January 2, 2006 10:35 AM

new formula of number factorization
p^2+q^2-w^2=j
where
p*q=j p,q prime numbers
also
p/q+q/p- c=1 where c=w^2/j
is enouph to break
RSA IN POLYNOMIAL TIME

Bruce Schneier January 2, 2006 2:00 PM

“new formula of number factorization
p^2+q^2-w^2=j
where
p*q=j p,q prime numbers
also
p/q+q/p- c=1 where c=w^2/j
is enouph to break
RSA IN POLYNOMIAL TIME”

I get this sort of nonsense in e-mail all the time. This is the first one you all get to see.

Since the invention of the RSA public-key algorithm, all cryptographers get this sort of nonsense all the time. We either have to read through them all, which is a waste of time, or ignore them all, which has the potential of ignoring an actual breakthrough.

This is why RSA Security invented the RSA Factoring Challenge. Basically, these are a series of increasingly long numbers to factor. Some of them have been factored already, but many have not.

http://www.rsasecurity.com/rsalabs/node.asp?id=2092

Now, every time I get one of these “I can factor numbers faster than everyone else,” I send them to the webpage. If they can factor numbers that no one else can, they’ll have no problem getting every mathematician to listen to them. If they can’t, mathematicians will have no problem ignoring them.

Bruce Schneier January 2, 2006 2:02 PM

“all block ciphers are breaked bye “hashim search algorthim” see {cryptology group}
in yahoo”

This kind of thing we don’t even bother responding to.

Pete February 10, 2006 5:37 PM

I’m not an expert, so please don’t make fun of me if I sound like a complete idiot.

I can see that this crack poses a theoretical threat to the security of SHA1, but how does it affect a regular user? What good would it do an adversary to find a collision in someone’s digital signature? The adversary could change the original message to the collision he/she had found, but odds are that the “collision message” would make absolutely no sense whatsoever. The recipient would get a string of absolute gibberish that was certified as being sent by his/her friend. I’m probably overlooking something here, but what security risk does this really cause? There would be a real danger if someone found a way to reverse the hash and reveal the plaintext message that the sender had signed.

My apologies if I sound like a fool.

Laurent Busser March 1, 2006 7:38 AM

Tell me if I am wrong, but if digital signatures can be forged due to SHA-1 collisions, it will probably be much more interesting and easier to create fake certificates than fake messages.

A fake certificate can be reused many more times than a fake message, and the ‘return on investment’ will be higher for anybody wanting to steal an identity…

Jens August 4, 2006 6:05 PM

So, having read the entire blog, I make the following observations.

  1. SHA-1 is cracked though it takes some significant effort to find the crack (unless very lucky).
  2. If we use HMAC-SHA-1, it is not cracked (yet) so doing so would give us the level of security we wish.
  3. Creating SSL certificates [C and C’ where H(C) = H(C’) ] for installation on servers where C’ utilizes a wildcard for the domain name or has other malicious features, represents the most significant danger.

Sound about right?

Frank August 9, 2006 5:15 PM

Guess you’re right, as I understood the same. As long as there are enough powerful computers combined with skilled researchers, nearly every key-crack seems to be possible (in the long run).

Anonymous August 18, 2006 9:56 AM

“new formula of number factorization
p^2+q^2-w^2=j
where
p*q=j p,q prime numbers
also
p/q+q/p- c=1 where c=w^2/j
is enouph to break
RSA IN POLYNOMIAL TIME”
if you put x=p/q then q/p=1/x
now HASHIM formula will be:
x+1/x-c=1 wich enable you to breack RSA in real time .
please if you ugnorent not say(nonsense)
regards
HASHIM KAREEM
IRAQ

Gold September 26, 2006 8:16 AM

Andrew Wade wrote:

“The 2^80 is a brute-force attack. Less than brute force means that it is ‘broken’, for the reason cjr gave. (‘broken’ and defeatable in practice are two different things). —- ????? The only exception to this convention I’m aware of is in public-key cryptography. AFAIK, all known public-key algorithms are vulnerable to less than brute-force attack. The key sizes are boosted to compensate, for lack of any alternative.”

hashim kareem October 21, 2006 7:17 AM

breacking sha:x(all versions)are possible
since this hash function is base on extentions of the base key then its breakable in less time complexty by using
hashim algorithm :which is a new generalize search algorithm simalar to likelihoood algorithm but it work in multy points :
see yahoo groups:cryptology

Pete January 29, 2007 11:40 AM

If anyone is worried about a possible collision then why not just combine 2 hashes, such as

sha1(“the data”) . md5(“the data”)

The data would have to be very special to produce colliding hashes from 2 different hashing functions at the same time.

or sha1 the data and combine it with the sha1 of the data reversed.

sha1(“the data”) . sha1(“atad eht”)

Pete
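
A minimal Python sketch of the two combinations Pete describes:

import hashlib

def combo1(data):
    # sha1("the data") . md5("the data")
    return hashlib.sha1(data).hexdigest() + hashlib.md5(data).hexdigest()

def combo2(data):
    # sha1("the data") . sha1("atad eht")
    return hashlib.sha1(data).hexdigest() + hashlib.sha1(data[::-1]).hexdigest()

As John Kelsey noted earlier in the thread, though, Joux’s multicollision result caps the benefit of concatenating iterated hashes well below the combined digest length.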

Sangreal March 10, 2007 5:10 PM

SHA-1: 2^69
SHA-0: 2^39

Nice what a difference the little shift makes.

Maybe SHA can be made more secure again through changes to one or more of the logical unit(s) in the formula.

nemesisgeek June 18, 2007 9:19 PM

In my security software (still in development) I verify the integrity of some Windows API files using a triple hash: md5, sha1, and sha512. The difficulty of creating a valid Windows API DLL which does the same thing as the original, plus some malicious tasks – and has the same hashes in all three methods – is almost certainly well beyond the capability of computers for the next few decades.

For that matter, any software using hashes can be broken by a file-system attack which redirects the check of the hacked file to an unhacked copy.

Saam Choy September 9, 2007 8:50 PM

To the comment about salted SHA hashes (SSHA) you should note that a cracker has been released for this, SSHA Attack. It was written by the team at neurofuzz and works well even though the larger sizes do force a time consuming process. It works nevertheless and is worth a look. It is on SF at https://sourceforge.net/projects/ssha-attack

Anonymous May 24, 2009 1:54 AM

Collision Attack: Now SHA-1 is broken by 2^52.
“Attacks always get better”.

Mike S January 6, 2010 8:09 PM

@Joshua Stephanoff

This looks really good to me too, using SHA-1 and MD5:

d1= hash1(message);
d2= hash2(message);
d3= hash1((d1+message)+d2);

Did you ever come up with anything you liked more?

Mike

FrankB January 12, 2010 12:55 AM

I liked Joshua Stephanoff’s method too (including that it included some nuts and bolts experience), and I was disappointed that no-one commented on it.

An easy compromise is downloading the fciv.exe freebie from Microsoft.com and running it with the “-both” option (i.e. both SHA-1 and MD5) to create a .xml that fciv can use to verify. Even though you don’t get the super-dooper advance that Stephanoff proposed by hashing hashes, using MD5 plus SHA-1 probably means we’re good for quite a few more years yet. (Could a crypto expert please comment on how robust a combined approach is.)

MS being MS, they haven’t provided the source, but, by virtue of allowing an .xml db, perhaps they have just created the next open source de facto standard? It’s nowhere near as sexy as Stephanoff’s suggestion, but it’s available to every DOS-savvy MS user today, and it allows an easy transition from existing MD5 or SHA-1 tables.

FrankB January 12, 2010 1:08 AM

In the previous post I said “DOS-savvy”. But the only required smarts are starting a CMD DOS-box (click Start, type in CMD, hit Enter), typing
set path=c:\fciv;%path%
and then typing
fciv -?
Deciphering the unix-style help would be a tad difficult for your average Windows user, but then “Windows security” is an oxymoron.

Gretchen July 5, 2012 9:30 PM

I have been a victim of hashing. I would like someone who can help me contact me via mail.

aakarsha October 27, 2012 2:53 AM

Can anybody tell me: if only the first 40 bits of SHA-1 are applied to the message m, then how can we find colliding pairs?
I have come to know that it is possible to find colliding pairs for this reduced SHA-1.

Brian Beuning February 28, 2017 9:14 AM

Amazon sells bitcoin mining machines that do 10 TH/s (tera-hash) or 2^43 hash per second.
Bitcoin uses SHA-2 instead of SHA-1, but this shows you the speed dedicated hardware can do hashing.
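
A quick sketch of what that rate means for the 2005 attack’s work factor:

# A 10 TH/s miner does ~2^43 hashes per second, so 2^69 operations take
seconds = 2**69 / 2**43              # = 2^26 seconds
print(seconds / (60 * 60 * 24))      # ~777 days, about two years on one unit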

Gol D Roger May 25, 2020 7:35 PM

I made it here. And i will lead this passage to the very end.”GOL D ROGER”
