Entries Tagged "hashes"

Page 4 of 6

MD6 Withdrawn from SHA-3 Competition

In other SHA-3 news, Ron Rivest seems to have withdrawn MD6 from the SHA-3 competition. From an e-mail to a NIST mailing list:

We suggest that MD6 is not yet ready for the next SHA-3 round, and we also provide some suggestions for NIST as the contest moves forward.

Basically, the issue is that in order for MD6 to be fast enough to be competitive, the designers have to reduce the number of rounds down to 30-40, and at those rounds, the algorithm loses its proofs of resistance to differential attacks.

Thus, while MD6 appears to be a robust and secure cryptographic hash algorithm, and has much merit for multi-core processors, our inability to provide a proof of security for a reduced-round (and possibly tweaked) version of MD6 against differential attacks suggests that MD6 is not ready for consideration for the next SHA-3 round.

EDITED TO ADD (7/1): This is a very classy withdrawal, as we expect from Ron Rivest—especially given the fact that there are no attacks on it, while other algorithms have been seriously broken and their submitters keep trying to pretend that no one has noticed.

EDITED TO ADD (7/6): From the MD6 website:

We are not withdrawing our submission; NIST is free to select MD6 for further consideration in the next round if it wishes. But at this point MD6 doesn’t meet our own standards for what we believe should be required of a SHA-3 candidate, and we suggest that NIST might do better looking elsewhere. In particular, we feel that a minimum “ticket of admission” for SHA-3 consideration should be a proof of resistance to basic differential attacks, and we don’t know how to make such a proof for a reduced-round MD6.

Posted on July 1, 2009 at 2:27 PMView Comments

New Attack on AES

There’s a new cryptanalytic attack on AES that is better than brute force:

Abstract. In this paper we present two related-key attacks on the full AES. For AES-256 we show the first key recovery attack that works for all the keys and has complexity 2119, while the recent attack by Biryukov-Khovratovich-Nikolic works for a weak key class and has higher complexity. The second attack is the first cryptanalysis of the full AES-192. Both our attacks are boomerang attacks, which are based on the recent idea of finding local collisions in block ciphers and enhanced with the boomerang switching techniques to gain free rounds in the middle.

In an e-mail, the authors wrote:

We also expect that a careful analysis may reduce the complexities. As a preliminary result, we think that the complexity of the attack on AES-256 can be lowered from 2119 to about 2110.5 data and time.

We believe that these results may shed a new light on the design of the key-schedules of block ciphers, but they pose no immediate threat for the real world applications that use AES.

Agreed. While this attack is better than brute force—and some cryptographers will describe the algorithm as “broken” because of it—it is still far, far beyond our capabilities of computation. The attack is, and probably forever will be, theoretical. But remember: attacks always get better, they never get worse. Others will continue to improve on these numbers. While there’s no reason to panic, no reason to stop using AES, no reason to insist that NIST choose another encryption standard, this will certainly be a problem for some of the AES-based SHA-3 candidate hash functions.

EDITED TO ADD (7/14): An FAQ.

Posted on July 1, 2009 at 11:49 AMView Comments

Ever Better Cryptanalytic Results Against SHA-1

The SHA family (which, I suppose, should really be called the MD4 family) of cryptographic hash functions has been under attack for a long time. In 2005, we saw the first cryptanalysis of SHA-1 that was faster than brute force: collisions in 269 hash operations, later improved to 263 operations. A great result, but not devastating. But remember the great truism of cryptanalysis: attacks always get better, they never get worse. Last week, devastating got a whole lot closer. A new attack can, at least in theory, find collisions in 252 hash operations—well within the realm of computational possibility. Assuming the cryptanalysis is correct, we should expect to see an actual SHA-1 collision within the year.

Note that this is a collision attack, not a pre-image attack. Most uses of hash functions don’t care about collision attacks. But if yours does, switch to SHA-2 immediately. (This has more information on this, written for the 269 attack.)

This is why NIST is administering a SHA-3 competition for a new hash standard. And whatever algorithm is chosen, it will look nothing like anything in the SHA family (which is why I think it should be called the Advanced Hash Standard, or AHS).

Posted on June 16, 2009 at 12:21 PMView Comments

Forging SSL Certificates

We already knew that MD5 is a broken hash function. Now researchers have successfully forged MD5-signed certificates:

Molnar, Appelbaum, and Sotirov joined forces with the European MD5 research team in mid-2008, along with Swiss cryptographer Dag Arne Osvik. They realized that the co-construction technique could be used to simultaneously generate one normal SSL certificate and one forged certificate, which could be used to sign and vouch for any other. They purchased a signature for the legitimate certificate from an established company that was still using MD5 for signing, and then applied the legitimate signature to the forged certificate. Because the legitimate and forged certificates had the same MD5 value, the legitimate signature also marked the forged one as acceptable.

Lots and lots more articles, and the research.

This isn’t a big deal. The research is great; it’s good work, and I always like to see cryptanalytic attacks used to break real-world security systems. Making that jump is often much harder than cryptographers think.

But SSL doesn’t provide much in the way of security, so breaking it doesn’t harm security very much. Pretty much no one ever verifies SSL certificates, so there’s not much attack value in being able to forge them. And even more generally, the major risks to data on the Internet are at the endpoints—Trojans and rootkits on users’ computers, attacks against databases and servers, etc—and not in the network.

I’m not losing a whole lot of sleep because of these attacks. But—come on, people—no one should be using MD5 anymore.

EDITED TO ADD (12/31): While it is true that browsers do some SSL certificate verification, when they find an invalid certificate they display a warning dialog box which everyone—me included—ignores. There are simply too many valid sites out there with bad certificates for that warning to mean anything. This is far too true:

If you’re like me and every other user on the planet, you don’t give a shit when an SSL certificate doesn’t validate. Unfortunately, commons-httpclient was written by some pedantic fucknozzles who have never tried to fetch real-world webpages.

Posted on December 31, 2008 at 1:39 PMView Comments

More SHA-3 News

NIST has published all 51 first-round candidates in its hash algorithm competition. (Presumably the other submissions—we heard they received 64—were rejected because they weren’t complete.) You can download the submission package from the NIST page. The SHA-3 Zoo is still the best source for up-to-date cryptanalysis information.

Various people have been trying to benchmark the performance of the candidates, but—of course—results depend on what metrics you choose.

And there’s news about Skein’s performance. And two Java implementations. (Does anyone want to do an implementation of Threefish?) In general, the Skein website is the place to go for up-to-date Skein information.

Posted on December 11, 2008 at 1:16 PMView Comments

Skein and SHA-3 News

There are two bugs in the Skein code. They are subtle and esoteric, but they’re there. We have revised both the reference and optimized code—and provided new test vectors—on the Skein website. A revision of the paper—Version 1.1—has new IVs, new test vectors, and also fixes a few typos in the paper.

Errata: Version 1.1 of the paper, reference, and optimized code corrects an error in which the length of the configuration string was passed in as the size of the internal block (256 bits for Skein-256, 512 for Skein-512, and 1024 for Skein-1024), instead of a constant 256 bits for all three sizes. This error has no cryptographic significance, but affected the test vectors and the initialization values. The revised code also fixes a bug in the MAC mode key processing. This bug does not affect the NIST submission in any way.

NIST has received 64 submissions. (This article interviews one of the submitters, who is fifteen.) Of those, 28 are public and six have been broken. NIST is going through the submissions right now, making sure they are complete and proper. Their goal is to publish the accepted submissions by the end of the month, in advance of the Third Cryptographic Hash Workshop to be held in Belgium right after FSE in February. They expect to quickly make a first cut of algorithms—hopefully to about a dozen—and then give the community about a year of cryptanalysis before making a second cut in 2010.

Lastly, this is a really nice article on Skein.

These submissions make some accommodation to the Core 2 processor. They operate in “little-endian” mode (a quirk of the Intel-like processors that reads some bytes in reverse order). They also allow a large file to be broken into chunks to split the work across multiple processors.

However, virtually all of the contest submissions share the performance problem mentioned above. The logic they use won’t optimally fit within the constraints of a Intel Core 2 processor. Most will perform as bad or worse than the existing SHA-1 algorithm.

One exception to this is Skein, created by several well-known cryptographers and noted pundit Bruce Schneier. It was designed specifically to exploit all three of the Core 2 execution units and to run at a full 64-bits. This gives it roughly four to 10 times the logic density of competing submissions.

This is what I meant by the Matrix quote above. They didn’t bend the spoon; they bent the crypto algorithm. They moved the logic operations around in a way that wouldn’t weaken the crypto, but would strengthen its speed on the Intel Core 2.

In their paper (PDF), the authors of Skein express surprise that a custom silicon ASIC implementation is not any faster than the software implementation. They shouldn’t be surprised. Every time you can redefine a problem to run optimally in software, you will reach the same speeds you get with optimized ASIC hardware. The reason software has a reputation of being slow is because people don’t redefine the original problem.

That’s exactly what we were trying to do.

EDITED TO ADD (11/20): I wrote an essay for Wired.com on the process.

Posted on November 19, 2008 at 6:14 AMView Comments

U.S. Court Rules that Hashing = Searching

Really interesting post by Orin Kerr on whether, by taking hash values of someone’s hard drive, the police conducted a “search”:

District Court Holds that Running Hash Values on Computer Is A Search: The case is United States v. Crist, 2008 WL 4682806 (M.D.Pa. October 22 2008) (Kane, C.J.). It’s a child pornography case involving a warrantless search that raises a very interesting and important question of first impression: Is running a hash a Fourth Amendment search? (For background on what a “hash” is and why it matters, see here).

First, the facts. Crist is behind on his rent payments, and his landlord starts to evict him by hiring Sell to remove Crist’s belongings and throw them away. Sell comes across Crist’s computer, and he hands over the computer to his friend Hipple who he knows is looking for a computer. Hipple starts to look through the files, and he comes across child pornography: Hipple freaks out and calls the police. The police then conduct a warrantless forensic examination of the computer:

In the forensic examination, Agent Buckwash used the following procedure. First, Agent Buckwash created an “MD5 hash value” of Crist’s hard drive. An MD5 hash value is a unique alphanumeric representation of the data, a sort of “fingerprint” or “digital DNA.” When creating the hash value, Agent Buckwash used a “software write protect” in order to ensure that “nothing can be written to that hard drive.” Supp. Tr. 88. Next, he ran a virus scan, during which he identified three relatively innocuous viruses. After that, he created an “image,” or exact copy, of all the data on Crist’s hard drive.

Agent Buckwash then opened up the image (not the actual hard drive) in a software program called EnCase, which is the principal tool in the analysis. He explained that EnCase does not access the hard drive in the traditional manner, i.e., through the computer’s operating system. Rather, EnCase “reads the hard drive itself.” Supp. Tr. 102. In other words, it reads every file-bit by bit, cluster by cluster-and creates a index of the files contained on the hard drive. EnCase can, therefore, bypass user-defined passwords, “break down complex file structures for examination,” and recover “deleted” files as long as those files have not been written over. Supp. Tr. 102-03.

Once in EnCase, Agent Buckwash ran a “hash value and signature analysis on all of the files on the hard drive.” Supp. Tr. 89. In doing so, he was able to “ingerprint” each file in the computer. Once he generated hash values of the files, he compared those hash values to the hash values of files that are known or suspected to contain child pornography. Agent Buckwash discovered five videos containing known child pornography. Attachment 5. He discovered 171 videos containing suspected child pornography.

One of the interesting questions here is whether the search that resulted was within the scope of Hipple’s private search; different courts have approached this question differently. But for now the most interesting question is whether running the hash was a Fourth Amendment search. The Court concluded that it was, and that the evidence of child pornography discovered had to be suppressed:

The Government argues that no search occurred in running the EnCase program because the agents “didn’t look at any files, they simply accessed the computer.” 2d Supp. Tr. 16. The Court rejects this view and finds that the “running of hash values” is a search protected by the Fourth Amendment.

Computers are composed of many compartments, among them a “hard drive,” which in turn is composed of many “platters,” or disks. To derive the hash values of Crist’s computer, the Government physically removed the hard drive from the computer, created a duplicate image of the hard drive without physically invading it, and applied the EnCase program to each compartment, disk, file, folder, and bit.2d Supp. Tr. 18-19. By subjecting the entire computer to a hash value analysis-every file, internet history, picture, and “buddy list” became available for Government review. Such examination constitutes a search.

I think this is generally a correct result: See my article Searches and Seizures in a Digital World, 119 Harv. L. Rev. 531 (2005), for the details. Still, given the lack of analysis here it’s somewhat hard to know what to make of the decision. Which stage was the search—the creating the duplicate? The running of the hash? It’s not really clear. I don’t think it matters very much to this case, because the agent who got the positive hit on the hashes didn’t then get a warrant. Instead, he immediately switched over to the EnCase “gallery view” function to see the images, which seems to be to be undoudtedly a search. Still, it’s a really interesting question.

Posted on November 5, 2008 at 8:28 AMView Comments

The Skein Hash Function

NIST is holding a competition to replace the SHA family of hash functions, which have been increasingly under attack. (I wrote about an early NIST hash workshop here.)

Skein is our submission (myself and seven others: Niels Ferguson, Stefan Lucks, Doug Whiting, Mihir Bellare, Tadayoshi Kohno, Jon Callas, and Jesse Walker). Here’s the paper:

Executive Summary

Skein is a new family of cryptographic hash functions. Its design combines speed, security, simplicity, and a great deal of flexibility in a modular package that is easy to analyze.

Skein is fast. Skein-512—our primary proposal—hashes data at 6.1 clock cycles per byte on a 64-bit CPU. This means that on a 3.1 GHz x64 Core 2 Duo CPU, Skein hashes data at 500 MBytes/second per core—almost twice as fast as SHA-512 and three times faster than SHA-256. An optional hash-tree mode speeds up parallelizable implementations even more. Skein is fast for short messages, too; Skein-512 hashes short messages in about 1000 clock cycles.

Skein is secure. Its conservative design is based on the Threefish block cipher. Our current best attack on Threefish-512 is on 25 of 72 rounds, for a safety factor of 2.9. For comparison, at a similar stage in the standardization process, the AES encryption algorithm had an attack on 6 of 10 rounds, for a safety factor of only 1.7. Additionally, Skein has a number of provably secure properties, greatly increasing confidence in the algorithm.

Skein is simple. Using only three primitive operations, the Skein compression function can be easily understood and remembered. The rest of the algorithm is a straightforward iteration of this function.

Skein is flexible. Skein is defined for three different internal state sizes—256 bits, 512 bits, and 1024 bits—and any output size. This allows Skein to be a drop-in replacement for the entire SHA family of hash functions. A completely optional and extendable argument system makes Skein an efficient tool to use for a very large number of functions: a PRNG, a stream cipher, a key derivation function, authentication without the overhead of HMAC, and a personalization capability. All these features can be implemented with very low overhead. Together with the Threefish large-block cipher at Skein core, this design provides a full set of symmetric cryptographic primitives suitable for most modern applications.

Skein is efficient on a variety of platforms, both hardware and software. Skein-512 can be implemented in about 200 bytes of state. Small devices, such as 8-bit smart cards, can implement Skein-256 using about 100 bytes of memory. Larger devices can implement the larger versions of Skein to achieve faster speeds.

Skein was designed by a team of highly experienced cryptographic experts from academia and industry, with expertise in cryptography, security analysis, software, chip design, and implementation of real-world cryptographic systems. This breadth of knowledge allowed them to create a balanced design that works well in all environments.

Here’s source code, text vectors, and the like for Skein. Watch the Skein website for any updates—new code, new results, new implementations, the proofs.

NIST’s deadline is Friday. It seems as if everyone—including many amateurs—is working on a hash function, and I predict that NIST will receive at least 80 submissions. (Compare this to the sixteen NIST submissions received for the AES competition in 1998.) I expect people to start posting their submissions over the weekend. (Ron Rivest already presented MD6 at Crypto in August.) Probably the best place to watch for new hash functions is here; I’ll try to keep a listing of the submissions myself.

The selection process will take around four years. I’ve previously called this sort of thing a cryptographic demolition derby—last one left standing wins—but that’s only half true. Certainly all the groups will spend the next couple of years trying to cryptanalyze each other, but in the end there will be a bunch of unbroken algorithms; NIST will select one based on performance and features.

NIST has stated that the goal of this process is not to choose the best standard but to choose a good standard. I think that’s smart of them; in this process, “best” is the enemy of “good.” My advice is this: immediately sort them based on performance and features. Ask the cryptographic community to focus its attention on the top dozen, rather than spread its attention across all 80—although I also expect that most of the amateur submissions will be rejected by NIST for not being “complete and proper.” Otherwise, people will break the easy ones and the better ones will go unanalyzed.

EDITED TO ADD (10/30): Here is a single website for all information, including cryptanalysis, of all the SHA-3 submissions. A spoke to a reporter who told me that, as of yesterday, NIST had received 30 submissions. And three news articles about Skein.

Posted on October 29, 2008 at 6:35 AMView Comments

Using Google to Crack Hashed Passwords

Clever:

…I thought it would be interesting to find out the account password. WordPress stores raw MD5 hashes in the user database…. As with any respectable hash function, it is believed to be computationally infeasible to discover the input of MD5 from an output. Instead, someone would have to try out all possible inputs until the correct output is discovered.

[…]

Instead, I asked Google. I found, for example, a genealogy page listing people with the surname “Anthony”, and an advert for a house, signing off “Please Call for showing. Thank you, Anthony”. And indeed, the MD5 hash of “Anthony” was the database entry for the attacker. I had discovered his password.

Posted on November 23, 2007 at 6:07 AMView Comments

Sidebar photo of Bruce Schneier by Joe MacInnis.