Bruce Schneier | |||||||||||||||
Schneier on SecurityA blog covering security and security technology. « Social Security Numbers are Not Random | Main | Friday Squid Blogging: Humboldt Squid Invasion » July 24, 2009SHA-3 Second Round Candidates AnnouncedNIST has announced the 14 SHA-3 candidates that have advanced to the second round: BLAKE, Blue Midnight Wish, CubeHash, ECHO, Fugue, Grøstl, Hamsi, JH, Keccak, Luffa, Shabal, SHAvite-3, SIMD, and Skein. In February, I chose my favorites: Arirang, BLAKE, Blue Midnight Wish, ECHO, Grøstl, Keccak, LANE, Shabal, and Skein. Of the ones NIST eventually chose, I am most surprised to see CubeHash and most surprised not to see LANE. Here's my 2008 essay on SHA-3. Here's NIST's SHA-3 page. And here's the page on my own submission, Skein. Posted on July 24, 2009 at 12:15 PM • 19 Comments To receive these entries once a month by e-mail, sign up for the Crypto-Gram Newsletter. Who next will be voted off the island? Stay tune on for the next episode of Schneiervor. Posted by: Stephen Smoogen at July 24, 2009 12:47 PM I am also surprised that LANE is not on the list. I thought that the LANE submission documentation was very thorough and contained excellent analysis. Perhaps NIST felt that they had enough AES-based submissions, though. Overall, this seems to be a good choice of second round candidates. Congratulations, Skein Team, on making it to the next round! Posted by: Grande Mocha at July 24, 2009 12:58 PM One may speculate that NIST saw 3 solid AES-based submissions (ECHO, LANE, Grøstl) with very similar performance characteristics and security analysis, and that one of them was dropped more or less arbitrarily. A pity. I think LANE would have made a fine standard. At this point I guess it's time to start placing bets for top 5. My gut feeling says that unless breakthroughs are made, Skein, CubeHash, Grøstl and Keccak will be with us also in a year. Posted by: Ghoti at July 24, 2009 1:14 PM Guess I should dig out our altivec improved skein, it made a difference on 32-bit powerpc. Posted by: Johannes Berg at July 24, 2009 3:36 PM Ah, it is public, just not cleaned up enough yet. http://git.sipsolutions.net/skein-altivec.git/ Posted by: Johannes Berg at July 24, 2009 3:39 PM Re: CubeHash, djb mentioned something about CubeHash8/32 at the end of his slides at one of the conferences, right? He's allowed to tweak the algorithm until September 15, so he might make it a more realistic competitor by lowering the insane safety factor. I'll repeat my earnest wish to see a tree mode of Skein as the default. The next generation of Intel processors will have 256-bit SIMD, meaning four Threefish encryptions at once, per core. Ain't nothin' to sneeze at. Posted by: Randall at July 24, 2009 5:56 PM But wouldn't the overhead of splitting up the job among four processors swallow up most or all the advantage of getting them going at once? It seems like the parallel processing advantages of tree hashes only starts paying off when you have huge messages and large numbers of processors. Posted by: memory at July 24, 2009 8:17 PM @memory: with a SIMD instruction set like SSE2, you don't deal with multiple processors. One processor performs an identical task on multiple chunks of data in parallel: "XOR these four words onto these four" or "add these four words to these four." It's used in the real world: in the Crypto++ implementation of Salsa20, for instance, and in some SHA-3 candidates including CubeHash and SIMD. I sent the Skein team some code implementing Skein in SSE2 that does two Threefish encryptions at once and throws one away; it takes about 20 cpb. (That's much slower than their best 64-bit implementation (6 cpb), but faster than other 32-bit implementations, which is why it's useful. (Also, I owe the Skein team some cleanup on that code...)) If my code weren't throwing the results of one encryption away, it would be getting 10 cpb on 32-bit machines. You could exploit that speed in a tree hash or when, say, using Threefish to encrypt in counter mode. If you scale that up to 256-bit SIMD in Intel's next-gen processors and do four encryptions at once, then you'd be getting 5 cpb. On the other hand, those processors might have other improvements that make the non-SIMD implementation of Skein faster. GPUs go way beyond 256-bit SIMD. Intel's Larrabee GPU (due next year? vaporware?) is supposed to have 512-bit SIMD in each core -- eight encryptions at once -- and existing GPUs can execute the exact same code on many sets of registers at once really efficiently. I wouldn't assume that GPGPU or far-future multicore speedups are all hogwash -- people will continue to want to process data more quickly and clock rates aren't going to keep on doubling forever, so we'll eventually figure out how to make that kind of parallel computation more practical. But SIMD is something that you can use *now*. (Parallel ASIC implementations are also a short-term practical win.) If nothing else, it's an easy way to nearly double Skein's speed under 32-bit OSes! Posted by: Randall at July 24, 2009 10:47 PM @memory: Some key qualifications to what I just said. 1) SSE2 Skein would be slower than the current 64-bit implementation of Skein, even for a parallelized Skein. That's because SSE2 tragically lacks a rotate instruction. That's not guaranteed to change w/the next gen of processors and their larger SIMD register sizes. 2) Larrabee and GPGPU are promising but unproven. 3) 5 cpb is a common-sense estimate; these future processors don't exist so I can't benchmark. Still. Tree mode: interesting. Posted by: Randall at July 24, 2009 11:00 PM @Randall - very interesting; got a blog or article somewhere in more detail for us SIMD newbies to think about? Posted by: will at July 25, 2009 2:00 AM @will: Nothing of my own, but here are some links: Intro: Reference: Fun speculation on SIMD and the future: Relevant Wikipedia articles: Some software using SSE2: SHA-3 and SIMD: Posted by: Randall at July 25, 2009 10:55 PM @Randall AMD's version of AVX, the 256-bit wide successor of Intel's SSE[X], called XOP, will have SIMD rotations. Posted by: dongs at July 26, 2009 5:55 PM @dongs: Looks like rotations are only there for the 128-bit registers -- see the "Shift/rotate..." bullet point at http://forums.amd.com/devblog/blogpost.cfm?... . (Would be awesome if there were rotates on 256-bit regs, though. If you see something more specific than that post, would love to hear.) Still, the current SSE2 Skein/CubeHash spend a lot of time rotating, so 128-bit registers with rotates may be faster than 256-bit registers without. Posted by: Randall at July 26, 2009 11:04 PM @moderator: I submitted a big old list of links for @will as a comment, and MT probably blocked it as spam. Since will asked and it's relevant (has to do with crypto performance, which is important to SHA-3), I'd love to see it unblocked. Or e-mail me and I'll put up a post on Blogger somewhere. Posted by: Randall at July 26, 2009 11:06 PM It appears you're right, it is 128-bit only. On a related note, I've written and experimented a little with Threefish a few months back (it's in portuguese): http://blol.org/932-threefish-part-ii In 32 bit x86, SSE2 helps a lot with the 64 additions. The end result was 256bit key/block Threefish at ~9 cpb in CTR mode. Posted by: dongs at July 28, 2009 5:49 PM There is any one made the Threefish Encryption and Decryption algorithms in VHDL?? Posted by: Anas at February 5, 2010 1:11 PM Subscribe to comments on this entry Post a comment
Powered by Movable Type. Photo at top by Geoffrey Stone.
Schneier.com is a personal website. Opinions expressed are not necessarily those of BT. |
|
Comments