Wanted: Skein Hardware Help
As part of NIST’s SHA-3 selection process, people have been implementing the candidate hash functions on a variety of hardware and software platforms. Our team has implemented Skein in Intel’s 32 nm ASIC process, and got some impressive performance results (presentation and paper). Several other groups have implemented Skein in FPGA and ASIC, and have seen significantly poorer performance. We need help understanding why.
For example, a group led by Brian Baldwin at the Claude Shannon Institute for Discrete Mathematics, Coding and Cryptography implemented all the second-round candidates in FPGA (presentation and paper). Skein performance was terrible, but when they checked their code, they found an error. Their corrected performance comparison (presentation and paper) has Skein performing much better and in the top ten.
We suspect that the adders in all the designs may not be properly optimized, although there may be other performance issues. If we can at least identify (or possibly even fix) the slowdowns in the design, it would be very helpful, both for our understanding and for Skein’s hardware profile. Even if we find that the designs are properly optimized, that would also be good to know.
A group at George Mason University led by Kris Gaj implemented all the second-round candidates in FPGA (presentation, paper, and much longer paper). Skein had the worst performance of any of the implementations. We’re looking for someone who can help us understand the design, and determine if it can be improved.
Another group, led by Stefan Tillich at University of Bristol, implemented all the candidates in 180 nm custom ASIC (presentation and paper). Here, Skein is one of the worst performers. We’re looking for someone who can help us understand what this group did.
Three other groups—one led by Patrick Schaumont of Virginia Tech (presentation and paper), another led by Shin’ichiro Matsuo at National Institute of Information and Communications Technology in Japan (presentation and paper), and a third led by Luca Henzen at ETH Zurich (paper with appendix, and conference version)—implemented the SHA-3 candidates. Again, we need help understanding how their Skein performance numbers are so different from ours.
We’re looking for people with FPGA and ASIC skills to work with the Skein team. We don’t have money to pay anyone; co-authorship on a paper (and a Skein polo shirt) is our primary reward. Please send me e-mail if you’re interested.
Grande Mocha • September 1, 2010 1:40 PM
I suspect that you won’t see great performance coming out of FPGA models of Skein. I don’t know of any existing FPGA products which have 64-bit adders as basic building blocks, so the 64-bit adders in Skein are being built from the existing smaller adders plus some FPGA cells needed for the routing and carry propagation.
On the other hand, your ASIC models were probably using Intel’s highly optimized 64-bit adders (one member of the Skein team is an Intel employee, right?), so those numbers are going to be substantially faster than anything that comes from an FPGA.