## Skein Collision Competition

Xkcd had a Skein collision competition. The contest is over -- Carnegie Mellon University won, with 384 (out of 1024) mismatched bits -- but it's explained here.

Posted on April 5, 2013 at 6:35 AM • 14 Comments

Xkcd had a Skein collision competition. The contest is over -- Carnegie Mellon University won, with 384 (out of 1024) mismatched bits -- but it's explained here.

Posted on April 5, 2013 at 6:35 AM • 14 Comments

Alexandra • April 5, 2013 7:52 AM

The competition is far to be over, it's ongoing....

Anon • April 5, 2013 8:36 AM

I'm afraid that it explains very little. There is more information on the xkcd forums though.

Personally, I saw a blank image on 1 April at xkcd, and moved on with my life. I didn't realise that I was meant to have JS enabled...

Brian • April 5, 2013 8:48 AM

For those of you who were initially as confused as I was (reading the stackoverflow explanation that it was "like Bitcoin" first was my main problem), what they're looking for is a collision of ANY bits (ie, Hamming distance).

The Bitcoin comment initially made me think they were looking for the longest colliding substring in the hash, making CMU's result kind of astonishing ;) Although finding 640 bits that match anywhere in the hashes isn't too shabby either.

Jake • April 5, 2013 9:07 AM

Is this somehow related to bitcoin "mining"?

Christian • April 5, 2013 11:28 AM

Tried it too ... But randomly choosing 4 english words from a dictionary didn't get me more than the wrong bits guaranteed from randomness.

Gweihir • April 5, 2013 10:39 PM

640 matching bits out of 1024 is actually not that good. Pure random gives you 512. Then you can do hill-climbing, i.e. flip a random bit, see if the results get better. Initially, the result will get better in about 50% of your tries. That will get worse fast if you have more correct bits. If 1 bit does not get you anything, you repeat that with two. And so on. As there will be dead ends and ends where you cannot continue due to effort, you need to cut a branch when you have tried up to a certain number of bits, e.g., 32 bit. Memory management is probably tricky, but basically this is an elementary DFS problem. Doing it BFS is probably not a good idea, as you cannot distinguish the quality of one improvement from another in most cases and it takes much, much more memory.

So it is not like they actually brute-forced 132 bit. Progress gets exponentially (my guess, not tried to prove it) harder with each bit you get. Still an interesting idea though. And xkcd is quite cool!

Roger • April 6, 2013 2:44 AM

@Gweihir:

Hill-climbing doesn't help at all. Any half-decent hash exhibits avalanche, so that a 1 bit input change, on average, changes exactly the same number of output bits as scrapping your guess and trying something else entirely.

Unless you know a weakness in Skein, the only way to proceed is brute force. However, scores on the order of 640 do not require 2^640 work. Because of the way the score is assessed, it will be distributed as the binomial function.

That is, to get a score of around 640 requires effort of around 1/( 0.5^1024 x C(1024, 640)) where C(x, y) is the combinatorial function.

Roger • April 6, 2013 3:03 AM

And incidentally, by Stirling's approximation for the factorial, we can evaluate that as approximately 2^47.

An impressive chunk of work. Not a record-breaker by any means, but a vast amount of computrons (and waste heat, CO2 emissions, etc.) poured down the toilet for an April Fool's joke...

Charles • April 6, 2013 10:22 AM

Did they win by superior mathematics?

Or did they win by being privileged to use CPUs they don't own to burn carbon they didn't pay for at a faster rate than the other competitors?

Could students at Kampala University have competed on equal terms?

My culture describes this sort of stunt as a a WOMBAT. Waste of money, brains and time.

So: What was learned?

Gweihir • April 9, 2013 2:07 AM

@Roger:

The hash itself you cannot climb.

But there has to be some way to tell how many bits are correct, and you can do hill-climbing on that. Otherwise you would either have the full plain-text or nothing of it, but not a certain number of "correct" bits.

eyesoars • April 9, 2013 5:18 PM

@Gweihir:

Roger is right. A good hash should flip about half the bits any time a single bit in the input is changed. And which half of the bits change should be random in the face of different flipped bits.

So you can hill-climb, but it shouldn't be any better than brute force.

Arthur • April 9, 2013 10:10 PM

@Gweihir: The challenge seems to have worked like this:

Randal chooses a secret text P, computes C = H(P), and posts C publicly. He keeps P secret.

Competitors construct and submit texts T such that H(T) shares as many bits as possible with H(P).

The submitted text T will never bear any resemblance to Randal's original P, unless someone manages to guess P exactly. Contrariwise, it is possible for someone to submit a T such that H(T)=H(P) but T≠P, and win the contest outright.

The challenge is **NOT** for competitors to submit texts T such that T shares as many bits as possible with P. That would be silly.

Topo • April 11, 2013 9:59 PM

But what I am still not understanding is how this is feasible at all. Were there any constraints on P (in Author's example)? If not I have to start with an arbitrary T of arbitrary length. It seems unlikely that I would have many bits the same at all.

@Roger: I think you forgot the √(2πn) term, the expected number of hashes for an at-least-384-bit match is 2^50.6.

@Charles: Exactly, the goal of the comic was to produce externalities, both negative (wasting CPU time and vandalizing Wikipedia articles) and positive (donating to Wikimedia). 49,000$ were donated, and a few thousand dollars worth of electricity was wasted.

What was learned? Well, it proved even a stupid contest with no reward can get people to WOMBAT, and it probably gave some incentive to a few people to learn the basics and play with a computing cluster. People from Kampala had no chance to win, but they could have spent a few cents on EC2 to learn how to control a cluster.

@Topo: Yes, it's very unlikely. The number of wrong bits approximately follows a normal distribution of mean 512 and standard deviation 16. 384 bits is 8 standard deviations away! That's why they had to try millions of billions of random hashes before finding such a match.

Subscribe to comments on this entry

Photo of Bruce Schneier by Per Ervland.

Schneier on Security is a personal website. Opinions expressed are not necessarily those of IBM Resilient.

## Leave a comment