Cryptanalysis of SIMON-32/64

A weird paper was posted on the Cryptology ePrint Archive (working link is via the Wayback Machine), claiming an attack against the NSA-designed cipher SIMON. You can read some commentary about it here. Basically, the authors claimed an attack so devastating that they would only publish a zero-knowledge proof of their attack. Which they didn't. Nor did they publish anything else of interest, near as I can tell.

The paper has since been deleted from the ePrint Archive, which feels like the correct decision on someone's part.

Posted on May 14, 2019 at 6:11 AM • 18 Comments

Comments

arfanarfMay 14, 2019 8:02 AM

If they didn't publish anything of interest then why are you drawing attention to this paper?

Have I missed some significance?

Natanael LMay 14, 2019 10:28 AM

@arfanarf, most likely because it already managed to draw some attention. People have probably already asked him about it as well. So if anybody wants to know Schneier's opinion on this paper on this controversial NSA cipher, here it is.

VenMay 14, 2019 10:28 AM

@arfanarf Possibly because it's not a real attack on SIMON-32/64, but it is a real attack on the Security, Cryptographic, and Academic communities?

TIARA GNOMEMay 14, 2019 11:35 AM

It's a silly stunt.

On page 9 they use a Hebrew name for God, YHWH, and the given names of the authors are a mix of Matthew, Mark, Luke, and John.

Saying thank you to a certain Mr. Popov makes one think of Popov Vodka.

WhiskerInMenloMay 14, 2019 2:16 PM

“It feels like the correct decision....”

Does this imply that it is real, is a fake, agents of doom knocked on doors..?

Obfuscated names or pseudonym make it hard to just ask but if this is true the authors could have been tracked down and silenced. Nice cash, or a hole in a peat bog... or legal gag orders.

There is enough idle computing power in a parked Tesla to make this interesting.

Quantum machines and the interesting life of data are on a collision course faster than 12 years. Encrypt your data well...

Bruce SchneierMay 14, 2019 2:24 PM

@arfanarf

"If they didn't publish anything of interest then why are you drawing attention to this paper?"

I thought about this, and decided that it was worth commenting on. I also linked to the paper from the Wayback Machine, in case people wanted to examine it themselves.

WeatherMay 14, 2019 3:00 PM

They know a ,Y, from left to bottom they brute force a Value, but the right ,round keys, isn't worked out,

Is there anywhere for computer code?

Jeff EplerMay 14, 2019 4:22 PM

I show one way that the authors could have generated "Table 6" without having any interesting cryptanalysis of SIMON. I've posted a program on my blog that can generate "Table 6" rows at a rate of hundreds or thousands per minute on any modest computer.

A terrible troll of a paper, but it was a fun exercise to write this small program.

ThunderbirdMay 14, 2019 5:57 PM

Mr. Epler, when I read what they claimed they were doing (their block "Algorithm 1"), they said they were choosing two blocks of 8 contiguous bytes, not two blocks each the concatenation of two blocks of 4 contiguous bytes. Looking at their examples, most of them look like they actually would be contiguous in the text. Clearly the time complexity of that would be much greater than the examples you posted.

Am I misunderstanding something? Are the examples they posted not actually contiguous text from the source document?

Clive RobinsonMay 14, 2019 6:37 PM

@ arfanarf,

If they didn't publish anything of interest then why are you drawing attention to this paper?

There is another aspect that has gone unmentioned on this thread but has been mentioned before on this blog. SIMON is a "hardware" block cipher that is at best suspect for a number of reasons including it is known to have to small a security margin to be considered even remotely "future proof". As a hardware engineer I can see that there are "side channel" leakage issues kikely to arise by those "not sufficiently practiced in the security arts"... A stunt the NSA pulled back with the original AES competition. Worse various tricks were tried to get SIMON / SPECK it into various standards, most unsuccessfully. All of which makes the normally cautious open community, down right suspicious.

This means there could be other reasons than those that have been suggested above. Not just for SIMON but for it's sister "software" block algorithm SPECK.

It is possible SIMONS designers, their employers (NSA) or those close to them are running what is in effect a disinformation campaign as part of a "finessing strategy". That is push out a number of nonsense claims it's weak and disprove them, as well as claims no weaknesses have been found. So a faux "trial by fire" all done before trying to push through SIMON or Son-of-SIMON through a standards process.

You can read more at,

https://en.m.wikipedia.org/wiki/Simon_(cipher)

Jeff EplerMay 14, 2019 7:25 PM

@Thunderbird,

That's a fair question.

I made my implementation choice because the text is unclear, but (A) they talk about blocks of the form Pi',Pi'. My intuition was that there were so few such 8-byte blocks that they must mean were drawing 4 bytes at a time, and (B) some of the blocks such as "LORDhard", don't exist in the pg10.txt I have access to (even after removing all CRLF characters, in case it existed across a line break), but appear to be two 4-byte blocks mashed together. ("The LORD hardened Pharoah's heart" does appear, but not "LORDhard"; other examples show there was not a general removal of whitespace) (C) A lot of blocks don't exist under either interpretation because they contain characters like "{" and sequences like "[him" which are not in pg10.txt at all, so I can play whatever game I like.

(A) depending how you process and normalize pg10.txt there are perhaps 46 out of 893792 8-grams that are repeated 4-grams. This is 1 in 19,430 so it might be a situation that arises when (as they claim) they did N=1000 attempts of Algorithm 1, but probably not so frequently that they couldn't just ignore such results as a post-processing step.

* "process and normalize" -- It's a CRLF text file but no CRs or LFs appear in their chosen texts. But there are, say, about 11x as many LF characters as "L"s. And yet here we are with "LORDhard".

If you "really" want 8-byte sequences, then my method will admittedly take quite a bit longer. However, Thomas Pornin on stack exchange estimates a rate of 1 key per 2 days at 2^25 encryptions per second.

fishMay 19, 2019 9:51 PM

@TIARA GNOME
> Saying thank you to a certain Mr. Popov makes one think of Popov Vodka.

They thank Oleg Popov, who just so happens to be a professional circus clown.

oilulioMay 25, 2019 3:21 AM

In terms of the lack of matches with the Project Gutenberg reference cited, using instead the text extracted from this PDF, dropping the preface, and replacing line breaks with nulls, I get an exact match for 31 of the 36 eight character sequences shown in the paper, including “LORDhard” which results from a line break compaction. There are 1,175,390 unique sequences.

Specifically that text includes curly brackets around chapter and verse numbering e.g. {2:23} (unlike the cited Project Gutenberg version), and therefore matches the sequences in Table 6 that the cited version could not. It also includes square brackets, again unlike the cited version. Crucially all the sequences shown in the paper with square brackets match exactly. The five non matches have logical corruptions that could be explained by different line breaks – e.g. “k uponhi” does not match, but “k upon hi” would, perhaps as part of “look upon him”, which is in the text.

Were there to be a large number of ‘mutated’ sequences, it might have been supposed that the authors were increasing their chances of matches by including extra sequences, e.g. those both with and without spaces. But for 31 true matches out of 36, this effect would have been small at best and it seems more likely they had obtained the same text with slightly different line breaks.

So, poor referencing is indicated – but nothing that would materially alter the result.

More interestingly there are patterns in the keys found and shown in Table 6. The most obvious pattern is “0x000? ?... .... ....” which is shared by four keys, and where the dots are either 0,1,2,3 i.e. just the lowest two bits in the nibble. This is suggestive of the idea that the authors have searched over a small search space, deliberately spread out to ensure that the keys are not mostly zero, which would make the small search space obvious. Of the 11 dots, there are therefore only 22 bits of entropy in that part of the key rather than the 44 available.

Other keys are grouped in a subtler way, but employing the same principles. Five keys share the base pattern “0x???? ?C40 80C4 C804”. Note these nibbles have only their top two bits set. The specific keys, as before, are formed by only then varying the lowest two bits in each nibble (as well as setting the ‘?’ to some value). There are several groups like this, as well as one key that seems unrelated.

oilulioMay 25, 2019 3:24 AM

Reference PDF in last post should have been www.lambfold.org.uk/sitebuildercontent/sitebuilderfiles/kjvbible.pdf

Leave a comment

Allowed HTML: <a href="URL"> • <em> <cite> <i> • <strong> <b> • <sub> <sup> • <ul> <ol> <li> • <blockquote> <pre>

Sidebar photo of Bruce Schneier by Joe MacInnis.