Bounty to Recover NIST’s Elliptic Curve Seeds

This is a fun challenge:

The NIST elliptic curves that power much of modern cryptography were generated in the late ’90s by hashing seeds provided by the NSA. How were the seeds generated? Rumor has it that they are in turn hashes of English sentences, but the person who picked them, Dr. Jerry Solinas, passed away in early 2023 leaving behind a cryptographic mystery, some conspiracy theories, and an historical password cracking challenge.

So there’s a $12K prize to recover the hash seeds.
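Mechanically, the challenge is a pre-image search: find a string whose SHA-1 digest equals one of the published curve seeds. A minimal sketch of the check (the hex value is the published P-256 seed from FIPS 186; the candidate phrases are, of course, just guesses):

```python
import hashlib

# Published seed for the NIST P-256 curve (FIPS 186), in hex.
P256_SEED = "c49d360886e704936a6678e1139d26b7819f7e90"

def check_candidates(candidates, seed=P256_SEED):
    """Return the first phrase whose SHA-1 digest matches `seed`, else None."""
    for phrase in candidates:
        if hashlib.sha1(phrase.encode("utf-8")).hexdigest() == seed:
            return phrase
    return None

# A few (almost certainly wrong) guesses around the rumoured phrase:
guesses = ["Give Jerry a raise", "Give Jerry a raise.", "give jerry a raise"]
print(check_candidates(guesses))  # None, unless we get very lucky
```

The real bounty search would run this over enormous candidate sets (sentence variations, iterated hashes, different encodings), but the core check is exactly this one comparison.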

Some backstory:

Some of the backstory here (it’s the funniest fucking backstory ever): it’s lately been circulating (though this may have been somewhat common knowledge among practitioners, it definitely wasn’t to me) that the “random” seeds for the NIST P-curves, generated in the 1990s by Jerry Solinas at NSA, were simply SHA-1 hashes of some variation of the string “Give Jerry a raise”.

At the time, the “pass a string through SHA1” thing was meant to increase confidence in the curve seeds; the idea was that SHA1 would destroy any possible structure in the seed, so NSA couldn’t have selected a deliberately weak seed. Of course, NIST/NSA then set about destroying its reputation in the 2000s, and this explanation wasn’t nearly enough to quell conspiracy theories.

But when Jerry Solinas went back to reconstruct the seeds, so NIST could demonstrate that the seeds really were benign, he found that he’d forgotten the string he used!

If you’re a true conspiracist, you’re certain nobody is going to find a string that generates any of these seeds. On the flip side, if anyone does find them, that’ll be a pretty devastating blow to the theory that the NIST P-curves were maliciously generated—even for people totally unfamiliar with basic curve math.

Note that these are not the constants used in the Dual_EC_DRBG random-number generator that the NSA backdoored. This is something different.

Posted on October 12, 2023 at 7:09 AM • 26 Comments

Comments

Clive Robinson October 12, 2023 10:09 AM

@ Bruce, ALL,

From the article intro,

“Rumor has it that they are in turn hashes of English sentences”

I would say this is highly probable, because it’s what humans do (as I’ve noted before about the XKCD password system[1]).

One way it happens is you,

1, Specify a system
2, Build and test in a spiral
3, Document
4, Tidy up
5, Release.

Note step 2 and its implications, which is why step 4 is there.

The main implication of step 2, as far as this goes, is,

“Everything has to be reproducible”

So it can be,

“Tested and Evaluated”

So yes, an easy-to-remember throwaway phrase like,

“Green ham and eggs make little kids scream”

Which has only a little over 63 bits of equivalent entropy, gets “hashed” to give a faux 160 bits of entropy, which gives you a “seed”.

The probability is that it also got hashed more than once, thus increasing the perceived “work factor” without giving any more entropy.
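The point is easy to demonstrate: iterating the hash raises the work factor, but the output remains a fixed function of the input phrase, so no entropy is added. A hypothetical sketch:

```python
import hashlib

def iterated_sha1(phrase: str, rounds: int) -> bytes:
    """Apply SHA-1 repeatedly; this adds work factor, never entropy."""
    digest = phrase.encode("utf-8")
    for _ in range(rounds):
        digest = hashlib.sha1(digest).digest()
    return digest

# Same input and round count always give the same 160-bit (20-byte) output:
assert iterated_sha1("Green ham and eggs make little kids scream", 3) == \
       iterated_sha1("Green ham and eggs make little kids scream", 3)
```

Whatever the round count, an attacker who can enumerate the input phrases only pays the extra rounds as a constant multiplier on each guess.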

We know we should not do it, but it’s a triple “least path of resistance”,

A, Fast to do.
B, Easy to secure.
C, Easy to reproduce.

Oh so seductively attractive…

The trouble is, as we all know, “problems arise” and “marketing/management upgrade the specification”, thus “time scales slip”…

So steps 3 and 4 get robbed of time. As step 3 has “easily visible product”, cutting time there is a lot harder than at step 4…

So the required level of “Tidy up” never happens, and the world continues to spin, albeit a little more eccentrically than before, and we move into a more wobbly future.

[1] My first guess would be along the lines of

“This nonsense is above my pay grade so they get what they pay for.”

Hashed 2310 times.

Which would give around 100bits of faux entropy.

Winter October 12, 2023 10:51 AM

@Clive

Which has only a little over 63 bits of equivalent entropy, gets “hashed” to give a faux 160 bits of entropy, which gives you a “seed”.

So this should be an easy bounty to win, I presume?

The original source of the hashes has been unable to find the original sentences again, but he might not have looked very hard, or he may just be hiding their TRUE origin.

Chelloveck October 12, 2023 12:27 PM

On the flip side, if anyone does find them, that’ll be a pretty devastating blow to the theory that the NIST P-curves were maliciously generated—even for people totally unfamiliar with basic curve math.

I think the original author is giving conspiracy theorists a little too much credit. Why should they accept “proof” that the seeds are benign when they can just invent another layer to the conspiracy? I mean sure, “Give Jerry a raise” hashed 666 times yields a solution, but what if that’s just a coincidence? I mean, hash any phrase enough times and you can generate any seed you want, right? (I don’t know if that’s a true statement, but fortunately that doesn’t matter to the conspiracy theorist.)

iAPX October 12, 2023 12:56 PM

@Chelloveck, ALL

I mean, hash any phrase enough times and you can generate any seed you want, right?

With a perfect hashing algorithm (which doesn’t exist and probably never will), yes.
But not with SHA1…

Secondly, as Bruce Schneier reported, there is a trust problem with the NSA and the numerous algorithms it backdoored; we may include SHA-0 among them, SHA-1 being a direct derivative.

Thus we can’t trust work where magic numbers are used without proof that those numbers are really random and don’t have any specifically chosen properties that might weaken security.

Winter October 12, 2023 1:35 PM

@Chelloveck

I mean, hash any phrase enough times and you can generate any seed you want, right?

An interesting problem. Ideally, we would like this to be true: that you start with a seed, repeat SHA-1 2^160 times, and come back to where you started after visiting every number between 0 and 2^160 − 1.

But that will not work. Starting with any hash and repeating SHA-1 will, in general, lead you into a much smaller ring (and you may never return to your starting point at all).

How small?

I seem to remember that this problem is related to the birthday paradox: that the average length of a ring before it returns to the start is the square root of 2^160, i.e., 2^80. I do not know the proof (or even whether it is actually true).

Whatever it is, it is rarely the case that you can generate a specific hash value from a different plaintext simply by repeating SHA-1 often enough.
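The ring behaviour described above can be seen at toy scale: truncate SHA-1 to 24 bits and iterate, and the walk falls onto a cycle whose length is on the order of sqrt(2^24) ≈ 4096, matching the birthday-paradox intuition. A sketch (the 3-byte truncation is chosen purely so it runs quickly):

```python
import hashlib

def h(x: bytes, width: int = 3) -> bytes:
    """SHA-1 truncated to `width` bytes: a toy stand-in for the full function."""
    return hashlib.sha1(x).digest()[:width]

def tail_and_cycle(start: bytes):
    """Iterate h from `start`; return (steps before entering the cycle, cycle length)."""
    seen = {}  # value -> step index at which it was first reached
    x, i = start, 0
    while x not in seen:
        seen[x] = i
        x = h(x)
        i += 1
    return seen[x], i - seen[x]

tail, cycle = tail_and_cycle(b"seed")
print(tail, cycle)  # typically each a few thousand for a 24-bit state
```

For full 160-bit SHA-1 the same shape holds, just with cycles around 2^80 long, far too long to walk but far shorter than the full 2^160 space.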

Izac Hampton October 12, 2023 1:48 PM

iAPX,

“I mean, hash any phrase enough times and you can generate any seed you want, right?” — With a perfect hashing algorithm (which doesn’t exist and probably never will), yes. But not with SHA1…

I think that’s true even for a “perfect” hash, whatever that means; and SHA-1 is still only broken in terms of collisions, which are irrelevant for this purpose. The point was better made in the linked Hacker News thread: Jerry could’ve hashed any number of similar statements (or unrelated but meaningful English sentences), and picked the one with whatever hash was best for the NSA’s nefarious purpose—if indeed they had one. Remember that their unexplained “magic numbers” for DES actually fixed a vulnerability, and it was almost 20 years before the public figured that out.

Similar concerns have been raised about “nothing up my sleeve numbers” in general: why’d someone choose 1/π or π/4 rather than π, or π instead of e or ϕ? It was also pointed out in the aforementioned thread, however, that nobody’s come up with a specific and realistic scenario by which the NSA could’ve created a “nobody but us” backdoor in a NIST curve; nor has anyone found any plausible weakness in the curves. Especially now that we’ve got “complete” formulas for dealing with them (“Complete addition formulas for prime order elliptic curves” by Renes et al., 2015).

I don’t have the mathematical knowledge to evaluate such claims either way. I know that Bruce wrote, in 2013, “I no longer trust the constants. I believe the NSA has manipulated them through their relationships with industry.” How has that statement held up over the past decade? Cryptographers have spent the intervening time looking for backdoors or weaknesses in anything involving the NSA or NIST, and we’ve had the new addition formulas for almost as long. To me, the NIST curves aren’t looking quite as bad as they once did; then again, they’re not looking nearly as important either, now that we’ve got Ed25519 and Ed448.

D Curry October 12, 2023 2:19 PM

Bruce, thank you for this blog. I’ve followed it for years and routinely recommend it to young people who aspire to a career in computer security. Could you please be mindful to avoid profanity on the home page?

Izac Hampton October 12, 2023 2:42 PM

By the way, a number is neither random nor non-random. We should be writing and thinking of “random number-generators” rather than “random-number generators”. In other words, it’s the number-selection process that’s randomized, not the numbers.

There’s a real cognitive bias to be avoided here. For example, people think of the number 17 as “more random” than 1, even though they’re equally likely from a d20 (a fair 20-sided die numbered with each integer in the interval [1,20]). Similarly, proving that a seed came from hashing an English sentence doesn’t really prove very much, because we don’t know why someone chose any particular sentence, even if it doesn’t “look random”.

We really need some kind of pre-announcement to prove honesty in such a scenario. These days, we could say “the seed will be the SHA3-512 of Bitcoin block 264,000 followed by whatever Ethereum block is most recent at that time” (one good use for cryptocurrency, right?). Twenty years ago, we could’ve named 20 popular international newspapers and said the seed would be the SHA-1 of the top headlines of the first editions dated with a specific date, in the order the papers were listed, each encoded in UTF-8 and followed by a single U+000A. That, or we pick the biggest or smallest acceptable number, or agree on a small set of acceptable “nothing up my sleeve numbers” and stop dividing them by small integers or whatever.
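The pre-announcement idea is just a public commitment to a derivation procedure before its inputs exist. A hedged sketch of the newspaper variant (the headlines below are invented placeholders; in the real scheme they would come from the first editions of the pre-announced papers on the agreed date, in the pre-announced order):

```python
import hashlib

def seed_from_headlines(headlines):
    """SHA-1 of the headlines in their pre-announced order,
    each UTF-8 encoded and followed by a single U+000A."""
    data = b"".join(h.encode("utf-8") + b"\n" for h in headlines)
    return hashlib.sha1(data).hexdigest()

# Hypothetical placeholder headlines, purely for illustration:
print(seed_from_headlines(["Example Headline One", "Example Headline Two"]))
```

Because the procedure is fixed before the headlines are printed, anyone can recompute the seed afterwards, and the proposer has no freedom to grind for a favourable value.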

Clive Robinson October 12, 2023 3:39 PM

@ Winter, ALL,

Re: On reversing a “One Way Function” (OWF).

“So this should be an easy bounty to win, I presume?”

In one theory yes, in another theory no. I’ve had this conversation before on this blog, a long time ago, with one of the more famous academic posters… And they saw it differently.

The question you have to ask about the entropy is,

“How big is the set to search?”

Of whichever set you choose to search.

Mostly people make the mistake of talking about the output or “visible to the observer” set, because that’s what you need to consider for the upper bound “Brute Force Search”.

But what if you know the size of the very much smaller input “plaintext set”? How much would the search be reduced?

It can be a very important question, and why, amongst other reasons, the counter is needed.

To see why you need to understand a very important point,

“No matter how often you apply a deterministic process like a hash, even if it is an OWF, you do not increase the entropy, only the confusion and diffusion (see Shannon).”

If you look back using “magic pixie dust” as a search term on this blog, you will see I make this point from time to time. The reason is all basic encryption functions end up creating “A Simple Substitution Algorithm”, thus are breakable by much simpler methods than “Brute Force Searches” if you can use them.

The example I use is a “Command Line Interface”(CLI) program that is “line encrypted” by AES with a 256-bit key, and you can only see the line-encrypted output. If you look at the “Brute Force Set” size, it’s 2^256 according to many… But it’s not, especially if you know the “Plaintext Set” size: it is in reality only {Y,N} at its smallest, thus has a set size of two, i.e. one bit of information content or entropy. Thus knowing the way the CLI program works is sufficient to break the system in as little as two key strokes by the user.

And I won’t “name names”, but this happened in early “Shareware” line encryption programs.

The solution is to use the basic encryption algorithm in a “mode” such that a repeated key press will always produce a different “Ciphertext” output set.

In effect you turn the basic encryption algorithm into not just a mixer for the simple substitution cipher, but into a “KeyText Generator” of a “stream cipher” as well. How you do this does not usually matter a great deal, as long as it is a secure method.

Anyway, that is why the “counter” variable is there. But… in this case the method of stream generation is important, because we know the initial starting state, so we are only doing a simple substitution break on the “plaintext” input set.

Which boils down to a “simple search” on the “plaintext input set” size, which is 2^m (crudely, a somewhat reduced message information bit length). Which will be very much smaller than the same search on the “ciphertext output set” would be.
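The CLI example above can be illustrated without a real cipher: model deterministic “line encryption” with a keyed hash, and note that a {Y, N} plaintext set yields only two distinct ciphertexts, while adding a counter (a stream-cipher-style mode) makes repeated keypresses encrypt differently. A toy sketch, using HMAC-SHA-256 as a stand-in for the block cipher (the key is a placeholder):

```python
import hmac, hashlib

KEY = b"demo-key"  # hypothetical shared key, for illustration only

def ecb_like(line: bytes) -> bytes:
    """Deterministic encryption: the same plaintext always gives the same ciphertext."""
    return hmac.new(KEY, line, hashlib.sha256).digest()

def ctr_like(line: bytes, counter: int) -> bytes:
    """Counter mode: the keystream depends on the counter, so repeats differ."""
    keystream = hmac.new(KEY, counter.to_bytes(8, "big"), hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(line.ljust(32, b"\0"), keystream))

# Deterministic mode: a {Y, N} plaintext set gives exactly two ciphertexts,
# so an observer faces a set of size two, not 2^256.
assert len({ecb_like(b"Y"), ecb_like(b"N"), ecb_like(b"Y")}) == 2
# Counter mode: the same "Y" encrypts differently each time.
assert ctr_like(b"Y", 0) != ctr_like(b"Y", 1)
```

The counter doesn’t add entropy either; it only stops the ciphertext set from collapsing to the size of the plaintext set, which is exactly the distinction the comment draws.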

JonKnowsNothing October 13, 2023 12:08 AM

@Vuk

re: One way for NSA to find new recruits.

By conscription maybe, otherwise by black-listing…

Whatever Lola wants, Lola gets
And little man, little Lola wants you
Make up your mind to have (up your mind to have)
No regrets (no regrets)
Recline yourself, resign yourself
You’re through

Don’t you know you can’t win?
(Can’t win, you’ll never, never win)
You’re no exception to the rule
I’m irresistible, you fool
Give in (give in you’ll never win)

Give in (give in you’ll never win)
Give in

written by Richard Adler and Jerry Ross for the 1955 musical play Damn Yankees.

===

https://en.wikipedia.org/wiki/Conscription

  • state-mandated enlistment of people in a national service

https://en.wikipedia.org/wiki/Whatever_Lola_Wants

Winter October 13, 2023 1:16 AM

@Clive

“So this should be an easy bounty to win, I presume?”

In one theory yes, in another theory no.

To summarize your essay, it is not easy to win this bounty.

Clive Robinson October 13, 2023 8:03 AM

@ Winter,

Re : The odds are not good.

“To summarize your essay, it is not easy to win this bounty.”

Actually, you might have noticed my explanation was not finished, as I did not want the temporal grief of the auto-mod.

But essay, hmm… Concepts can be hard to explain if you don’t know what the other person knows, and you want to be inclusive of more than one reader (we see this every time password/phrase finding/strength and the fun issues with the “XKCD method” arise on this blog).

If I’d simply said that the effort to find the input-to-output mapping through the One Way Function of the hash involved an unknown search size that cannot be quantified, BUT is “vastly smaller than most think”… both you and I therefore know what would happen (with even a “Learned Scholar” not getting it right “in practice” due to “theoretical assumptions”).

It has some curiosities, in that “many inputs map to a single output” (a consequence of the input set size being unbounded but the output set size bounded, like an odometer/trip-meter/counter overflowing through zero and up again). Whilst these do improve the probability that a “random search” will actually do better than a “sequential search”, they do not help the practical search (due to the statistics of the input set and its generation).

But my conclusion was going to be,

“You can not win, only finish.”

Because the prize money will not be equal to the cost of the search, and it will be yet another environmental disaster in the making.

But also it’s imponderable like the “Defence Spending” issue.

But to be honest, like our host @Bruce saying he did not trust the NIST Curves, I think,

“The mutt in the room (NSA) stinks of its own vomit.”

There are, shall we say, cleaner ‘curves’ in other, more modern rooms the ‘Inquisición española’ can play with.

Winter October 13, 2023 8:35 AM

@Clive

Xkcd password “trick”

The problem with Randall’s idea was that people automated it with a fixed vocabulary. However, it works on any person’s mental lexicon, which does not completely overlap with the official dictionary, or even any dictionary.

If someone selects a seed phrase from personal experience, like “Give Jerry a raise”, then all bets are off when trying to estimate the probabilities, and hence the entropy. A dictionary search cannot find words that are not in your dictionary. And everyone uses some words that have never appeared in print.

Clive Robinson October 13, 2023 9:32 AM

@ Winter, ALL,

Re : XKCD Method issues.

“The problem with the idea of Randall was that people automated it with a fixed vocabulary.”

That’s not the problem with it, the problem is actually,

“The failings of the human mind”.

The words in the XKCD dictionaries should be regarded as “letters in its alphabet”, as that is how they are actually being used.

As we know “random character passwords” fill the potential “password space” any given character string width provides. However human memorable character strings represent a tiny tiny tiny fraction of that “password space”.

So on being presented with the output word list of the XKCD Method generator many humans will,

1, Reject and click again.
2, Rearrange the words.
3, Do both.

The result is that the actual number of passwords/phrases is much much smaller.

To see why, write down the hundred numbers in two-digit format from “00 … 99”. Then remove those with double digits, then reorder the digits into numerical order (so 73 becomes 37) and then remove the duplicates… How many of the original hundred survive?

Now, when you get a list of five of those remaining numbers randomly selected, again remove duplicates and re-order in numerical order: how many can there be out of the potential 10,000,000,000?
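The exercise can be checked directly: removing doubles and sorting digits leaves the 45 unordered digit pairs, and five-number selections collapse to C(45,5) distinct outcomes out of the naive 100^5 = 10,000,000,000. A quick check:

```python
from math import comb

# Two-digit strings "00".."99": drop doubles, sort the digits, deduplicate.
survivors = {"".join(sorted(f"{n:02d}"))
             for n in range(100)
             if f"{n:02d}"[0] != f"{n:02d}"[1]}
print(len(survivors))           # 45 of the original 100 survive

# Five distinct survivors, with order and duplicates removed:
print(comb(len(survivors), 5))  # 1221759, out of a potential 10,000,000,000
```

So the human “tidying” habits collapse a ten-billion-element space to about 1.2 million, a reduction of roughly 13 bits.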

Winter October 13, 2023 9:44 AM

@Clive

The words in the XKCD dictionaries should be regarded as “letters in its alphabet”, as that is how they are actually being used.

But do you know my alphabet? Do you know my mental lexicon? If I select words not in the (a) dictionary, how do you guess them?

And, a word in English is not the same as a word in German or Finnish.
Donaudampfschifffahrtskapitänsmütze is an actual single German word.

Winter October 13, 2023 9:50 AM

Continued

@Clive

For more examples from the Donaudampfschiffahrtsgesellschaft see Wikipedia:

‘https://de.wikipedia.org/wiki/Donaudampfschiffahrtsgesellschaftskapit%C3%A4n

Clive Robinson October 13, 2023 1:20 PM

@ Winter,

Re : Password space and guessing.

“But do you know my alphabet? Do you know my mental lexicon? If I select words not in the (a) dictionary, how do you guess them?”

Well… The assumption used for a century or so is,

The enemy knows the system

Which brings around the curious question of,

“In the general or specific?”

If you think about it you are arguing in the “specific” but the XKCD system is in the general.

Think about your argument as being that of “the pig-pen symbols” -v- “the dancing men” symbols, or even Tolkien’s ‘Cirth’ rune symbols, etc. It’s just a way of converting a bit string into a visual representation. As such it does not change the fundamental maths of the system.

Which is,

1, Randomly generate a string of bits.
2, Break up into a list of integers.
3, Use integers as pointers into a fixed sized ordered list (dictionary).
4, Output the list entries so as to preserve,
4.1 the number of bits,
4.2 the states of the bits,
4.3 the order the bits are in,
of the random generation output.
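Those four steps can be sketched as follows (the word list is a tiny placeholder standing in for a real dictionary of thousands of entries; `secrets` supplies the random bits from a CSPRNG):

```python
import secrets

# Tiny placeholder dictionary; a real one would have thousands of entries.
WORDLIST = ["correct", "horse", "battery", "staple",
            "green", "ham", "eggs", "scream"]

def xkcd_passphrase(n_words: int) -> str:
    """Steps 1-4: random bits -> integers -> dictionary indices -> ordered words."""
    # secrets.randbelow draws each index from a CSPRNG, so the output
    # preserves the bit count, bit states, and bit order of the generator.
    return " ".join(WORDLIST[secrets.randbelow(len(WORDLIST))]
                    for _ in range(n_words))

print(xkcd_passphrase(4))
```

With this 8-word list each word carries log2(8) = 3 bits, and, crucially, that entropy survives only if the user keeps the words exactly as generated and in the generated order.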

You are arguing about the translation of those integers to glyphs or equivalent, which is a translation issue that does not, or at least should not, affect the output of the random generator, or the basic steps of the method.

I’m arguing that, with regards to the XKCD Method, humans, due to their own mental limitations, are munging things so that the all-important points in step four get reduced or even negated, such that in effect the password/phrase space is drastically reduced.

Which is the first step of realising that if given any freedom most humans impose their own limitations on the underlying systems and in the process reduce the set size.

When that is accepted, it can, from the attacker’s perspective, be realised as an optimisation of the attack, thus reducing not just the search space, but the resources required.

Which is what many successful password breaking systems do these days.

That is they go for the neck of the weak link which is “human failing”.

So in theory, if the stories are true, the user’s recorded language and behaviour can be analyzed and used to produce an optimisation stratagem that gets applied first.

Will it guarantee a faster solution, or even a correct solution?

No, but based on the way passwords/phrases fall as fast as they do these days under these methods it probably will.

Clive Robinson October 13, 2023 1:38 PM

@ ALL,

I suspect many have forgotten the “back story” behind the FaceBook Tor “hidden service” name of,

facebookwkhpilnemxj7asaniu7vnjjbiltxjqhye3mhbshg7kx5tfyd.onion

And what was involved with just getting the leading “facebook” in the name…

Well it’s effectively what winning this prize is going to require you to do, so,

https://en.m.wikipedia.org/wiki/Facebook_onion_address

Jarless Irony October 13, 2023 1:40 PM

“But when Jerry Solinas went back to reconstruct the seeds, so NIST could demonstrate that the seeds really were benign, he found that he’d forgotten the string he used.”

Come now, do you really expect us to believe that a) the NSA expert in charge of generating the NIST seeds didn’t document the generation process somewhere, and b) the NSA, with all its skill, background knowledge about Dr. Solinas, and 25 years of time and computing power, has not been able to recover them?

More likely they know but won’t say. Perhaps testing to see if and who manages to crack the puzzle.

lurker October 13, 2023 11:11 PM

@Jarless

I’ve got 2¢ says Jerry played dumb just to prove he was human like us, and his bosses think that’s a good game. After all a $12k bet is peanuts to them if it gets them free labour to show the curves are currently uncrackable.

Winter October 14, 2023 5:01 AM

@Clive

Think about your argument as being that of “the pig-pen symbols” -v- “the dancing men” symbols, or even Tolkien’s ‘Cirth’ rune symbols, etc. It’s just a way of converting a bit string into a visual representation. As such it does not change the fundamental maths of the system.

And that fundamental math explains why there are still undeciphered writing systems [1]. To decipher a writing system you need a shared language with shared words.

Your attacks are all based on a comprehensive list of words. No such list is even possible, as new words appear every day. Every family and every neighborhood has its own words and names.

If you can convince people to use a fixed vocabulary, all passphrases can be found by brute force. If you go outside such a vocabulary, and you have to brute force individual words and phrases, things get more time consuming.

Finding the seed phrase in this bounty costs too much, you say. But if a 5-word phrase indeed has only around ~50 bits of entropy, that should not take long, and should not cost that much.

[1] ‘https://en.m.wikipedia.org/wiki/Undeciphered_writing_systems

denton scratch October 14, 2023 5:51 AM

@Isac Hampton

Jerry could’ve hashed any number of similar statements (or unrelated but meaningful English sentences), and picked the one with whatever hash was best for the NSA’s nefarious purpose

If you know what number the boss was hoping for, why bother hashing sentences? Why not just jump directly to the answer?

I suppose the theory must be that the sentence-choosing and hashing amounts to a “Look, nothing up my sleeves” manoeuvre. BUT THAT WON’T WASH IF YOU FORGET THE BLOODY SENTENCE.

Izac Hampton October 14, 2023 12:15 PM

denton scratch,

If you know what number the boss was hoping for, why bother hashing sentences? Why not just jump directly to the answer?

Quoting Filippo: “The NIST elliptic curves that power much of modern cryptography were generated in the late [’90s] by hashing seeds provided by the NSA.” And tptacek from Hacker News: “At the time, the ‘pass a string through SHA1’ thing was meant to increase confidence in the curve seeds; the idea was that SHA1 would destroy any possible structure in the seed, so NSA couldn’t have selected a deliberately weak seed. … But when Jerry Solinas went back to reconstruct the seeds, so NIST could demonstrate that the seeds really were benign, he found that he’d forgotten the string he used!”

In other words: we hash an English sentence to get a seed, then hash the seed to get a curve. And we know the seeds, so if we assume (reasonably, I think) that the NSA didn’t have a usable pre-image attack for SHA-1 twenty-six years ago, they couldn’t have gone directly to a specific curve. They could’ve instead told us “here are the curves” without explanation, but the published seeds were indeed part of a “‘Look, nothing up my sleeves’ manoeuvre” as you say. Note that we were only ever told “on the record” that the seeds were generated “verifiably at random” (though evidently not “verifiably” by us); the “English sentences” thing is a rumour.

The most realistic remaining risk is that the NSA knows a class of weak curves of which the public is unaware. Perhaps one in a trillion curves is weak in a way the NSA finds useful. That may seem like a slim chance, but to a cryptographer it’s only a 40-bit search space. The NSA would “just” need to randomly generate about a trillion English sentences—or a trillion integers if you doubt Jerry’s story—to find one. The problem with this conspiracy theory is that it doesn’t produce a “nobody but us” backdoor. The US government could never use such a curve for “classified” data, because they’d expect one of the NSA’s unfriendly foreign counterparts to crack it quickly. Likewise, the government wouldn’t have used SHA-1 if they’d known of a pre-image attack, and by now we’d presumably know.

The official story is that all this stuff was approved for, and used for, protecting “Top Secret” communications. And, though cryptographers are rightfully suspicious of unexplained constants, I’m not aware of any having documented any type of weakness or backdoor that such an agency might have looked for; few people believe these agencies are twenty-six years ahead of the public. (By contrast, look up the public reverse-engineering of the Streebog and DES S-boxes. There’s strong evidence they were specially crafted.)

I’m treating this as a “just for fun” competition. Whatever the result, some people will remain suspicious, and that’s fine since internet standards now have good alternate curves available. (Personally, I think the post-quantum stuff is a better target of our ire. Really, NIST doesn’t want hybrid modes, contrary to basically everyone else and given the types of failures we saw during the late stages of that competition?)

Clive Robinson October 14, 2023 4:50 PM

@ Izac Hampton, ALL,

Re : PQC and long poles

“Personally, I think the post-quantum stuff is a better target of our ire. Really, NIST doesn’t want hybrid modes, contrary to basically everyone else and given the types of failures we saw during the late stages of that competition?”

It’s not just hybrids of “pre and post” QC crypto they do not want.

They also do not want hybrids of either “pre and pre”, or “post and post” algorithms either.

Which kind of makes me more than suspicious, thus I sniff for the rodent.

Especially as we know that the NSA fritzed with the NIST AES competition, with the result that we had an algorithm “strong in theory” but very “fragile in practice” when incautiously/unknowingly implemented. It had many side channels leaking information to the network[1] (way more than Eliza’s bucket). Thus it was the “fast software”, which the competition rules made freely available from the NIST site, that quickly ended up in code libraries and thus in many implementations, some of which are still in use…

The NSA, as NIST’s legally required crypto advisors, were well aware of the side channel issue and should have prevented it.

Instead the NSA approved AES for “secret” but only for “Data at rest” thus revealing they knew darn well it had significant side channel issues.

[1] Such time-based side channels were an old game to the NSA back in the 1980s. Peter Wright wrote about an acoustic side channel that was actually a time-based side channel in his Spycatcher book; MI5 gave it to GCHQ, who passed it back to the NSA. But exploiting such channels goes back long before the NSA or GCHQ existed; using an oscilloscope on a “telex line” to strip machine-generated “super encipherment” was known before WWII. Look up Benjamin ‘Pat’ Bayly and why he developed the Rockex for the UK DWS and had to modify it several times (TEMPEST issues of a similar nature).
