The Security of the Fortuna PRNG

Providing random numbers on computers can be very difficult. Back in 2003, Niels Ferguson and I designed Fortuna as a secure PRNG. Particularly important is how it collects entropy from various processes on the computer and mixes it all together.

While Fortuna is widely used, there hadn't been any rigorous analysis of the system until now. A new paper by Yevgeniy Dodis, Adi Shamir, Noah Stephens-Davidowitz, and Daniel Wichs provides a theoretical model for entropy collection and PRNGs. They analyze Fortuna, find it good but not optimal, and then provide their own optimal system.

Excellent, and long-needed, research.

Posted on March 11, 2014 at 6:28 AM • 35 Comments

Error? • March 11, 2014 7:52 AM

Good morning Bruce. The Tags link to Fortuna isn't a link. Or maybe this isn't an error?

Ryan

Bruce, any thoughts on their use of AES instead of a secure hash function with a larger state (and therefore larger pools), as you chose in the original Fortuna design?

IANA cryptographer, but using a 128-bit permutation, whose outputs never repeat under a given key, just *feels* wrong in a cryptographic PRNG, no matter how often it is re-keyed. Isn't such a construction inherently biased by the fact that outputs cannot repeat over short intervals?

To overcome this, you can of course use only, say, 64 bits of each AES output, but then you've halved any throughput gains made by using AES in the first place.
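Ryan's worry can be made concrete at toy scale: a block cipher under one key is a permutation, so its outputs never collide, whereas a truly random function's outputs do; the two only become distinguishable as the number of outputs approaches the birthday bound of about 2^(n/2). A minimal sketch with a hypothetical 8-bit "block" standing in for AES's 128 bits:

```python
import random

def sample_outputs(n_bits, n_samples, as_permutation, seed=0):
    """Draw outputs from a random permutation (a toy stand-in for a
    block cipher under one fixed key) or from a random function."""
    rng = random.Random(seed)
    domain = 2 ** n_bits
    if as_permutation:
        table = list(range(domain))
        rng.shuffle(table)
        return [table[x] for x in range(n_samples)]
    return [rng.randrange(domain) for _ in range(n_samples)]

block_bits = 8                      # toy size; AES would be 128
q = 2 ** (block_bits // 2) * 8      # well past the birthday bound
perm_out = sample_outputs(block_bits, q, as_permutation=True)
func_out = sample_outputs(block_bits, q, as_permutation=False)

# The permutation's outputs are all distinct; the random function's
# collide. That gap is the distinguisher, and for AES it only shows up
# after on the order of 2^64 blocks, which is why frequent re-keying
# (as Fortuna does) keeps it harmless in practice.
distinct = (len(set(perm_out)), len(set(func_out)))
```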

wumpus • March 11, 2014 9:57 AM

@Ryan

While I think Fortuna was probably designed before this was important, current Intel (and AMD) chips can do AES in hardware. Getting SHA-256 in hardware currently involves buying Bitcoin-mining hardware, so expect a price premium. I'd expect using the first 64 bits with the hardware AES instructions (AES-NI) to be more effective, even if this means doing the operation twice (or four times, fed by an additional 64 bits from a separate AES buffer).

I admit I'm rather surprised at the use of SHA-256 in the entropy pools and AES as the "main hash". Sounds like multiple points of failure (mainly chances to get the algorithm wrong). I can only assume that SHA-256 wasn't fast enough when Fortuna was originally designed (and even if it is now, AES has a sufficient speed advantage for the output).

Note that the paper seemed to assume a source of perfect randomness. Even before the Intel Hardware RNG scandal, this was likely the tricky (and untestable/unprovable) part of the design.

Anura • March 11, 2014 12:01 PM

@Ryan

Block ciphers in counter mode have been well studied, and they should provide sufficient randomness as long as there are no distinguishing attacks.

That said, I'd love to see Keccak analyzed for this purpose; it seems like sponge functions are ideal for this task.

http://sponge.noekeon.org/SpongePRNG.pdf
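The counter-mode generator Anura describes is easy to sketch. This is a hedged toy, not Fortuna itself: HMAC-SHA-256 stands in for the block-cipher PRF so the example is self-contained, and the Fortuna-style step of re-keying after every request (so a later state compromise can't rewind past outputs) is included:

```python
import hmac, hashlib

class CtrPrng:
    """Toy counter-mode PRNG: output blocks are PRF(key, counter),
    with the key replaced after every request, Fortuna-style."""

    def __init__(self, seed: bytes):
        self.key = hashlib.sha256(seed).digest()
        self.counter = 0

    def _block(self) -> bytes:
        out = hmac.new(self.key, self.counter.to_bytes(16, "big"),
                       hashlib.sha256).digest()
        self.counter += 1
        return out

    def random_bytes(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            out += self._block()
        # Re-key so compromising the state later reveals nothing
        # about the bytes just produced (backtracking resistance).
        self.key = self._block()
        return out[:n]

gen = CtrPrng(b"example seed, not real entropy")
first, second = gen.random_bytes(32), gen.random_bytes(32)
```

With a real block cipher in place of the HMAC stand-in, distinguishing attacks against this structure reduce to distinguishing the cipher from a random permutation, which is Anura's point.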

matt

Why the heck isn't anyone talking about making random number generators from quantum noise? random.org used FM radio static into a sound card for years. Geiger counters with a small radioactive source (potassium is radioactive: salt substitute!) are inexpensive too. Makes me think the egos of software designers are more interested in solving a software problem than implementing a cheap, easy physical solution that actually would work!

Anura • March 11, 2014 12:11 PM

@matt

Systems like Fortuna are about collecting the entropy already available on existing systems, without adding new hardware to do so. It has nothing to do with egos; it has to do with the reality that pretty much everyone needs quality random numbers available on their system, and not everyone is going to buy specialized hardware random number generators.

matt is dumb • March 11, 2014 12:13 PM

Why the heck isn't anyone talking about making quantum noise generators to trick quantum-noise-based random number generators into delivering predictable results? People have had FM transmitters for years and people still trust that silly hack. Makes me think the egos of matts are more interested in "solving" a problem than actually solving the problem.

Macarthur • March 11, 2014 12:45 PM

I'm really glad that you summarized it, Mr. Schneier, because reading over the paper I could barely grasp the insanely high number of mathematical symbols. I hope that someone who understands it all can update Fortuna and make it more robust.

P.S. I tried to understand it, but honestly there's way too much crazy high-level stuff in it. I was glad that it wasn't that weird two-column layout I see in a lot of papers, but that math I don't know.

wumpus • March 11, 2014 2:28 PM

Matt,
The hardware generator in Intel processors uses thermal noise, which should be roughly as good as your "quantum noise". At current design rules, it may even qualify as "quantum noise" (some measurable bits' worth, at least). I'd expect even more specialized chips to be far less trustworthy.

I'll take drive-head travel and other proven oldies, although I'm not sure why it isn't more common to put a microphone near a fan, since most computers have microphone inputs built in (check the FFT for randomness, then hash into the needed chunks).
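The "hash into the needed chunks" step wumpus mentions can be sketched as follows; the fan audio is simulated here, the one-bit-per-sample entropy credit is a hypothetical (deliberately conservative) figure, and the FFT sanity check is omitted:

```python
import hashlib, math, random

def chunks_from_samples(samples, bits_per_sample=1, chunk_bits=256):
    """Hash raw, biased samples into fixed-size chunks, crediting only
    a conservative bits_per_sample of entropy per raw sample."""
    per_chunk = -(-chunk_bits // bits_per_sample)   # ceiling division
    chunks = []
    for i in range(0, len(samples) - per_chunk + 1, per_chunk):
        block = bytes(s & 0xFF for s in samples[i:i + per_chunk])
        chunks.append(hashlib.sha256(block).digest())
    return chunks

# Crude stand-in for mic-near-a-fan input: a strong periodic hum plus jitter.
rng = random.Random(3)
samples = [128 + int(100 * math.sin(i / 7)) + rng.randrange(-4, 5)
           for i in range(1024)]
chunks = chunks_from_samples(samples)
```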

shaping • March 11, 2014 3:10 PM

RE: matt/matt is dumb

Both of these characters could be NSA sock puppets aimed at undermining confidence in the use of physical random number generators as such.

The one thing the NSA fears more than anything else is a crypto system that they can't crack even in principle, like a one-time pad.

So they emphasize all the defects in such a system while downplaying the advantages.

This is where cryptography meets psychological shaping operations.

The goal is to tilt the future development of cryptographic systems in their favor, as they did with the BULLRUN program.

Jonathan Wilson • March 11, 2014 4:29 PM

I am surprised that we haven't seen support for SHAx in the latest CPUs, given how widely it's used.

Snarki, child of Loki • March 11, 2014 6:40 PM

Matt,
I suggest you give physical RNG a try. What you'll find is that it gives you a reasonable, but not perfect, source of entropy. So you still have to do the cryptographic post-processing to get good random numbers.

There are those who have tried physical random number generators, and those who advocate them. Those sets are nearly 100% disjoint.

Chris Abbott

I was considering using my own block cipher to make a PRNG, but I need more people to look at it first. It's a 512/32-round design, and I'm going to put it on SourceForge soon.

matt

Nope, not a sock puppet here. Semi-competent grouch?

After Fortuna has allowed the system to boot, given the low cost and high quality of physical RNG numbers, why would you want to use a pseudo-RNG at all for the rest of the time?

In a thousand years, can you guarantee that any pseudo-RNG algorithm won't be weakened or broken? Nope. A $12 FM radio has true perfect forward secrecy, and gets rid of a large chunk of very complicated code. It would not be hard for an EE to design one out of discrete components (apart from the USB or PCI interface chips), thereby avoiding compromise of the hardware, like the fuss over the Intel motherboard random number generator chip. Reverse-biased diode, amp, A/D converter, interface.

Thank you for all of your replies and explanations. Sorry about getting trollish earlier.

Carl 'SAI' Mitchell • March 11, 2014 8:11 PM

@matt

The issue is that you still need to normalize (whiten) the outputs of the TRNG. Physical generators invariably have biases, and cryptographic uses of random numbers don't tolerate such biases. So you use the TRNG to seed an entropy pool for a CSPRNG. Also, many physical sources open up an extra channel by which to attack the generator. Designing a hardware random number generator isn't that hard; making it work well takes some actual effort, and making it secure on its own is quite complex. A $12 FM radio is vulnerable to someone with a $20 FM transmitter.

@Chris Abbott

Your qualifications are what? You've broken which algorithms? Given how difficult making a secure cipher is, don't expect anyone to spend even 30 seconds looking over it unless you can show you are qualified to be producing such a thing. It takes a huge amount of effort to evaluate a cryptographic algorithm properly (years of work), and so no one will do that for free for someone with no significant history of being a good cryptanalyst.

Nick P • March 11, 2014 8:54 PM

Two nice Ritter surveys on TRNGs:

http://www.ciphersbyritter.com/RES/NOISE.HTM
http://www.ciphersbyritter.com/RES/RNGMACH.HTM

Geiger-counter-based designs are mentioned in one of the links. Here is one that specifically warns that it picks up background radiation, with $199 serial-port and $249 USB models. Connect its measurements to an entropy pool whose hash feeds a CRNG.

People wanting a software solution without an external device can always consider something like Haveged or DakaRND. These convert hardware timing fluctuations into random numbers. This type of generator's actual security isn't well understood, though.

The coolest one at one point was LavaRNG. They aimed a camera at a lava lamp to provide truly random input to feed a post-processing phase. They later got rid of the lava lamp, using just a camera and a box. Or something like that.

I just used dice and well-shuffled cards to produce enough entropy for a seed. I manually entered this into a program that produced a hashed master secret from it. This was fed into a CRNG to produce the actual random numbers (or other CRNGs' seed keys). An updated version gathered every bit of data, especially unique data, it could on my system into another hash image. There's usually also timing (a la Haveged) and network-sniffing data fed into another. These were combined with the master secret using hash functions, with the function run a number of times between each step. Sometimes I used different hash functions.
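The dice-and-cards approach above is easy to budget: a fair d6 roll carries log2(6), about 2.58 bits, and one shuffle of a 52-card deck carries log2(52!), about 225.6 bits. A quick arithmetic check:

```python
import math

def rolls_needed(target_bits=256, faces=6):
    """Fair dice rolls needed to accumulate target_bits of entropy,
    at log2(faces) bits per independent roll."""
    return math.ceil(target_bits / math.log2(faces))

dice_for_256 = rolls_needed()                  # rolls for a 256-bit seed
deck_bits = math.log2(math.factorial(52))      # entropy of one full shuffle
```

So one thorough shuffle plus a handful of rolls comfortably seeds a 256-bit master secret, assuming honest dice and a genuinely random shuffle.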

In a previous posting here I pointed out that the easiest method is a specific, hardened, custom device for producing random numbers. The device produces them and they're sent to the other systems. Transport might be safe physical media, over a network, and so on. If the other nodes use a good CRNG, the RNG appliance might not even need to generate much. Entropy-pool updates over a network can also be automated to happen periodically and via a one-way cable, so the master node isn't compromised. The admin just receives reports from nodes that they received the data, maybe with the hash. The admin also gets a report from the RNG appliance, which should match.

Note: There's now a tool available that does a variation of what mine did.

Network taps themselves can also make for good entropy sources if your network has a mix of devices and workloads. Just periodically sampling packets and hashing them is simple enough. The important thing is that there's a user-entered master secret of maximum entropy and the other entropy is combined with it safely. Let's not forget onboard generators like Intel's or VIA's which, when used as an untrusted added source, provide a nice boost for my design even if they're rigged to be non-random. That's because the strongest link is verified by my own hands and eyes.

Last-minute addition: wireless sniffers. Tapping into the network activity of a bunch of people nearby would provide a source of random data. There are legal and ethical issues with that. However, one can sniff WiFi traffic with ease. Moreover, sampling encrypted traffic, as is common with personal WiFi, would preserve those people's privacy. Further, if the content itself was mixed in, the fact that it's already randomized by encryption should add nicely to the entropy. It's not *truly* random, but it will at least have a random distribution.

What do you all think of using WiFi sniffers in a neighborhood with plenty of encrypted APs as an input source? And, btw, this is in the context of my scheme, where at least one highly verified method is used in the process and the other entropy is there to make the attacker's job computationally harder. And where we are using software-only solutions.

Nick P • March 11, 2014 9:02 PM

@ Chris Abbott

He's not being mean; he's simply telling the truth. You're better off using the existing, well-analyzed ciphers for any practical project. There are existing CRNG constructions that have survived years of analysis. I just use them and feed extra stuff into the seeding process to make the attacker's job harder. Finding clever ways to add extra complexity for an attacker, while using little extra CPU/memory and without breaking the security proofs of your crypto functions, will give you all the problem-solving fun you want. :)

Anura • March 12, 2014 1:20 AM

All that said, I find making block ciphers to be fun, personally. Make all the ciphers you want, just don't expect anyone to take them seriously.

I would never use any that I actually wrote myself. I have always said I would use my ciphers only so I can learn how to break them... I've been halfway through a number theory book for about 2 years, so you can see how that is going (I'm a community college dropout, so it's really a pain to get that deep into the theory; I'd rather program).

I am actually proud of how far I've evolved with my latest cipher (although key setup is really slow):
https://github.com/AnuraBufo/bufo

Every cipher I make, I end up learning where it can be improved, learning from my mistakes, etc. I'm sure a year from now, I'll look at that and say "Hey, I know what I could have done to make it better!" or (hopefully not) "Wow, that was a huge, glaring mistake!"

Anura • March 12, 2014 1:43 AM

Off-topic background information:

Originally, the core transformation used in that cipher (logically consisting of eight bijective 8x8-bit s-boxes and eight injective 8x56-bit s-boxes combined with addition modulo 2^56 and 8 rotations, but implemented as eight 8x64-bit s-boxes using addition modulo 2^64 and still 8 rotations) was designed for a somewhat memory-bound hash function called Rana. That in turn was inspired by my last "genius" cipher idea, which used an 8x8 MDS matrix and ended up being significantly slower than expected; I found out the poor performance was due entirely to memory access.

The idea of a memory-bound hash function was that it could be used alongside a key-stretching function to make it more expensive. I eventually scrapped the whole idea of relying on a memory-bound hash function when designing my KDF (which I do actually hope to get published... if I ever finish the paper). I later decided to make a cipher from the transformation for fun, which became Bufo and has been gathering cobwebs for the past few months.

Anura • March 12, 2014 2:20 AM

I remember my first cipher: it used XOR, rotation, and bitwise-AND operations only... It might have been strong (highly unlikely)... but it had a throughput of about 10 Kbps.

Winter • March 12, 2014 3:52 AM

Hardware entropy generator?

A cheap microphone directed at the fan of your computer will give you quite a number of bits of entropy.

Whitening is needed, as always.
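The classic textbook illustration of Winter's whitening step is von Neumann debiasing: read the raw bits in pairs, output 0 for a 01 pair, 1 for a 10 pair, and drop 00/11. It removes bias exactly when the raw bits are independent (a big assumption for real hardware) and throws most of the stream away:

```python
import random

def von_neumann(bits):
    """Debias independent-but-biased bits: 01 -> 0, 10 -> 1, else drop."""
    return [a for a, b in zip(bits[::2], bits[1::2]) if a != b]

rng = random.Random(1)
raw = [1 if rng.random() < 0.9 else 0 for _ in range(20000)]  # ~90% ones
white = von_neumann(raw)

raw_bias = sum(raw) / len(raw)        # near 0.9
white_bias = sum(white) / len(white)  # near 0.5, at ~9% of the raw rate
```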

Clive Robinson • March 12, 2014 7:10 AM

@ Matt,

why the heck isn't anyone talking about making random number generators from quantum noise? random.org used fm radio static into a sound card for years

The quick, simple answer: they are more trouble than they are worth for ordinary usage.

The longer answer has to do with "signal to noise" in what is, at the lowest level, an analog signal. You have three parts to consider:

1, The physical source.
2, The amplification chain.
3, The sampling and digitisation system.

None of these parts is a closed system, so all are subject to various sources of external interference; likewise they are not perfect, so they distort the signal in amplitude and time and suffer from limited bandwidth.

Now consider the actual signal coming from the source; it consists of three parts:

1, Predictable signal (offset and bias).
2, Faux entropy (deterministic noise).
3, Real entropy (non-deterministic noise).

You are only interested in the real entropy and in general that is only a tiny fraction of the signal from the source.

Whilst the offset and bias can be predicted in both amplitude and time, the instantaneous values cannot; thus any correction system will result in an error signal whose magnitude may be comparable with the real entropy signal you are actually looking for.

The faux entropy signal is the real problem: whilst it is deterministic, it may be unpredictable to you but not to an adversary attacking your system by injecting a signal known to them. Further, you may not test your source sufficiently. For instance, you might take a signal from a fan; you would expect it to have a rotational component based on the fan RPM, but consider that it might be a DC fan that is sensitive to changes in the supply voltage. This means your rotational signal is going to be frequency-modulated by some function of the supply voltage. Likewise, it is also going to be affected by the load on the fan blades. An attacker may be able to "add" a signal that you cannot detect, which would either swamp or sink the real entropy.

The problem you have is not in combating this (you can generate an antiphase or subtractive signal) but in identifying it. You have only finite resources and limited time, and thus can only address a tiny fraction of the problem. So unless you take some quite extreme precautions, you could be attacked and not know it.

For instance, you mention the use of an FM radio. Firstly, are you aware of the effects of the radio's "de-emphasis" circuit, its IF bandwidth, and any AFC circuit, how it responds to noise that might look like a valid stereo pilot tone, and how to deal with them correctly? How about breakthrough in the audio circuits from the stereo decoder circuit? Mains-hum breakthrough in RF and audio, and what effect it has on the radio's local oscillator?

Then how do you know that mains-borne interference is not getting into the radio front end, power supply unit, or any other part of the radio? How about well-out-of-band interference from, say, a GSM mobile phone, a two-way radio, or a local ham operator? How would you go about detecting it? How about adjacent-channel interference that comes and goes due to changing propagation effects?

All these have to be taken into account, and first detected before they can be corrected...

Then an attacker can do all sorts of tricks over and above those, which you would find difficult to detect even if you were aware of them. One such is to generate an artificial noise signal and raise it to slightly above the background noise level but below any detector threshold you have (2-3 dB up is usually sufficient). They could also, at the same time, direct an out-of-band signal of large amplitude at the radio that would "quieten" other noise sources. What you would see would be, from your point of view, the same as normal, i.e. noise of a given amplitude; however, the attacker is actually feeding in a signal that is fully deterministic to them but not to you...

If you have difficulty believing this: a few years ago two researchers at the University of Cambridge Computer Labs obtained a "high quality" commercial true random noise generator and hit it with an unmodulated microwave signal. The result was to take its output from 32 bits of entropy down to around 7 bits, which makes guessing attacks trivial. That is the effect of an out-of-band quietening signal. Now imagine what the result would have been if they had modulated it with synthetic noise. To the normal (mostly useless) tests the output would look perfectly valid. However, when correlated against the synthetic noise source, the correlation would be very high...

Such are the problems with TRNGs and mildly sophisticated attackers... I'm aware of other more sophisticated attacks that I developed back in the 1980s, so I'm sure that many other more sophisticated directed attacks are possible and available to knowledgeable and moderately resourced attackers (i.e. less than 10,000 USD of "home-brew" kit).

Anura • March 12, 2014 11:38 AM

Also, it should be said that you don't need special hardware to get truly random data: video, audio, mouse movements, time between key presses, performance statistics (CPU usage, IO, temperature, fan speed), and noise from other devices can be used to provide entropy. Some Intel CPUs have the highly criticized RdRand generator built in; some AMD CPUs also have hardware random number generators.

The thing is that no single source of entropy should be trusted, so the solution is to take as many entropy sources as possible and run them through a cryptographic function, so that even if a large number of sources are compromised, the system itself should remain secure. Ideally, even if quality sources dry up and you only have compromised sources remaining, it still should not be exploitable by an outside attacker who is not aware of the state, provided the RNG was at some point seeded with sufficient entropy (and ideally the RNG would save its state, so that the next time the system boots, the RNG is in a state that is unpredictable by anyone who does not have access to the machine).
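Anura's principle, that mixing many untrusted sources through a cryptographic function stays safe as long as any one source is honest, can be sketched in a few lines. The source names here are illustrative placeholders, and a real seeder would credit entropy much more carefully:

```python
import hashlib, os, time

def mix_sources(sources):
    """Combine possibly-compromised entropy sources; the digest stays
    unpredictable as long as at least one source is honest."""
    h = hashlib.sha256()
    for name, data in sources:
        # Length-prefix every field so source boundaries can't be shifted.
        h.update(len(name).to_bytes(4, "big") + name.encode())
        h.update(len(data).to_bytes(4, "big") + data)
    return h.digest()

seed = mix_sources([
    ("os_urandom", os.urandom(32)),                       # possibly trusted
    ("timer_jitter", str(time.perf_counter_ns()).encode()),
    ("rigged_source", b"\x00" * 32),                      # adversarial: adds nothing
])
```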

Nice to see you mention the paper. To me, that builds additional trust, which, these days, I regard as the most valuable asset.

Thanks!

Mike Amling • March 12, 2014 6:04 PM

I'm only partially through the paper. In section 3.1, where it says "Correspondingly, such RNGs typically include some ad hoc entropy estimation procedure E whose goal is to block the RNG from outputting output value Rj until the state has not accumulated enough entropy gamma* (for some entropy threshold gamma*).", either I just don't understand this or 'until' is supposed to be 'if' (or maybe the 'not' should be omitted).

In Figure 5, in the proc. refresh(seed, S, I), is it intended that the

(Sout, R)<--nextout(seedG, Sout)
Srho<--Srho^R

steps be omitted when SC(skey, tau) returns upside-down-T as the value for out?

Mike Amling • March 12, 2014 6:40 PM

I'm surprised at Figure 8, namely the 2 in

ELSE out<--max{out: tau = 0 mod 2**out*P/wmax}

When I looked at the equivalent part of Fortuna some time ago, my first thought was that the 2 was arbitrary, and that 3 or 4 (or 1.5) might be better. Now Dodis, Shamir, Stephens-Davidowitz and Wichs have agreed with Ferguson and Schneier, and put their imprimatur on 2. Ugh. It would probably take me weeks to work out a proof that 2 either is or isn't better than the alternatives.

Mike Amling • March 13, 2014 1:43 PM

This is what happens when I don't first read the entire paper.

I wrote to an author of the paper about the 2**out in Figure 8, and they were kind enough to write back:

...
In the constant-rate regime, there's a very clear way to judge a scheme: You simply compute what we call the competitive ratio, r, which is just the ratio of the maximum entropy that we need to receive to be secure relative to an idealized scheme that knows everything about the system (e.g., how much entropy it's received, when it's been compromised, etc.). In this regime, our best scheme is about twice as efficient as the original Fortuna with the numbers that Ferguson and Schneier suggest, and it uses a base of 3. See Theorem 5, and in particular note the parameter b and the parametrized competitive ratio r_b. Intuitively, we achieve a competitive ratio proportional to roughly (b+1/b)/\log(b), and it turns out that that's minimized at b=3. (It happens that base 4 is almost as good, which might be convenient for practitioners.)

In the non-constant-rate regime the situation is a bit complicated. In particular, our security definition is slightly subtle, with two parameters, \alpha and \beta. If we receive some amount of entropy proportional to \alpha in time T, then we guarantee security at time \beta*T. Intuitively, once we've received sufficient entropy, we need to take some sufficient additional time to get that entropy into the register, the time that this takes is proportional to the amount of time that it took us to get the entropy in the first place. \alpha defines "sufficient entropy" and \beta defines "sufficient additional time" (multiplicatively).

The scheme from Figure 8 in base b achieves \beta = b^2 and \alpha proportional to 1/\log(b). So, in theory, there's just a tradeoff between \alpha and \beta and no real way to define an optimal base. However, note that in the constant-rate case, the competitive ratio r just corresponds to \alpha * \beta, so a natural thing to do is simply to extend this and minimize \alpha * \beta, which gives b = 2. (We can argue more rigorously that this parameter makes sense.) But, it's worth noting that these parameters necessarily come from a fairly extreme worst-case analysis that, in particular, assumes that our entropy source gives us just enough entropy and then stops providing any more. We could define a model in which we ask our entropy source to be "roughly constant" over "long" periods of time, and depending on the definitions of "roughly constant" and "long", b = 3 might give a better competitive ratio than b = 2.

Given all of this ambiguity and how complex the paper already was, we just fixed b = 2 in the non-constant-rate regime.
...

Just on instinct, I had used 3 in my adaptation of Fortuna to Keccak last year.

Another issue I was going to raise, namely that Figure 4 passes skey to attacker A but not to adversarial entropy supplier E, is addressed in section 7.
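The shape of the tradeoff in the quoted reply is easy to check numerically: among small integer bases, (b + 1/b)/log(b) is smallest at b = 3, with b = 4 a close second, just as the author says:

```python
import math

def competitive_ratio(b):
    """The parametrized ratio r_b from the reply, up to a constant factor."""
    return (b + 1 / b) / math.log(b)

vals = {b: competitive_ratio(b) for b in (2, 3, 4, 5)}
best_base = min(vals, key=vals.get)
```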

Chris Abbott

@Nick, Carl

It's just something I've done for fun and for educational reasons; something people could check out for the hell of it as a curiosity. Of course I would disclose that to people. I don't advocate anybody actually use it for anything important. I always use AES and Twofish for real projects.

Nick P • March 13, 2014 2:19 PM

@ Chris

Sounds good. I'm all for people tinkering around, learning, and having fun. It's the very essence of hacking in the old meaning of the term.

Chris Abbott

Yeah, if anyone wants to see the source I'll put it on my website.

JoachimS • March 14, 2014 4:00 AM

Anura: I just looked at the code on GitHub. A few observations:

(1) HUGE s-boxes. You will probably be vulnerable to timing-based side-channel attacks.

(2) Where do the constants in t.c and f.c come from? You should document that, so anybody can replicate it. Even you. ;-)

Anura • March 14, 2014 4:48 AM

@JoachimS

I have all the code I used to generate them. The s-boxes are eight 8x64-bit s-boxes, which is about 4x the memory you can expect in most software implementations of AES. And yeah, timing attacks are a possibility.

The s-boxes in f.c were found via a pseudorandom search. I basically seeded xorshift with constants I copied from SHA-512 (entirely for convenience), randomly sorted an array containing the numbers 0-255, and ran it for a few weeks until I found s-boxes with suitable cryptographic properties: nonlinearity 98, a max value of 8 in the XOR table, and a generalization of the bit independence/avalanche/completeness criteria in which, for every input differential and every combination of output bits, I check that the parity is 1 with a probability of 0.5 +/- 40/256 (it's extremely difficult to find good numbers for that property randomly; for comparison, the AES s-box has a bias of 16/256).

The s-boxes in t.c are computed from f.c; I'd have to check the code that generates them, but IIRC each byte is basically computed by nesting functions from f.c a number of times depending on the position, with a constant 0x00-0x07 XORed in depending on the position. You can see that the least significant byte of t[m][n] is f[m][n]-n.
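For anyone curious, the "max value of 8 in the XOR table" criterion Anura mentions is the peak of the difference distribution table (DDT). A small sketch that computes it, contrasting a purely affine toy s-box (worst case: every input difference maps to a single output difference) with a seeded random permutation:

```python
import random

def max_ddt(sbox):
    """Largest DDT entry over nonzero input differences: for each a,
    count inputs x whose output difference sbox[x ^ a] ^ sbox[x] is b."""
    n = len(sbox)
    worst = 0
    for a in range(1, n):
        counts = [0] * n
        for x in range(n):
            counts[sbox[x ^ a] ^ sbox[x]] += 1
        worst = max(worst, max(counts))
    return worst

affine = [x ^ 0x63 for x in range(256)]   # XOR-affine: no mixing at all
perm = list(range(256))
random.Random(42).shuffle(perm)           # a random 8-bit permutation

affine_peak = max_ddt(affine)   # 256: each input difference is preserved
random_peak = max_ddt(perm)     # a small even number; a searched design
                                # pushes this down further
```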

As a person who has dabbled with hardware "true" random number generators, I have "independently discovered" plenty of lessons about this topic.

One of the big lessons I learned: very simple whitening methods together with very biased inputs can give output streams that readily pass statistical randomness tests like Dieharder (i.e., "my PRNG/TRNG passes Dieharder" doesn't mean much). Only analysis like we see in this paper can really give you confidence that your outputs are unguessable by an adversary with specific capabilities.

So while the math is way over my head, I say: I hope that my favorite operating systems will take lessons from it as soon as possible, to improve their cryptographic-quality random number generation.
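The point above about simple whitening fooling statistical tests is easy to reproduce. In this hedged toy, a badly biased source is "whitened" by XOR with a public alternating mask; the result passes a monobit count, yet anyone who knows the construction still guesses each bit with about 90% accuracy:

```python
import random

rng = random.Random(7)
raw = [1 if rng.random() < 0.9 else 0 for _ in range(10000)]  # badly biased source

# "Whitening" by XOR with a public alternating mask 0,1,0,1,...
mask = [i & 1 for i in range(len(raw))]
out = [b ^ m for b, m in zip(raw, mask)]

# A monobit test sees a balanced stream...
ones_fraction = sum(out) / len(out)

# ...yet anyone who knows the mask predicts each bit ~90% of the time,
# because the underlying source is still 90% ones.
guess_accuracy = sum((m ^ 1) == o for m, o in zip(mask, out)) / len(out)
```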

Rogelio M. Serrano Jr. • September 5, 2014 6:26 PM

In the paper the authors mention an optimisation. What is it exactly? Is the pool to be filled chosen randomly, and then also emptied randomly, with a random permutation of the pools emptied every 32nd reseed?
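For contrast with the randomized variant Rogelio is asking about, the original Fortuna schedule is deterministic: incoming events go round-robin into 32 pools, and reseed number r empties pool i exactly when 2^i divides r, so pool i contributes to every 2^i-th reseed. A sketch of that schedule (the paper's optimized scheme differs):

```python
def pools_drained(reseed_number, n_pools=32):
    """Original Fortuna: reseed r empties pool i iff 2**i divides r."""
    return [i for i in range(n_pools) if reseed_number % (2 ** i) == 0]

# Pool 0 feeds every reseed; pool 5 only every 32nd, so it accumulates
# enough entropy to defeat an attacker who can predict the small pools.
schedule = {r: pools_drained(r) for r in (1, 2, 3, 4, 8, 32)}
```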