Random Number Generators
NIST has just published “Recommendation for Random Number Generation Using Deterministic Random Bit Generators.”
aikimark • June 14, 2006 8:34 AM
Is anyone else having trouble reading the PDF, or is it just another gremlin in my PC?
gremlins. worked fine for me.
aikimark • June 14, 2006 9:01 AM
got it to open now. thanks.
Devang • June 14, 2006 9:44 AM
I like the idea; they even covered auxiliary routines… how nice.
sidelobe • June 14, 2006 12:29 PM
I’m missing something. This provides guidelines for using, and presumably building and testing, DRBGs. However, it states that the entropy input shall, at the top level, be truly random. It also states that the entropy shall be kept secret. Doesn’t this make the output truly random rather than deterministic?
So, can this be reduced to say: “use real random numbers” ?
RvnPhnx • June 14, 2006 12:54 PM
@sidelobe
No, it doesn’t make the output truly random, as it is the deterministic result of the input, which, while it should be truly random in an ideal world, very often is not.
Truly random data is far more difficult to come by than it may sound, so the point of a random number generator is to take pseudo-random input and obfuscate the non-randomness in the input as much as possible, or, more aptly, as much as is reasonable.
Anonymous • June 14, 2006 1:23 PM
“No, it doesn’t make the output truly random as it is the deterministic result of the input–which while it should be truly random in an ideal world, very often is not.”
If you can generate truly random bits from some source, then manipulating them is not going to suddenly make them pseudorandom. On the other hand, there’s a real danger of decreasing the entropy that you get from your input, or of outputting more bits than you get from your random input, in which case the entropy of your output is at most the entropy of your input. In the latter case, I suppose the output wouldn’t be completely random, but could it be called pseudorandom? Or is there a better term?
If your truly random source can only generate three bits a second, for instance, then coming up with a random 32-bit number is going to take about 11 seconds. Maybe the paper is addressing the best ways to use the truly random source to add salt to a PRNG?
geoff lane • June 14, 2006 1:31 PM
“Anyone who attempts to generate random numbers by deterministic means is, of course, living in a state of sin.” — John von Neumann
RvnPhnx • June 14, 2006 1:54 PM
@geoff lane Thank-you for that!
@Anonymous
“If you can generate truly random bits from some source, then manipulating them is not going to suddenly make them pseudorandom.”
First off, the term “pseudorandom” is used to describe ANY set which appears random when looked at over some small interval, but is not truly random.
In fact, as just about any good mathematician out there would probably be willing to blow your mind proving (or any halfway decent physicist could explain), there aren’t a lot of good truly random sources out there; for something to be truly random in the “colloquial” sense, it would need to be unmeasurable at least some part of the time. Therefore, most things out there termed random are, at best, pseudorandom. I’m sure there is a better explanation out there, but I don’t teach abstract mathematics or physics.
Next, it has already been shown that many deterministic processes, when fed “random” input, still generate predictable results. They may still be outputting pseudorandom data, but that data is going to be bounded by the limits of the function(s) used to process it; in other words, it isn’t truly random.
Now, if you wished to say something like “a random integer bounded by zero and (2^32)-1” and call that “random” for the sake of argument, then we would be in agreement for all intents and purposes (because it would be, by definition, pseudorandom in total, but random enough for discussion).
Christoph Zurnieden • June 14, 2006 3:47 PM
I took a quick look and think it has some quite useful “recipes” in it for implementing some varieties of PRNG with a truly random seed.
It needs some more proofreading nonetheless: the mainly informal language (that includes the pseudocode) used is too imprecise sometimes.
Example: the description “the time needed […] is not a fixed length” in appendices B 5.1.1 and 5.1.2 is a bit misleading: if $r < +\infty$, the program halts after a finite number of steps with probability one.
I can’t base a contract on such weak descriptions, let alone give or demand a warranty, yet this is a recommendation for federal agencies?
CZ
sudo random • June 14, 2006 4:00 PM
@ RvnPhnx
…obfuscate the non-randomness…
I find it easier to think of a PRNG as diffusing or distributing the available randomness (i.e. the truly random input bits) across a larger number of output bits. Then you can evaluate the quality of the PRNG at diffusing the randomness, independent of the quality of the input randomness.
For example, a crappy PRNG would be to OR 1’s onto the input bits, producing output of all 1’s. The diffusion quality of the PRNG is abysmal, so the system as a whole has a completely non-random output, despite having truly random input.
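The diffusion point can be sketched numerically. A quick illustration (my own, in Python; the “OR with 1” generator is the crappy PRNG from the comment, and a seeded `random.Random` merely stands in for the truly random input):

```python
import random
from collections import Counter
from math import log2

def shannon_entropy(bits):
    """Shannon entropy per symbol of a bit sequence (maximum 1.0)."""
    counts = Counter(bits)
    n = len(bits)
    return -sum(c / n * log2(c / n) for c in counts.values())

# Stand-in for truly random input bits (seeded PRNG, for reproducibility).
rng = random.Random(42)
inp = [rng.randrange(2) for _ in range(4096)]

# The "crappy PRNG": OR a 1 onto every input bit, producing all ones.
bad_output = [b | 1 for b in inp]

# The input has close to one bit of entropy per symbol; the output has
# none at all, despite the truly random input.
```

This separates the two quality questions cleanly: `shannon_entropy(inp)` measures the input randomness, while the collapse to zero entropy in `bad_output` is purely the generator’s fault.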
heresone • June 14, 2006 4:32 PM
27 … I have more where that came from.
Anothernonymous • June 14, 2006 6:59 PM
…27…
It’s a well-known fact that odd numbers are more random than even numbers.
Stefan Wagner • June 14, 2006 7:44 PM
@ anothernonymous:
It’s a well-known fact that odd numbers are more random than even numbers.
🙂 Great!
@heresone:
… 0 … – oh – let’s drop this one, it doesn’t look very randomlike…
Anonymous • June 14, 2006 9:36 PM
RvnPhnx:
“First off, the term “pseudorandom” is used to describe ANY set which appears random when looked at over a some small interval, but is not truly random.”
Say you have a PRNG with a 64-bit state, a known starting seed plus a truly random input. (They may be scarce, but they do exist.) Every 1000 or so bits it outputs, it replaces its current state with a value from the random input. So the first 1000 bits will be completely known, and pseudorandom. The second 1000? You can use the first 20 or so bits from the second thousand to infer what the random input was and determine the rest of the second 1000. (A good PRNG would force that determination to involve factoring large numbers or something, and so be infeasible to actually do.) The second 1000 bits still only have as much entropy as the truly random input, i.e. 20 bits. So they are still pretty predictable (in theory), apart from those 20 bits every 1000 or so bits. But those 20 bits, munged by the PRNG as they might be, are still completely unpredictable, still completely random, and not pseudorandom.
Anonymous • June 14, 2006 9:36 PM
I’m sorry, replace “20 bits” with “64 bits”.
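The scheme being described can be written down as a toy. This sketch is entirely my own construction (the SHA-256 state update and all names are illustrative, not from the NIST document): a deterministic generator with a 64-bit state that is replaced by fresh truly random input every 1000 output bits, so each 1000-bit stretch carries at most 64 bits of real entropy (per the author’s correction).

```python
import hashlib
import secrets

class ReseededPRNG:
    """Toy generator: 64-bit state, hash-based state update,
    state replaced by truly random input every 1000 output bits."""

    RESEED_INTERVAL = 1000

    def __init__(self, seed, entropy_source):
        self.state = seed & (2**64 - 1)
        self.entropy_source = entropy_source  # callable returning a 64-bit int
        self.count = 0

    def next_bit(self):
        if self.count and self.count % self.RESEED_INTERVAL == 0:
            # At most 64 bits of fresh entropy per 1000 output bits.
            self.state = self.entropy_source() & (2**64 - 1)
        digest = hashlib.sha256(self.state.to_bytes(8, "big")).digest()
        self.state = int.from_bytes(digest[:8], "big")
        self.count += 1
        return digest[-1] & 1

# Two generators with the same known seed agree on the first 1000 bits
# (fully determined), then diverge once the truly random reseeds kick in.
g1 = ReseededPRNG(0, lambda: secrets.randbits(64))
g2 = ReseededPRNG(0, lambda: secrets.randbits(64))
first_g1 = [g1.next_bit() for _ in range(1000)]
first_g2 = [g2.next_bit() for _ in range(1000)]
```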
Filias Cupio • June 14, 2006 10:36 PM
A (non-security/crypto) tale of PRNGs:
When I was studying astronomy, a curious result was published: a very narrow (small area of sky), deep (includes very dim galaxies) survey of galaxy red-shifts had been done. (Red-shift corresponds to velocity, which due to expansion of the universe corresponds closely to distance.) The red shifts showed significant periodicity. (I.e. at regular intervals in red shift, there were more or fewer galaxies found.)
One of my professors had been doing large computer simulations of large scale structure in the universe. He said “I know what causes this. I’ve seen it in my simulations. God used a bad random number generator.”
geoff lane • June 15, 2006 12:45 AM
I once used a computer that contained a true random number generator (derived from the quantum noise in a special diode.)
I wonder why the same or similar technique can’t be used in modern CPUs?
Hardware random number generators are as hard to do right as digital random number generators, and they tend to output bits only at a low data rate.
Paul Renault • June 15, 2006 6:24 AM
“I wonder why the same or similar technique can’t be used in modern CPUs?”
Darn, you beat me to it: There IS a hardware RNG built in to the P4 (and others, I’m sure). If memory serves, it uses the difference in voltage between two heat-sensitive resistors to generate random numbers (available as 32-bit, 64-bit, 96-bit, or 128-bit).
I had come across an NSA document a few years ago that stated that the built-in P4 RNG could be trusted as a true RNG. I can’t find it now. Dang!
Jungsonn • June 15, 2006 8:16 AM
Isn’t it a fact that randomness just doesn’t exist? I have trouble understanding “true” randomness, because if you know all the variables, everything can be predicted, I guess? Like if you throw a die, you can calculate all the variables in the field that influence it: heat, drag, weight, etc. Or is this not true?
Does someone have more information about this?
serrano • June 15, 2006 9:30 AM
Definitions from the link…
[ Random Number ] For the purposes of this Recommendation, a value in a set that has an equal probability of being selected from the total population of possibilities and, hence, is unpredictable. A random number is an instance of an unbiased random variable, that is, the output produced by a uniformly distributed random process.
[ Pseudorandom ] A process (or data produced by a process) is said to be pseudorandom when the outcome is deterministic, yet also effectively random as long as the internal action of the process is hidden from observation. For cryptographic purposes, “effectively” means “within the limits of the intended cryptographic strength.”
These definitions can correspond to different perspectives of the same object.
Pseudo is an informed (internal) view. Random is an uninformed (external) view. “Randomness just doesn’t exist” supposes that there is an informed view for every process.
After the deal, the contents of your poker hand are random as far as I am concerned. But they are determined for you.
Deterministic random numbers are the past:
see
Fleagle • June 15, 2006 9:38 AM
@Jungsonn:
At a low enough (quantum) scale, there definitely is true randomness, and not of the “hidden variable” type, but guaranteed to be unpredictable. Actually gathering bits from this is difficult, but possible. The typical technique is to use thermal noise from a resistor: you supply an absolutely constant voltage, then measure small variations in the resistor’s heat output. Because the variations in heat output are very small, and the sensors have a minimum threshold for the temperature changes they can detect, this tends to be a slow way to generate entropy bits, but it does generate real randomness.
In the physics community, random numbers coming from thermal noise are not seen as true random numbers as the numbers do not come from a quantum process.
Some practical examples of true random numbers from quantum theory are:
the reflection or transmission of a photon at a semi-transparent mirror (see the id Quantique website)
the timing of successive pairs of radioactive decays (see http://www.fourmilab.ch/hotbits)
Wim L • June 15, 2006 11:46 AM
@jd: isn’t shot noise (one of several sources of electrical noise) considered truly random, though with some temporal correlations depending on the device?
Thermal noise is easier to use (I built a thermal noise RNG once, just for the heck of it: about 600 kb/s output, containing > 40 kb/s of entropy by my estimate; very simple to build) but I see the argument that it’s somewhat deterministic.
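One classic trick for the bias problem (though not for the temporal correlations mentioned above) is von Neumann’s extractor: read the raw bits in non-overlapping pairs, emit 0 for a (0,1) pair and 1 for a (1,0) pair, and discard (0,0) and (1,1). A small sketch of it (my own, in Python, with a simulated biased source standing in for real hardware noise):

```python
import random

def von_neumann_extract(bits):
    """Von Neumann debiasing: for each non-overlapping pair of input bits,
    emit the first bit if the pair is discordant (01 -> 0, 10 -> 1) and
    discard concordant pairs (00, 11). For independent bits with any fixed
    bias, the output is exactly unbiased, at the cost of low yield."""
    return [a for a, b in zip(bits[0::2], bits[1::2]) if a != b]

# Simulated heavily biased source: about 90% ones.
r = random.Random(1)
biased = [1 if r.random() < 0.9 else 0 for _ in range(100_000)]
debiased = von_neumann_extract(biased)

# The raw stream averages ~0.9; the debiased stream averages ~0.5,
# but only about 9% of the input bits survive.
```

The low yield is the price: with bias p, only 2p(1-p) of the pairs produce an output bit, which is why hardware generators that debias this way are slow.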
Christoph Zurnieden • June 15, 2006 12:51 PM
@RvnPhnx
If you prefer a more practical, hands-on approach:
Take two decks of cards (preferably with different backs) and throw one of them on the floor several times so that the card faces are randomly up or down (you may call face = 1, back = 0); the other deck is left untouched.
The mixed deck “M” is the seed; the untouched deck “U” is the number generator with the very simple algorithm $x=x$; it does nothing, so to speak.
The seed algorithm is only slightly more complicated: mix both decks by alternately taking one card from deck M and one card from deck U, starting with M, to form deck “R”. It’s easy to see that every second card must show its back. It is not so easy to see that by looking only at a small part “S” of R:
M=BBFBBFFBFBBBBFFFBBBBBBFFBBBBFFBBFBFBFFBBFFFFFBFBBFFF[1]
U=BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
R=BBBBFBBBBBFBFBBBFBBBBBBBBBFBFBFBBBBBBBBBBBBBFBFBBBBBBBBBFBFBBBBBFBBBFBBBFBFBBBBBFBFBFBFBFBBBFBBBBBFBF
S=BBFBF
R shows its backs significantly more often than its faces, and you can easily predict every second card. The goal of a PRNG is to reduce this significance and this ease of predictability for large amounts of output.
So the deck U needs some mixing up: no random mixing, but an algorithm. I’ll use the digits of pi: turn the first card of U face up and put it at the bottom of the stack, leave the orientation of the next four cards untouched but put them on the bottom too, turn the next card around and put it at the bottom, leave one untouched but put it at the bottom, turn the next card around and put it at the bottom, leave the next five cards … and so on; I guess you get the rule. Take at least as many steps/digits of pi as there are cards in the deck. Mix it with the random deck as described above and you have a much more random-looking deck R.
I used the full seed every time so far, but “entropy is precious”: my true random number generator has an average output of a mere 30 kbit/sec, and it’s probably quite arduous to pick a full deck of cards up from the floor every time you need a few dozen bits of true randomness[2]. So we take only the first 8 cards of the true random deck, “BBFBBFFB”, read them as a number base 2 (face = 0, back = 1), “11011001”, which is decimal 217[3], take that number as an offset into pi, and start at digit #217 with the mixing of U as described above, but without mixing in the deck M. When you are through all the cards in deck U, the entropy of those eight cards of deck M is “consumed”, and you take the next eight cards from M with a fresh deck $U_{n+1}$, until you can’t get the needed 8 anymore.
You can change all of the above variables, e.g. use more or less than 8 cards from M for the seed, use longer or shorter intervals for using a new seed than the number of cards in U, use a more sophisticated algorithm than the use of the digits of pi and so on.
Finding the right values for these variables is quite difficult and the NIST-paper is exactly about that: recommending values for all of these variables to get a cryptographically secure stream of pseudo random numbers and describing all of the pros and cons because there are no single optimal values.
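For anyone who would rather read the card trick as code, here is a compact paraphrase (my own sketch in Python, under one plausible reading of the pi-digit rule; the 52-card deck, the back = 1 / face = 0 encoding, and the 8-bit offset of 217 follow the description above, but the function names are made up):

```python
# 50 digits of pi, used as the deterministic "mixing" rule.
PI_DIGITS = "31415926535897932384626433832795028841971693993751"

def stir(deck, digits):
    """One reading of the rule. For each digit d: flip the top card and
    move it to the bottom, then move the next d-1 cards to the bottom
    without flipping them (back = 1, face = 0)."""
    deck = list(deck)
    for ch in digits:
        d = int(ch)
        deck = deck[1:] + [1 - deck[0]]   # flip top card, send to bottom
        k = max(d - 1, 0)
        deck = deck[k:] + deck[:k]        # rotate d-1 cards unchanged
    return deck

# Seed: the first 8 cards of the truly random deck M ("BBFBBFFB"),
# read base 2 with B = 1, F = 0, giving the offset into pi.
seed_bits = "11011001"
offset = int(seed_bits, 2)               # 217, as in the comment

U = [1] * 52                             # untouched deck: all backs
# Start "at digit #217"; with only 50 digits on hand, wrap around.
R = stir(U, PI_DIGITS[offset % len(PI_DIGITS):] + PI_DIGITS)
```

The resulting deck R mixes faces into the all-backs deck U in a pattern fixed entirely by the 8 seed bits, which is exactly the comment’s point: a small truly random seed, stretched deterministically.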
Another small experiment:
Choose two different digital english texts of the same size[4], measure the entropy[5] individually, add both bitwise modulo 2 (aka XOR[6]) and measure the entropy again. Do the same with the output of some PRNGs[7] and with some true random[8] too. You might be surprised by the outcome.
CZ
[1] dd if=/dev/random bs=512 count=100 | tr -d '\0' | sed -e 's/[^FB]*//g' | tr -d '\n' | head -c52
/dev/random is true random (radioactive decay of americium) on my machine.
[2] but it might be a good method for fighting the obesity epidemic in the west.
[3] perl -we 'sub bin2dec{unpack("N", pack("B32", substr("0" x 32 . shift, -32)));}$d=bin2dec("11011001");print "$d\n"'
[4] http://gutenberg.org might be helpful
[5] there exist a lot of different tools; see for example http://www.fourmilab.ch/random/ for a copy of 'ent'
[6] use perl (e.g. perl -we 'open(A,"file1")and open(B,"file2")or die("open");while(1){read(A,$one,1) and read(B,$two,1) or die("read"); $three=$one^$two;print $three}' >xored), use "Monolith", introduced a couple of weeks ago at this place, search Google, or ask me (I use the host "gmx.de" and my username there is 'czurnieden') for some lines of standard C.
[7] ask me (address: see above) for a couple of bad ones in standard C, or search Google
[8] see http://random.org/files for a source of some truly random bytes
MyCat • June 15, 2006 1:04 PM
SGI had the lavarand RNG which consisted of a set of lava lamps and a digital camera. Grabbing a picture and hashing it gave some random bits. Then one day someone left the lens cap on, and the hashed thermal noise was just as random…
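The lavarand idea in miniature (my own sketch; `os.urandom` merely stands in for the grabbed camera frame): hash the noisy frame and use the digest as the random bits. The hash acts as an extractor, condensing whatever entropy the frame contains, which is why hashed sensor noise was still fine with the lens cap on.

```python
import hashlib
import os

def extract_bits(noisy_frame: bytes) -> bytes:
    """Condense a noisy sensor frame into 256 random-looking bits."""
    return hashlib.sha256(noisy_frame).digest()

# Stand-in for one grabbed camera frame (64x64 single-channel "pixels").
frame = os.urandom(64 * 64)
pool = extract_bits(frame)   # 32 bytes of output per frame
```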
Jungsonn • June 15, 2006 1:57 PM
OK, I’ll take a shot.
I’d really like to know if this method I thought of produces possibly random bits:
I place a microphone outside my window that catches all sounds from outside. The mic is adjusted to pick up a broad range and sound coming from far away, and I run the waves as a gate through a simple transistor-based flip-flop circuit which can produce 16 bits.
Would this produce random bits?
Aleks • June 15, 2006 10:58 PM
@ Jungsonn
You can make predictions about the sounds outside.
At night it’s quiet; at rush hour it’s louder. Perhaps Sunday morning carries a different noise profile from church bells, or whatever. So in that sense, this signal, over a large enough timeframe, has features.
Remember, you should not be able to pick out ANY statistical features from your “randomness”.
tomz • June 16, 2006 2:16 AM
Hi everyone,
Do you have any tips on tests (mathematical, statistical) to determine whether you have a good pseudorandom generator or a bad one?
Thanks
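Two of the simplest such tests come from NIST SP 800-22: the frequency (monobit) test and the runs test. A sketch of both (my own Python transcription of the published formulas; by convention a p-value below 0.01 counts as a failure):

```python
import random
from math import erfc, sqrt

def monobit_test(bits):
    """SP 800-22 frequency test: p-value for "ones and zeros equally likely"."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)
    return erfc(abs(s) / sqrt(2 * n))

def runs_test(bits):
    """SP 800-22 runs test: is the number of runs (maximal blocks of equal
    bits) consistent with randomness? Requires the frequency prerequisite."""
    n = len(bits)
    pi = sum(bits) / n
    if abs(pi - 0.5) >= 2 / sqrt(n):
        return 0.0                       # fails the frequency prerequisite
    runs = 1 + sum(bits[i] != bits[i + 1] for i in range(n - 1))
    return erfc(abs(runs - 2 * n * pi * (1 - pi)) /
                (2 * sqrt(2 * n) * pi * (1 - pi)))

r = random.Random(0)
good = [r.randrange(2) for _ in range(10_000)]
bad = [i % 2 for i in range(10_000)]     # perfectly alternating bits

# "bad" sails through the monobit test (exactly half ones) but fails
# the runs test spectacularly; a decent generator should pass both.
```

This pair also illustrates why one test is never enough: each test only rules out one specific kind of structure.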
Dror • June 16, 2006 8:57 AM
Hi,
If any of you are interested in the true definition of pseudorandomness, there is a great article by Prof. Oded Goldreich from Weizmann Institute (author of “Foundations of Cryptography”):
http://www.wisdom.weizmann.ac.il/~oded/PS/ln00a.ps
Other papers are available too under http://www.wisdom.weizmann.ac.il/~oded/pp_pseudo.html
JRP • June 16, 2006 3:36 PM
@MyCat
Then one day someone left the lens cap on, and the hashed thermal noise was just as random…
I wonder if that version would be covered by their patent:
http://www.freepatentsonline.com/5732138.html
If ANY digitized chaotic source falls under the patent’s claims, then digitizing random diode noise, thermal resistor noise, the velocity of a mouse, or the traffic sounds outside one’s window would seem to fall under the patent. That seems insane, but IANAPL.
@jd:
No, pseudorandom numbers are extremely useful even if truly random numbers are available. The approximation quality of quasirandom numbers for numerical integration can be shown to be better than that of truly random samples. (Quasirandom has a slightly different meaning than pseudorandom, implying some guarantees on the set of output numbers, but the concepts and even the algorithms are often the same.)
Thus, Monte-Carlo simulations often explicitly use quasi-random number sequences instead of truly random numbers, even if those might be available. This complicates matters for the implementer, but that time is well spent.
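A quick way to see the effect (my own sketch; the base-2 van der Corput sequence is one of the simplest low-discrepancy sequences): estimate the integral of x^2 over [0, 1], whose true value is 1/3, with the same number of quasirandom and pseudorandom points.

```python
import random

def van_der_corput(n, base=2):
    """First n points of the van der Corput low-discrepancy sequence:
    the integer i with its base-b digits mirrored about the radix point."""
    points = []
    for i in range(1, n + 1):
        q, denom, x = i, 1, 0.0
        while q:
            q, r = divmod(q, base)
            denom *= base
            x += r / denom
        points.append(x)
    return points

def estimate(points):
    """Monte-Carlo estimate of the integral of x^2 over [0, 1] (true: 1/3)."""
    return sum(x * x for x in points) / len(points)

n = 10_000
quasi = estimate(van_der_corput(n))
rng = random.Random(123)
pseudo = estimate([rng.random() for _ in range(n)])
# The quasirandom error shrinks like O(log n / n), versus the O(1/sqrt(n))
# typical of independent pseudorandom samples.
```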