Schneier on Security
A blog covering security and security technology.
« Maps Showing Spread of ZeroAccess Botnet |
| Friday Squid Blogging: Giant Squid Engraving from the 1870s »
October 5, 2012
When Will We See Collisions for SHA-1?
On a NIST-sponsored hash function mailing list, Jesse Walker (from Intel; also a member of the Skein team) did some back-of-the-envelope calculations to estimate how long it will be before we see a practical collision attack against SHA-1. I'm reprinting his analysis here, so it reaches a broader audience.
According to E-BASH, the cost of one block of a SHA-1 operation on already deployed commodity microprocessors is about 214 cycles. If Stevens' attack of 260 SHA-1 operations serves as the baseline, then finding a collision costs about 214 * 260 ~ 274 cycles.
A core today provides about 231 cycles/sec; the state of the art is 8 = 23 cores per processor for a total of 23 * 231 = 234 cycles/sec. A server typically has 4 processors, increasing the total to 22 * 234 = 236 cycles/sec. Since there are about 225 sec/year, this means one server delivers about 225 * 236 = 261 cycles per year, which we can call a "server year."
There is ample evidence that Moore's law will continue through the mid 2020s. Hence the number of doublings in processor power we can expect between now and 2021 is:
3/1.5 = 2 times by 2015 (3 = 2015 - 2012)
6/1.5 = 4 times by 2018 (6 = 2018 - 2012)
9/1.5 = 6 times by 2021 (9 = 2021 - 2012)
So a commodity server year should be about:
261 cycles/year in 2012
22 * 261 = 263 cycles/year by 2015
24 * 261 = 265 cycles/year by 2018
26 * 261 = 267 cycles/year by 2021
Therefore, on commodity hardware, Stevens' attack should cost approximately:
274 / 261 = 213 server years in 2012
274 / 263 = 211 server years by 2015
274 / 265 = 29 server years by 2018
274 / 267 = 27 server years by 2021
Today Amazon rents compute time on commodity servers for about $0.04 / hour ~ $350 /year. Assume compute rental fees remain fixed while server capacity keeps pace with Moore's law. Then, since log2(350) ~ 8.4 the cost of the attack will be approximately:
213 * 28.4 = 221.4 ~ $2.77M in 2012
211 * 28.4 = 219.4 ~ $700K by 2015
29 * 28.4 = 217.4 ~ $173K by 2018
27 * 28.4 = 215.4 ~ $43K by 2021
A collision attack is therefore well within the range of what an organized crime syndicate can practically budget by 2018, and a university research project by 2021.
Since this argument only takes into account commodity hardware and not instruction set improvements (e.g., ARM 8 specifies a SHA-1 instruction), other commodity computing devices with even greater processing power (e.g., GPUs), and custom hardware, the need to transition from SHA-1 for collision resistance functions is probably more urgent than this back-of-the-envelope analysis suggests.
Any increase in the number of cores per CPU, or the number of CPUs per server, also affects these calculations. Also, any improvements in cryptanalysis will further reduce the complexity of this attack.
The point is that we in the community need to start the migration away from SHA-1 and to SHA-2/SHA-3 now.
Posted on October 5, 2012 at 1:24 PM
• 48 Comments
To receive these entries once a month by e-mail, sign up for the Crypto-Gram Newsletter.
To find the collision, don't you have to perform 2^60 20 byte compares? How much does the Amazon storage to hold all that data cost, and doesn't the disk bandwidth affect the performance?
The analysis seems reasonable aside from the Amazon tie in. The commodity server that Amazon rents for that price is an order of magnitude different than the commodity server specified, likely offsetting the cost by an order of magnitude.
But for one final twist, a criminal syndicate would simply use a botnet with vastly different economics than a commercial compute service.
"Assume compute rental fees remain fixed while server capacity keeps pace with Moore's law."
Anybody who believes this assumption has zero experience with Amazon pricing model.
Amazon prices fall very slowly, much slower than Moore's law, and they increase very steeply for better hardware.
More realistic attack would be a few grad students and a lot of GPUs.
Am I correctly understand that 2^14 is per 64 bytes block? I believe this might be over-pessimistic.
According to OpenSSL 1.0.1 source codes, SHA-1 (AVX) is 4.6 cycles/byte on the latest publicly available Ivy Bridge CPU, or ~2^8 clocks per block. So presented estimation might be way worse.
But the question is - does it make sense to switch to SHA-2+ and spend 2-3x more time and power consumption on hashing for things like gmail, skype, online gaming and so on. I believe such information usually doesn't worth $50K, and if it does people usually understand it and use more secure channels.
That must have been a huge napkin!
log2($64k/$1k) = 9 years
Will the thing protected be worth $1k a decade from now?
The main issue is future cartographic advances. I can assume my data is worth say $20k and conclude SHA-1 is good enough for the next few years. But when will the next cryptanalytic improvement be? A 1000 fold improvement could pop up tomorrow morning (or in a couple of years)
but it is very likely the best attack won't remain the same. and that is what is difficult to predict. SHA-1 gives us no safety margin.
If you want to see what a system to efficiently compute SHA(256), you should look to bitcoin farmers. From what I understand, they used AMD5830s (now 2 generations out of date) if at all available.
According to this site:https://en.bitcoin.it/wiki/Mining_hardware_comparison
~$300 AMD 7950 boards pull 500 hashes/s.
Top of the line Intel CPUs pull no more than 66 hashes/s (the odd behavior due to compiler switches does not inspire confidence either. Don't count on more than 40 hashs/s even with 6 cores active).
Nvidia does not appear to have bothered adding the missing rotate(?) instruction that rumor claims is killing their performance. Expect significantly better than a CPU, but less than 1/4 that of an equally priced AMD board.
I can't argue that SHA256 performance will match SHA-1, but it would be a good place to start. I also can't believe that there aren't real bitcoin fans following this site.
Yup @wumpus. Crackers get huge speedups that everyday users don't, like computing several hashes in parallel on a GPU or special hardware.
Then there's the potential for new cryptanalysis. And it'll take years to migrate a chunk of the world's computers, embedded systems(!), and hardware(!) to a new algorithm.
If we don't at least start making easy changes now, SHA1 will just follow MD5's trajectory many years delayed.
(I want to see randomized hashing used wherever possible, too, so we can all worry about collisions much less, but that's another story.)
Another bitcoin fan here...
SHA1 is 160bits, and is simpler than SHA2 (SHA256).
Since it uses a subset of the operations of SHA2, it should be able to run on AMD GPUs even faster than SHA256.
Hmm... SHA1 is 80 rounds, vs 64 in SHA2.
Not sure which would be faster, considering less bits and fewer instructions used in SHA1, but I'd guess SHA1 would still be faster.
ELI5: What does finding a collision get you? How do you make hundreds of thousands of dollars off finding one?
Let's say you want to perpetrate some financial fraud via a MITM attack (or just a plain old fashioned con). You might need some fake document that passes a hashing test for authenticity.
If doing high stakes fraud, you are probably looking at a $3-$5 million minimum target.
@BJ SHA-1 considerably faster than SHA-256 (at least on Intel CPUs). 4.6 cycles/byte for SHA-1 vs. 10.3 for SHA-256.
The Bitcoin comparison suggests the numbers in the original post could dramatically underestimate modern processing power and that a SHA-1 collision could definitely be within reach.
Graphics cards were the state-of-the-art in Bitcoin mining a while ago, but people are moving towards FPGA and even ASIC solutions now. And while they aren't super widely deployed yet (particularly ASIC chips), they do provide an impressive benchmark.
A particular standalone FPGA based miner costs $600 and can do 800MHash/s of Bitcoin's double SHA-256 operation. Ignoring hash speed differences, that's close to 2^30 hash operations per second or 2^55 per year. 32 of those, or about $19k worth, could produce a SHA-1 collision in about a year (assuming 2^60 as a ballpark for collisions).
At the more bleeding edge of the scale, a soon to be available ASIC setup can allegedly perform 30GHash/s for about $650. That's about 2^35 hash operations per second or the magical 2^60 per year. With the same $19k worth of those, a SHA-1 collision could be found in less than 2 weeks. A larger organization with something like $5 million to play with (so about 8,000 of those ASIC devices) could produce collisions in about 1 hour.
It's also worth noting that everyone in the Bitcoin network is currently doing about 2^44 hash operations per second total. The network could produce a SHA-1 collision at that rate in about 18 hours.
Obviously those numbers should be taken with a big grain of salt. The collision producing procedure may not be quite a easy to optimize as Bitcoin hashing speed. And in the case of the ASICs, it remains to be seen if they hit their advertised speeds for that price point. And the ASICs and FPGAs are specifically designed for Bitcoin hashing, producing SHA-1 collisions with them directly would not be possible...a different piece of hardware would be needed.
But clearly a SHA-1 collision is within the realm of computing possibility, possibly relatively easily by a reasonably well financed attacker. The fact that anyone can do 2^60 of ANY hash operation in a year with a device that costs $650 definitely suggests SHA-1's days are seriously numbered.
... and doesn't the disk bandwidth affect the performance?
Yes and quite a lot more than people realise, but the usual bandwidth figures quoted in manufactures data sheets are very misleading.
This is becausee there are various steps to retrieving data off of a hard disk and these all add delays which have their own very non linear side effects.
Also just as importantly is the knowledge of how the overall system has been optimized for performance in accessing the hard drives because usually it's optomised for sequential reads of data not random access to a couple of bytes of data.
Only when you know this information can you work out an effective method of storing the data either for a single user access or for multi simultaneous user access (and the two are vastly different).
Oh and of course the question people rarely ask when talking about large data sets for crypto cracking which is "How long will it take to populate the full data set to start with?"
There are a couple of types of comments posted that require a response
The first are the ones that talk about the pricing model. My response is they are fair, and if someone wants to propose a different model then I have to say "Well done," and the community needs to be grateful for any improvements. There are many ways my model could be wrong, so fix mistakes where you find them. What does not earn my gratitude would be an interminable discussion about whether or not to migrate without any reference to any model or any data.
The second kind of comment requiring comment is grumbling about the performance of SHA-2 or SHA-3. It is unfortunate that the SHA family is getting slower with each generation, but if we have learned anything from the competition it is that building a good hash -- one that actually meets its functional requirements -- is a lot harder than building a good block cipher. Continuing to use SHA-1 for the collision resistance property that it is known not to possess will eventually cause problems for implementers and for their customers -- stop doing it and begin a transition as soon as possible.
You really do need to factor in the GP/GPU element into this.
Most if not all password cracking and similar tasks are currently done with GP/GPU based setup's which are far more capable that CPU based systems.
One example is here:
I remember many years ago, there was this US company named Starbridge Systems (or something similar, or maybe Star Bridge Systems if its the same company) that had this peculiar mission statement on their webpage, something about the desire to present affordable supercomputers that had size and the power requirement of a desktop.
Some time later, that mission statement was nowhere to be seen. Apparently, things didn't work out that well. Maybe things have changed since then, or maybe not.
0.04/hour price is for 1 core, not 32.
I agree, the facts are definitely the facts when it comes to hash security...complaining about the decrease in speed for secure hash algorithms gets us nowhere. If we need collision resistance, complaining about how fast MD5 and SHA1 are isn't very helpful.
I think a big part of the answer is to broaden the selection of widely supported cryptographic algorithms so people don't have to use primitives that are slow because they provide properties the user doesn't need. Collision resistance in a keyed MAC function, for example. Saving the slower (but secure) SHA-2/SHA-3 for when necessary could help quiet speed complaints.
Bruce suggested in another post that NIST do a fast stream cipher competition next. I'd love to see a fast MAC function competition alongside it.
I don't understand why we can't have multiple machines executing and finding the collision? We certainly can build a rainbow table for SHA-1 and expect it to be complete in a year with enough computers and hard disk, right?
"If you want to see what a system to efficiently compute SHA(256), you should look to bitcoin farmers. From what I understand, they used AMD5830s (now 2 generations out of date) if at all available.
According to this site:
"~$300 AMD 7950 boards pull 500 hashes/s.
"Top of the line Intel CPUs pull no more than 66 hashes/s ... "
According to your link your numbers are off by a factor of 1,000,000.
Where you say 500 hashes/s the link gives 500 Mhash/s.
I take Mhash to mean million hashes.
Any increase in the number of cores per CPU, or the number of CPUs per server, also affects these calculations.
I don't agree with this statement. You need the number of cores of current computers to calculate current computing-speed, but Moore takes the number of cores into account.
The reason normal desktop computers now have 2 or 4 cores is to keep up with Moore's law. The amount of transistors that the process guys can put on a die goes way beyond what the processor guys can use in one processor. So after bigger caches they started putting in more cores.
"Where you say 500 hashes/s the link gives 500 Mhash/s.
I take Mhash to mean million hashes."
The point was to give the ratio between GPU/CPU. Since they are for SHA256, absolute numbers aren't meaningful for SHA-1.
The real catch is that pulling 500MHashes/sec spits out data at 16GB/s (or 128Gb/s). While hard drive manufacturers love to quote the speed of the interface (6Gb/s or less), a just released consumer SSD just hit 4Gb/s but poor googling implies that cost effective (spinning) hard drives tend to run under 1Gb/s.
It looks like one of those "slack until the technology catches up" problems. I don't want to know the cost of 160*(2^80)bits of storage that can store data at 128Gb/s
With various thoroughly tested 512-bit hashing functions being available nowadays, would it be safe using a combined hash function like this?
Überhash(input) = Skein(Sha2(input)+input) XOR Keccak(Whirlpool(input)+input)
Where + is binary concatenation, i.e. like prepending a salt.
... would it be safe using a combined hash function like this?
I won't comment on your specific example, however as a general case chaining encryption functions together (in the right way) will increase the security margin. However it is both theoreticaly and practicaly possible to weaken a chained function which is why you should select encryption functions who's internal functions are in effect orthagonal to each other.
If you have a look at IDEA you will see the Lai-Massey scheme that uses three broadly orthagonal methods of XOR, ADD, MUL in fields, a later cipher called FOX was based on the same idea and is now known as IDEA-NXT and claims it is proof against both Linear and Differential cryptographic attacks, as well as the difficulty of trying to make an algebraic model that could simplify the three orthagonal functions.
@Sheldon: "Überhash(input) = Skein(Sha2(input)+input) XOR Keccak(Whirlpool(input)+input)"
I'm not exactly sure why you put in the arbitrary separation into two halves that are then simply XORed (parallelization?). You might just want to go for:
Überhash(input) = Skein(Sha-512(Keccak(Whirlpool(input)+input)+input)+input)
Ben wrote: "Let's say you want to perpetrate some financial fraud via a MITM attack (or just a plain old fashioned con). You might need some fake document that passes a hashing test for authenticity."
But that's not a 2^60 birthday attack, where any two matches win. If you want to forge a particular existing signature, you're back to 2^120 or so. Plus, finding any random string that collides isn't enough. It has to look like a fake document, and the required structure severely restricts your search space. In fact, no collision may exist.
@Paeniteo: I thought the separation would make 100% sure that any potential future weaknesses in either Skein or Keccak would have zero influence in the combined result. But I guess your way of nesting them effectively has the same property.
I wouldn't be surprised that we will see SHA-1 broken more sooner. My bet is in 1 year SHA-1 will follow MD5, and by looking at the recent trends it will not be done by a criminal organization, but by a government. 2.7 Mil in "cyber-security" certainly is not an obstacle for any government.
There should be no problem achieving 128 Gb/s speeds with an not-so advanced storage cluster. A 30-40 node cluster with Infiniband and SSD's should easily do it. Is the actual size of data to store 160*(2^80)bits? Seems too high for me.
Of course, one could say the same about AES. According the Bamford's famous Wired article a number of months back, NSA has already made "huge breakthroughs" in the cryptanalysis of publicly used algorithms. Whether this means AES or public-key ciphers, he didn't say (though he did imply it was AES).
He did say that the new Utah data center is crucial to making these breaks practical. They do this, according to him, by what sounds like a ciphertext-only attack. They compare an ungodly number of ciphertexts with each other looking for "telltale patterns." AES is supposed to be immune to that type of attack, but apparently they have some method of making it work.
Of course, Bamford may have been mislead or the officials, not being cryptographers themselves, may have had the details wrong. In either case, I am certain AES doesn't stop them. If it did, our whole intelligence gathering apparatus would go dark, and it obviously hasn't.
To me, it seems simpler to crack RSA. If you do that, then AES becomes moot since most every message sent over the wire will have its AES key encrypted by RSA. So, break RSA and you have all the AES keys. It seems much more likely to me that they have found some practical method of factoring large integers into primes.
I suspect Clive could give more detail on whether such an attack as outlined in the Bamford article is practical (assuming NSA can house a yottabyte of data and utilize the fastest supercomputers).
So, break RSA and you have all the AES keys. It seems much more likely to me that they have found some practical method of factoring large integers into primes
The NSA don't need to do to much on factoring large integers into primes to crack RSA.
All they realy need to do is know the random generator used to generate the two primes, and if it is one of the majority that is weak...
As was discussed on this blog a little while ago there is a fairly quick and simple test to show if two or more RSA PK certificates share either of the PQ primes. And a couple of researchers showed that indead a surprising number (ie many) PK certs visable on the internet did indead share primes. This could only be down to the use of weak random number generators.
Now if you know many certs share a prime it would be worth factoring one of the certs as this would enable you to quickly break the other certs.
But how would you factor such a cert, well you don't have to do that much work due to the weak random number generator.
If you have analysed the generator you will know what it's likely outputs are and thus your search space for factors is actually quite small.
This is because with many software and hardware packages the PQ pair are generated very early in the packages use when the entropy in the random number generator is actually very small.
Now from the NSA's point of view they are in general not interested in factoring a specific cert just in factoring as many certs as possible and recovering plaintext to go into the DB.
This is due to several things but two stand head and sholders above the others,
1) To recover the plaintext of a specific message you need to break one certificate. But if you know the coresponding parties you only need to crack one of the two certs to get usable plain text,
2) Which may well due to human failings provide the text of the earlier message within it, due to the fact that with EMail and similar systems hitting "reply" includes the message you are replying to as well as your reply.
Thus although one party may have a very strong random number generator and their cert cannot be easily broken, the party they are communicating with may have a very weak random number generator with a certificate that was very easily broken. As the party with the strong certificate you probably cannot tell if the party you are communicating with has a weak and broken certificate which is alowing your messages to be read because of the "reply" function...
Whilst you can fairly easily generate strong random numbers using dice or other physical objects you tend to get only a few bits for each throw. That is two throws of a dice gives you a number in a range of 36, which for ease of use would be limited to 32 giving 5bits for two throws. These days you need around 2000bits to get a strong PK cert which would be 400 throws of a dice which would take the better part of a morning to do. The random number generator in a lightly loaded Personal Computer could take almost as long to build up the same number of bits of entropy.
In general people don't want to wait to get good entropy when they can have little or no entropy hashed up to look like good entropy. But the reality is it's "magic pixie dust" thinking and the likes of the NSA know this and how to exploit it.
To folks asking if you can combine hash functions: yes. TLS 1.2 uses concat(MD5, SHA1). The Wikipedia article on cryptographic hashes has a section on exactly what that gets you security-wise.
But unlike with MD5 and SHA1, now there's enough margin and study that _practical_ breaks of SHA-512 or Keccak seem unlikely. Also, if you're designing a new app, just try and avoid the need for collision resistance: never sign anything containing attacker-controlled data without prepending a random string to it.
As always, once you quit using the crap algorithms, bugs and design flaws quickly become more of an issue than algorithmic strength. It's something we nerds forget that's basically been the focus of Bruce's work for years, and much more important than a new super-nifty new hash.
Sorry, that should've been "TLS before v1.2" not "TLS 1.2."
I have heard you harp about RNG's before (and I agree many are likely weak as the Lenstra, et al paper proved). Do you suspect this is mostly an embedded device issue (no interface devices to generate entropy) or do you think PRNG's in OpenSSL or GnuPG are also weak?
And, yes, you make a good point about the private keys. All the attacker needs is to break one guy's key and the attacker has immediate plaintext of every conversation that guy has had with others (even if the other keys are secure).
For instance, a couple of years ago I used to correspond on an encrypted mailing list with about 50 other people. Most everyone used GnuPG with 2048 bit keys (which should be secure). However, one guy was using 512 bit keys. I got to thinking if an attacker could factor his key (which in 2010 would be very easy and cheap to do), then *all* 50 of us would be instantly compromised. This is a real problem with public-key systems -- you have to be sure your correspondent(s) are using long enough keys and managing their keys properly. Enforcing this gets exponentially more difficult with the more contacts one has.
One the subject of RNG's, there was a project proposed a number of years ago by some cryptographer to put his own satellite in space that would generate huge quantities of truly random numbers and beam them to earth for instant OTP's. A TRNG for everyone on earth. His project never went through. Of course, I don't see any way he could avoid an oppressive government from injecting their own weakened numbers into his stream, thus compromising it. Or perhaps even jamming the signal all together.
I believe on Linux most programs like OpenSSL and GnuPG use /dev/random or /dev/urandom as RNG source. Is there a known home buildable device you could attach to a USB or other port that could generate better random data? I'd be interested in something like that as a hobby project just because it would be different than what others are using. Is it hard to get better random noise, thermal noise or something like that? I don't know about math to design something like this but I am adequately skilled to make something if the source method is proven.
Clive's comments about weak RNGs remind me of last month's news of the practical attacks against Chip-and-Pin systems:
Short version for those who didn't read it: a lot of ATMs use crappy RNG generators when choosing the "unpredictable numbers" that are supposed to assure that the transaction is fresh. The "pre-play" attacks are possible because attackers can generate these numbers, and carry out the protocol steps that depend on them, well in advance of the actual transaction.
Do you suspect this is mostly an embedded device issue (no interface devices to generate entropy) or do you think PRNG's in OpenSSL or GnuPG are also weak
Without a doubt many but by nomeans all embedded devices have RNGs with poor startup entropy, which then due to initial setup scripts make it into the device PK certs.
Likewise many software RNGs have this startup issue as well. Untill fairly recently many OS / C-lib / etc had realy poor RNG's and many programers failed to comprehend this or if they did failed to mitigate it correctly.
Even when a good RNG is available in the program or OS there is still the "initial start up" issue where entropy starts at zero and only builds up slowly over time. And also the problem of an attacker injecting faux "known to them" entropy into the generator in various ways.
So yes even the best of software designs can be compromised but in most cases the window of oportunity for an attacker is in the past, unless they can find a way to either compromise a system (such as you mentioned with the mailing list). Or in some way force you to generate a new PK Cert etc on a computer they have managed to compromise.
Oddly the problem of lack of initial start up entropy could be easier to solve with embedded devices than with shrink wrapped software only solutions. This is because most embedded devices get factory programed with serial numbers and other "unique to device" information such as Ethernet MACs prior to being boxed and shipped. It would not be overly difficult for the manufacturer to also put in a stored entropy file that is generated on the fly on the production line.
BUT could you trust them to implement it correctly, not keep records that are accessable and do what is necessary to prevent an attacker getting at the production line system in some way. Further what do you do about the "Reset to factory defaults" option in the embeded devices main menu...
With regards putting up a satellite that generates true random data there are many issues with this that have to be solved, not least of which is how do you solve the eavesdropping issue where both Alice and Eve have the same stream of TRNG data bits.
At the end of the day you realise there are only two solutions to this,
The first is that it is each user who has to be fully responsable for ensuring that not only do they start with True Random Data but that it is also both 100% unique to them as an individual user and more importantly compleatly unknown to others in perpetuity.
Or secondly you have a hierarchical system where individuals are deemed untrustworthy by definition by an organisation that then mandates and enforces every step in the process.
The first process is always going to be subject to the "weakest link" problem of an individual not taking responsability and using short cuts or weak keys etc. The second has the advantage of preventing the weak individual issue.
But the 100% unique and in perpetuity requirments do not go away and these are difficult even in a hierarchical system. This problem along with others has given rise to the notion that True Random Data is not a usable solution at anything other than the master secrets level. For instance 100% unique cannot be done securely with using TRNG data for every user because it requires a hugh database that represents a significant target of oportunity for an attacker. The solution is to use a determanistic system such as a block cipher in CTR mode, with non determanistic sampling of it's output. But the determanistic system needs to have a sufficiently large output size such that it can be run very rapidly for potentialy centuries and only use a very very small part of it's output. Which adds weight to Bruce's comment the other day that we don't have a block cipher that is sufficiently large in the number of bits in the block size (arguably you currently need 2048bit size RSA keys, but with legal land contracts currently lasting upto 1000years...).
It is without doubt a very hard problem to solve and whatever you do, you will always have cases where it will fail for some reason unless you have full and enforcable control at every step. So is it any wonder that the NSA / GCHQ types of this world have such dictatorial KeyMat proceadures ultimately backed up by life ending sanctions (life long prison terms / execution) for treason.
Is there a known home buildable device you could attach to a USB or other port that could generate better random data? I'd be interested in something like that as a hobby project
The first question is "What do you mean by better?"
The old definition used to be what are often described as True Random Number Generators, but these have only limited application for a whole host of reasons.
More modern thinking is based around fully determanistic algorithms with a small number of master secrets. For instance a counter that starts from a secret number, the output of this counter is then whitened (XORed with another secret) which is then encrypted with AES using a secret key. These sorts of generators are known as Cryptographicaly Secure Pseudo Random Number Generators (CS-PRNGs).
These master secrets are usually generated by a TRNG which has been very carefully designed to remove as many types of bias as the designers can think of and mitigate.
The heart of a TRNG is some kind of "physical noise source" of which many are available and few if any are "perfect" that is free of bias or other influence.
A simple noise source is thermal noise in a resistor, but it is of extreamly low level and has to be amplified. The amplifier adds it's own noise and any bias it has as well as any external interferance caused by EM fields, sound or other mechanical or thermal energy. The amplifier is likewise not perfect it has a frequency and phase response that alter the thermal noise signal and thus cause bias.
Thus you have to find some way of partialy mitigating the effects to an acceptable level. One way to do this is to use two oscilators one fairly high frequency the other a fairly low frequency. The high frequency oscilator drives the D input of a D-Type latch the low low frequency oscilator clocks the latch and subsiquent circuits. The high frequency oscillator is usually a very low noise low Q oscilator with high "tank energy" that has logrithmic amplitude stabalisation where the frequency is stabalised by a high Q resonator such as an XTAL which is lightly coupled to the oscillator, the output from the oscilator is again very lightly coupled to the tank circuit. This oscilator uses heavily decoupled power supplies and a large amount of screening to prevent external energy from EM, mechanical or thermal sources. The design of such oscilators can be found in many RF refrences as "secondary standards" and the ARRL, RSGB et al have published such designs for use in QRP receivers and test equipment. The second low frequency oscilator is of a similar design but importantly is designed such that the oscilator is not stabalised by an external resonator but has an aditional variable reactive element loosely coupled to the tank to change it's frequency. This variable reactive element is driven by the physical noise source. The result is the output of the high frequency oscilator is non determanisticaly sampled by the low frequency oscilator and is in effect a sub harmonic of the instantanious difference frequency between the two oscillators.
This output needs to be de-biased the start of this is to use a siple circuit that samples two bits and XORs them together (this is a von neuman de bias circuit) this along with the clock/latch signal is then Digitaly Signal Processed to remove other biases and also detect various failure modes.
Depending on the level of processing you want to do prior to the PC you can do the first stages in something like a PIC or equivalent micro controler that has an in built USB interface.
With a little thought you will realise that you can do the same thing with just the low frequency noise driven oscilator driving an interupt pin on the PIC and some kind of race circuit forground process. One way to do this is to have the forground process be a fast stream generator like ARC4, SNOW2 or any of the EU eStream finalists. The stream generator free runs continuously in the forground and gets sampled by the external interupt signal and this causes the current output byte from the stream generator to be sent over the USB port to the PC.
The important thing is to be very carefull in designing and isolating the noise source and low frequency oscilator such that external influance from power supply noise, EM "hum", mechanical "microphonics" or temprature changes are minimised as best as posssible. Whilst also continuously monitoting it for out of specification performance that would indicate some form of failure or attack.
What ever you do, do not fall into the trap of "magical thinking" which is best explified by Intel and their idea of using a hash on the output from the physical noise source. This does not improve entropy, nor does it remove bias, and what it does do is make it almost impossible to detect most if not all types of failure of the physical noise source (rumour has it that Intel very deliberatly did this to cover up that their on chip noise source was so bad it was virtualy usless).
Coming back to this article a few days later, I get the impression that this calculation is more a "best case" and "if nothing goes wrong" estimate.
It reminds me of those designing crypto themselves and thinking: "If I can't break it, nobody can".
None of the things an attacker may come up with were taken into account.
This whole analysis if seriously flawed.
While Moore's law doubles the transistors every 18 months it has not result in doubling processing ability over same period.
May I go back to Ben Brockert's question (Oct 5, 7:59) and k's followup (Oct 8, 8:13) about how being able to produce a collision, any collision, is going to help anyone make money by fraud? Presumably the bad guy with the massive computing power (accepting that some organized crime networks might be there when specified, or some governments) wants not just to produce an identifiable collision but to create an identifiable specific document - or part of a document, like a signature - that is not just any collision but a very specific overriding of particular data.
Is that the result of producing a first collision, or will one need years and years of computing power (even at 2022 speeds) to do it for each document, or each attempt?
Is that many real years further down the road, or is that a done deal with the first collision (or the first one whose description is published)?
"With the same $19k worth of those, a SHA-1 collision could be found in less than 2 weeks. A larger organization with something like $5 million to play with (so about 8,000 of those ASIC devices) could produce collisions in about 1 hour"
+1 I believe NSA's cryptanalysis hardware budget is quite a bit more than $5 million ;)
I'm not sure what the point is in calculating a COTS threat-model when the most capable adversary is almost certainly using custom-built equipment. There are, after all, just a handful of hash algorithms being used for the vast majority of hash-related security functions. As the leaks have shown, the vulnerabilities introduced by the most capable adversary creates opportunities for exploits by less capable adversaries following in their footsteps. Haven't we learned from the Snowden leaks that we can no longer afford to design security protocols to anything less than a standard of frustrating the most capable adversary on the most pessimistic - yet reasonable - estimates of its capabilities?
How much entropy is available over NTP? Maybe enough to seed hardware RNGs? If a hardware reset can't reset the seed, doesn't that go a long way?
Schneier.com is a personal website. Opinions expressed are not necessarily those of Co3 Systems, Inc.