The Crypto Bomb Is Ticking
By Bruce Schneier
Today's faster, less expensive computers can crack current encryption algorithms easier than ever before. So what's next?
Cryptographic algorithms have a way of degrading over time. It's a situation that most techies aren't used to: Compression algorithms don't compress less as the years go by, and sorting algorithms don't sort slower. But encryption algorithms get easier to break; something that sufficed three years ago might not today.
Several things are going on. First, there's Moore's law. Computers are getting faster, better networked, and more plentiful. The table "Cracking for Dollars" on page 98 illustrates the vulnerability of encryption to computer power. Cryptographic algorithms are all vulnerable to brute force--trying every possible encryption key, systematically searching for hash-function collisions, factoring the large composite number, and so forth--and brute force gets easier with time. A 56-bit key was long enough in the mid-1970s; today that can be pitifully small. In 1977, Martin Gardner wrote that 129-digit numbers would never be factored; in 1994, one was.
Aside from brute force, cryptographic algorithms can be attacked with more subtle (and more powerful) techniques. In the early 1990s, the academic community discovered differential and linear cryptanalysis, and many symmetric encryption algorithms were broken. Similarly, the factoring community discovered the number-field sieve, which affected the security of public-key cryptosystems.
There are many encryption algorithms currently available; see the table "Encryption Algorithms: Suitability to Task" on page 100 for classes of algorithms and their characteristics. What algorithms are considered secure today? What about the future? Predictions are dicey at best, but they are essential in the business of cryptography.
For instance, if I design a cryptographic system today, it may spend two years in development and be fielded for another dozen. The information it carries might have to remain secure for six years after transmission. This means I'm forced to make a decision today about what the state of cryptography will be 20 years from now. Like it or not, cryptographers have to be futurists.
Symmetric algorithms use the same key for both encryption and decryption. These algorithms are the workhorses of cryptography. They encrypt anything digital, including e-mail, telnet connections, audio, and video. You can divide symmetric algorithms into two piles: stream ciphers and block ciphers. Stream ciphers encrypt data in streams--a bit, byte, or word at a time. Block ciphers encrypt data in fixed chunks, generally 64 bits. Since you can use block ciphers to build stream ciphers, most ciphers are block ciphers.
And there are a lot of them. The table "Symmetric Algorithms and One-Way Hash Functions" on page 102 lists the major ones, comparing speed, block size, key size, and patent and licensing terms.
A few points are worth noting here. Triple-DES is the conservative choice, although it's the slowest. Everything else has received far less cryptanalytic attention. Blowfish is the fastest algorithm, but it has a long key-setup time and isn't suitable for encrypting small blocks. And its large tables make it completely unsuitable for smartcards. The International Data Encryption Algorithm (IDEA) got its fame as the encryption algorithm in PGP, and it would probably be used more if it weren't impossible to get any reasonable licensing terms out of Ascom-Systec. CAST, the new algorithm in PGP, is similar to Blowfish. RC2 has been largely abandoned in favor of RC5.
If I had to choose an encryption algorithm to use today, I'd select Triple-DES (it's much harder to break via exhaustive searching than DES), IDEA (it has survived since 1991 without any serious cryptanalysis), or Blowfish (it's fast, compact, and simple; allows variable key lengths; and has been the victim of no known successful cryptanalysis). A couple of years ago, I and a group of other cryptographers recommended a 90-bit key as the bare minimum for security today (see why in the table above); all three of these algorithms exceed that key length.
The block-cipher landscape will change soon. The National Institute of Standards and Technology (NIST) is soliciting candidate algorithms for an Advanced Encryption Standard (AES), which will replace DES. None of the aforementioned algorithms is suitable because AES must have a 128-bit block size and key lengths of 128, 192, and 256 bits.
Candidates are due in June of this year, and NIST will select an algorithm sometime in 1999 or 2000. Thus far, about 15 people have indicated that they will submit an algorithm. Most will likely be pretty awful attempts by crypto wannabes, but expect to see an RC5 variant from RSA Data Security, a CAST variant from Entrust Technologies, a variant of Square, and Blowfish II (based on Blowfish).
Those who need a stream cipher have two choices. One option is to use a block cipher in stream mode. This isn't difficult; any block cipher will work, and you can consult any cryptography text to find out how to do it.
The other option is to use a dedicated stream cipher. There are a few of these, some optimized for custom hardware and others for 32-bit microprocessors. RC4 is one common choice. Once a trade secret of RSA Data Security, it was "outed" to the Internet in 1994 and has since become public-domain. There's even an Internet Draft for something called ARCFOUR, which is actually RC4: The draft's authors didn't want to use the real name.
No one has succeeded in breaking RC4, but cryptographers have found some statistical anomalies in it, including weak keys. When I recommend RC4, I advise the use of a more complicated key schedule, namely spinning the key schedule twice. A key schedule is an algorithm that expands a relatively short master key to a relatively large expanded key for use in encryption and decryption. RC4 uses a key schedule to initialize the state of the stream cipher prior to generating the keystream. Due to weak mixing in its key schedule, RC4 has a class of detectable keys. The more complicated key schedule strengthens that.
Making a Hash of It
A hash function is a fingerprint function that takes an arbitrary-length input (i.e., a pre-image) and produces a fixed-length output (i.e., a hash value). Given a digital blob (a file, a message, or whatever), it's easy to calculate the hash of that file. But given a hash value, it's hard to create a file that hashes to that value. It's much easier (by a factor of 2 superscript n/2 for an n-bit hash function) to find collisions for a hash function than to reverse the function. This means that the function is "one-way," unlike something like a cyclic redundancy check (CRC), where it's easy to create a file with a given CRC value.
Hash functions also have to be "collision-free": It must be hard to find two files that hash to the same value (see the figure "Cracking Hash Functions with Collisions" on page 100). This means that if I give you a hash value and then later show you a file that hashes to that value, you can be sure (to the extent of the security of the hash function) that I had that file at the time I created the hash value.
Most hash functions in use a few years ago produced 128-bit hashes, although for any long-term security a 160-bit hash is de rigeur. I wouldn't use a 128-bit hash for anything unless I had a really good reason for doing so.
There are only a few suitable algorithm choices here. I recommend SHA-1, from the National Security Agency (NSA). SHA, the Secure Hash Algorithm, is a NIST standard. It produces a 160-bit hash. Another choice is RIPE-MD-160, from the European Community. MD5 produces only 128-bit hashes, and cryptographers have found weaknesses in the algorithm. Don't toss your MD5-based applications immediately, but you should switch them over to SHA as soon as possible.
Not much is forthcoming in this category. A lot of work has been done on creating hash functions from block ciphers, but no single proposal has emerged as a front-runner. People are likely to stick with SHA-1 or RIPE-MD-160, although some ultra-high-end applications use a combination of MD5 and SHA-1.
Message-authentication codes, or MACs, are hash functions with a key: Only someone who knows the key can create or verify a MAC value. For instance, I can use MACs for integrity checking. After creating a file, I calculate its MAC (using a key you and I share) and append it to the file. Anyone can read the file, but only someone who knows the key can create a new MAC. When you get the file, you can calculate the MAC and verify that it's the same as the MAC I sent you. If it is, you know that no one has tampered with the file during transit.
There are several MAC algorithms, but the most promising ones are HMAC and NMAC. They're based on hash functions, generally SHA-1; HMAC is an Internet Draft (RFC 2104). There are other MAC constructions based on hash functions, but they're not nearly as good. And I haven't seen any MACs based on block ciphers that I've liked very much.
Playing the Public Keys
Public-key algorithms are the surprise of the 1970s. Encryption and decryption use different keys and, more important, you cannot calculate these keys from each other. Thus, you can generate a key pair and publish just the encryption key (see the table "Pick Your Enemy, Then Your Key" on page 102 for recommended key lengths). Anyone can get that key and send you a message that only you can decrypt (using the decryption key). You can use the same math for digital signatures (more on this later). The actual implementation is complicated, with certificates, certificate authorities, trust management, and lots of odds and ends, but that's the general idea (see "Picking the Crypto Locks," October 1995 BYTE).
Public-key algorithms are based on one of two problems: the factoring problem and the discrete-logarithm problem. Both are more or less equally hard, so I'll discuss the factoring problem with the understanding that what I say applies to both. Factoring the kind of numbers used in public-key cryptography--1024 bits or more--is hard. It would take all the computers in the world years to accomplish.
Patents are another issue to deal with when choosing a public-key algorithm. RSA is patented, for example, and will remain so until 2000. Diffie-Hellman and ElGamal are both in the public domain: Any patents that might have applied expired last year. I usually use ElGamal, unless I have a strong reason to use RSA.
Digital-signature algorithms are simply public-key-encryption algorithms turned on their ears--the private key is used for signing, the public key for verification--and all the key-length discussion in the previous section applies. RSA digital signatures are just as secure (or as vulnerable) as RSA encryption. The same is true for ElGamal. There are other alternatives: NIST has endorsed something called the Digital Signature Algorithm (DSA) as a federal standard. And there are several elliptic-curve algorithms.
Elliptic curves are the newest kids on the block. Points on an elliptic curve form a mathematical group: Given any two points, there are operations that always produce another point on the curve. Further, you can use a number and a point on the curve to give another point on the curve--but it's hard to figure out what number you used, even if you know the original point and the resulting point. This one-wayness leads to cryptographic applications. And because cracking this is much harder than cracking other systems, it requires much smaller keys to obtain comparable encryption.
This normally wouldn't be interesting, but the fast algorithms used for finding discrete logarithms don't work with elliptic curves. Therefore, proponents of this technology argue, you don't need as long a key. For applications where bits are very dear, like smartcard applications, these are enticing words.
The question is not whether elliptic-curve cryptosystems are secure, but whether they offer the same security with shorter key lengths than comparable systems. Today, the discrete-logarithm problem for elliptic curves is harder than the discrete-logarithm problem modulo prime numbers p: There is no subexponential algorithm for doing the calculation. (The discrete-logarithm problem involves finding the exponent to which you must raise a given number to generate a value modulo some large prime number.) Thus, people use elliptic-curve cryptosystems with significantly shorter key lengths.
The figure "Elliptic-Curve Cryptography vs. RSA and DSA" on page 99 illustrates the difference between key sizes required for comparable security.
It's unknown whether the discrete-logarithm problem is harder because of the fundamental mathematical nature of elliptic curves or because of our limited knowledge of their mathematical properties. This is a very new area of research, and recent discoveries indicate that there is still much theoretical work to be done. I do not recommend assuming that they can provide the same security over the long term with shorter key lengths than cryptosystems using discrete logs modulo prime numbers p.
Security? What Security?
Popular magazines like to describe cryptography products in terms of algorithms and key lengths. Algorithms make good sound bites; they can be explained in a few words and are easy to compare with one another: "128-bit keys are good security." "Triple-DES means good security; 40-bit RC4 means weak security." "2048-bit RSA is better than 1024-bit RSA." Unfortunately, reality isn't that simple.
Longer keys don't always mean more security. Compare the cryptographic algorithm to the lock on your front door. A door lock might have four metal pins, each in one of 10 positions. Thus, there are only 10,000 possible keys, and a burglar willing to try all 10,000 is guaranteed to break into your house.
But an improved lock with 10 pins, making 10 billion possible keys, probably won't make your house any more secure. Burglars don't try every possible key (a brute-force attack); most aren't even clever enough to pick the lock (a cryptographic attack against the algorithm). Instead they smash windows, kick in doors, disguise themselves as police officers, or rob keyholders at gunpoint. Better locks don't help against these attacks.
I've spent years designing, analyzing, and breaking cryptographic systems. I do research on published algorithms and protocols, but most of my work consists of examining actual products. I've designed and analyzed systems that protect privacy, ensure confidentiality, provide fairness, and facilitate commerce.
I can almost always find attacks that bypass the algorithms altogether. I don't have to try every possible key, or even discover flaws in the algorithms. Instead, I exploit errors in design, implementation, and installation. And most of the time I exploit the same old mistakes that implementers make over and over again.
The moral here is not that cryptography is useless, but that cryptography is not enough. Strong cryptography is not a panacea. Focusing on crypto algorithms while ignoring the other aspects of security is like defending your house not by building a fence around it but by putting a single immense stake into the ground and hoping that your adversary runs right into it.
Security designers occupy what Prussian general Carl von Clausewitz called "the position of the interior." A good security product must defend against every possible attack--even attacks that haven't been invented yet. Attackers, on the other hand, need to find only one security flaw in order to defeat the system. Moreover, they can cheat: They can collude, conspire, and wait for technology to provide them with additional tools. They can attack a system in ways the system's designer never thought of. They can ignore the algorithms.
Building a secure cryptographic system is easy to do badly and very difficult to do well. Unfortunately, most people can't tell the difference.
In other areas of computer science, functionality serves to differentiate the good from the bad. For instance, a good compression algorithm will work better than a bad one; a bad compression program will look worse in feature-comparison charts.
Cryptography is different. Just because an encryption program functions doesn't mean it's secure. What happens with most products is that someone reads Applied Cryptography, chooses an algorithm and a protocol to use, performs tests on it to make sure everything works, and thinks the job is done. It's not. Functionality does not equal quality, and no amount of beta testing will ever reveal a security flaw.
Too many products are merely "buzzword compliant": They use secure cryptography, but they are not secure.
Cracking for Dollars
An expenditure of twice the dollars makes your adversary's attack twice as fast. By Moore's law, these attacks will be 10 times less expensive (and 10 times faster) every five years.
FPGA = field-programmable gate array
The Speed of RSA
RSA speeds in seconds (on a SparcStation 2 ) for different modulus lengths with an 8-bit public key.
Encryption Algorithms: Suitability to Task
Symmetric Algorithms and One-Way Hash Functions
N/A = not applicable.
Pick Your Enemy, Then Your Key
Recommended public-key lengths (in bits)
Schneier.com is a personal website. Opinions expressed are not necessarily those of Co3 Systems, Inc.