New RC4 Attack

This is a really clever attack on the RC4 encryption algorithm as used in TLS.

We have found a new attack against TLS that allows an attacker to recover a limited amount of plaintext from a TLS connection when RC4 encryption is used. The attacks arise from statistical flaws in the keystream generated by the RC4 algorithm which become apparent in TLS ciphertexts when the same plaintext is repeatedly encrypted at a fixed location across many TLS sessions.

The attack is very specialized:

The attack is a multi-session attack, which means that we require a target plaintext to be repeatedly sent in the same position in the plaintext stream in multiple TLS sessions. The attack currently only targets the first 256 bytes of the plaintext stream in sessions. Since the first 36 bytes of plaintext are formed from an unpredictable Finished message when SHA-1 is the selected hashing algorithm in the TLS Record Protocol, these first 36 bytes cannot be recovered. This means that the attack can recover 220 bytes of TLS-encrypted plaintext.

The number of sessions needed to reliably recover these plaintext bytes is around 2^30, but already with only 2^24 sessions, certain bytes can be recovered reliably.
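To give a flavor of the statistical idea, here is a simplified single-byte sketch (not the paper's full algorithm, which uses the complete distribution of all early keystream bytes): RC4's second keystream byte is biased toward zero, so if the same plaintext is encrypted under many independent keys, the most frequent ciphertext value at that position is most likely the plaintext byte itself. A minimal simulation, assuming nothing beyond textbook RC4:

```python
# Simplified sketch: exploit the bias of RC4's second keystream byte
# toward zero (Mantin-Shamir) to recover one plaintext byte from many
# independently keyed encryptions of the same message. Pure Python, so slow.
import os
from collections import Counter

def rc4_keystream(key, n):
    S = list(range(256))
    j = 0
    for i in range(256):                     # key-scheduling algorithm
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    i = j = 0
    out = []
    for _ in range(n):                       # pseudo-random generation
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return out

plaintext = b"Cookie: secret"                # hypothetical repeated plaintext
counts = Counter()
for _ in range(2**16):                       # one "session" per fresh key
    ks = rc4_keystream(os.urandom(16), 2)
    counts[plaintext[1] ^ ks[1]] += 1        # ciphertext byte at position 2

recovered = counts.most_common(1)[0][0]
print(hex(recovered), hex(plaintext[1]))     # most common value = plaintext
```

The real attack measures the full distribution of each of the first 256 keystream byte positions and needs far more sessions, but the principle is the same.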

Is this a big deal? Yes and no. The attack requires the identical plaintext to be repeatedly encrypted. Normally, this would make for an impractical attack in the real world, but HTTP messages often have stylized headers that are identical across a conversation—for example, cookies. On the other hand, those are the only bits that can be decrypted. Currently, this attack is pretty raw and unoptimized—so it’s likely to become faster and better.

There’s no reason to panic here. But let’s start to move away from RC4 to something like AES.

There are lots of press articles on the attack.

Posted on March 29, 2013 at 6:59 AM

Comments

Paeniteo March 29, 2013 8:35 AM

“But let’s start to move away from RC4 to something like AES.”

The interesting point is that many web sites actually moved from AES towards RC4 as a response against the recent BEAST and Lucky-13 attacks (these rely on block ciphers running in CBC mode which is nicely avoided by switching to a stream cipher).

The newer TLS versions that have built-in protections against those attacks are still not widespread enough to provide a general solution. Maybe/hopefully, we will start to see some movement here.

Hanno March 29, 2013 8:39 AM

To be a bit more precise: People were moving away from AES-CBC in the very flawed version it’s implemented in TLS (using HMAC authentication before encryption, which was a very bad choice).

The way to go is TLS v1.2 with AES-GCM ciphers, but that’s not supported by any mainstream browser today.
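For illustration, here is what AEAD encryption with AES-GCM looks like at the library level (a sketch using the Python cryptography package, not TLS’s record layer itself):

```python
# Sketch of AES-GCM authenticated encryption with the "cryptography"
# package: confidentiality and integrity in one primitive, with no
# separate MAC-then-encrypt step to get wrong.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=128)
aesgcm = AESGCM(key)
nonce = os.urandom(12)                  # must never repeat under the same key

ct = aesgcm.encrypt(nonce, b"attack at dawn", b"associated header")
pt = aesgcm.decrypt(nonce, ct, b"associated header")
# decrypt() raises InvalidTag if ciphertext or header were tampered with
assert pt == b"attack at dawn"
```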

And that’s the part I find frightening. I don’t expect either this RC4 attack or the AES-CBC-HMAC issues to be a really big issue. But TLS is THE encryption protocol – the most important one of all. I don’t want it to be “flawed by attacks that probably don’t matter”; I want it to be rock-solid secure.

Brian W March 29, 2013 8:49 AM

From what I’ve heard, most websites have bumped RC4 up to the top of the list of TLS cipher suites because of the BEAST vulnerabilities in AES and the other CBC approaches. So this is a big deal, at least until CBC gets fixed.

Steven Hoober March 29, 2013 9:42 AM

The attack requires the identical plaintext to be repeatedly encrypted. Normally, this would make for an impractical attack in the real world, but HTTP messages often have stylized headers that are identical across a conversation

I have this /vague/ recollection of that, in decades past, being forced upon enemy commo systems for this same reason. Something like planting a specific piece of “lost” intelligence at a few far-flung stations so they’d type it up and send it across the network; then you search for the patterns that match your known plaintext and work onward from there.

It does bring up a real flaw in digital communications systems. If I were an advanced student in CS, I’d totally take on a project to randomize or divorce headers from messages in cryptosystems to avoid this kind of exploit. Seems fixable.

Clive Robinson March 29, 2013 10:25 AM

Perhaps the first question we should ask is “Why do we still use RC4?”

Yes it’s a neat little generator but it has a large number of issues.

The theoretical maximum number of states in the RC4 generator is 2^1700, with the distinct possibility that none of the sequences generated from keys comes even close to exercising this number of states.
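That figure just counts RC4’s internal state: a permutation of 256 byte values plus the two 8-bit indices i and j, which a couple of lines of Python confirm:

```python
import math
# permutations of the S-box, times the 256 possible values of i and of j
states = math.factorial(256) * 256 * 256
print(round(math.log2(states)))   # -> 1700
```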

Perhaps on this fact alone we should be looking at much better generators for a system that could easily carry over half the internet traffic…

Dom De Vitto March 29, 2013 10:36 AM

I can’t figure out how this could be used practically – TLS sessions are long-lived due to startup overhead, so collecting enough of the same session data to find out private information seems like a long wait. It would be different if the server/service was sending the same secret to millions of users, but then that information probably wouldn’t be ‘secret’….

Cunning ideas anyone?

Johnston March 29, 2013 10:48 AM

@Clive Robinson Perhaps the first question we should ask is “Why do we still use RC4?”

The answer to this question is the same as to the question “Why are we still using TLS?” The answer is lack of options deployed to any level of usefulness. The only other option in TLS is CBC mode, which is essentially broken in TLS. It almost seems intentional.

Nick P March 29, 2013 11:11 AM

@ Clive Robinson

“Perhaps the first question we should ask is “Why do we still use RC4?””

It’s simple. It’s very fast. The other side is likely to have it. It’s secure enough if used properly. I have RC4 in plenty of older designs, and to a lesser degree in a new one, for these reasons. Looking at every attack on RC4 on the record, it’s still secure enough for the use cases, and it uses fewer resources than “recommended” setups.

The good news is that today’s embedded processors perform better than those from 5+ years ago. I switched to DJB’s Salsa20 for fast stream-ciphering designs. The European stream cipher competition gave us more to choose from. And many cheap processors are including AES acceleration, giving plenty of reason to switch over to it in some kind of counter mode for stream cipher applications.
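As a sketch of what that looks like in practice, here is AES in counter (CTR) mode used as a stream cipher with the Python cryptography package (key and nonce handling pared down to the bare minimum):

```python
# AES-CTR turns the block cipher into a stream cipher: the keystream is
# AES applied to an incrementing counter, XORed with the plaintext.
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(16)
nonce = os.urandom(16)      # initial counter block; never reuse with a key

enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
ct = enc.update(b"stream me") + enc.finalize()

dec = Cipher(algorithms.AES(key), modes.CTR(nonce)).decryptor()
assert dec.update(ct) + dec.finalize() == b"stream me"
```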

RC4 still has its use cases, mostly legacy setups. But its days are numbered.

kyhm March 29, 2013 3:11 PM

This article gave me a feeling of déjà vu; the PTW attack on WEP from back around 2007 used capture and re-injection of a legitimate ARP packet, with the result that the AP would rebroadcast the same message many times with new IVs. Both exploit flaws in RC4, but I wonder if there’s a more general layer of defense you could add in the framing…

Why not remove the advantage of knowing the positions of bits of plaintext? Prepend a random number of random bytes to the message, using rules the receiver can apply to reliably skip them and get to the actual message after decryption. It could be as simple as: each padding chunk starts with an octet giving how many octets to skip, with a zero length signifying the end of the padding.
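A minimal sketch of that framing rule (one illustrative way to do it; the chunk counts and sizes are arbitrary choices):

```python
# Length-prefixed random padding: each chunk starts with a skip count,
# and a zero byte marks where the real message begins.
import os

def add_padding(message):
    out = bytearray()
    for _ in range(os.urandom(1)[0] % 4 + 1):   # 1-4 chunks, arbitrary
        n = os.urandom(1)[0] % 255 + 1          # chunk length 1..255
        out.append(n)
        out += os.urandom(n)
    out.append(0)                               # end-of-padding marker
    return bytes(out) + message

def strip_padding(framed):
    i = 0
    while framed[i] != 0:                       # hop over each chunk
        i += framed[i] + 1
    return framed[i + 1:]

msg = b"GET / HTTP/1.1"
assert strip_padding(add_padding(msg)) == msg
```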

Even the relatively small embedded machines I was working on in 2004 could handle something like this at wire speeds, and while it’s not really practical to change a protocol like ARP, it should be possible to retrofit it to negotiated protocols like TLS and SSH. The extra bandwidth could discourage its use in some applications, but for others the tradeoff seems reasonable. I freely admit I’ve spent much less time on application protocols than L2 and L3 stuff, but I can’t recall ever seeing it done.

GregW March 29, 2013 4:32 PM

I know we’re talking ciphers (RC4) and not protocols (SSL/TLS), but this does remind me of a recent bugaboo of mine which I don’t feel I fully understand.

What is the primary reason browser support for TLS 1.2 is so awful (see http://en.wikipedia.org/wiki/Transport_Layer_Security#Web_browsers ), when the RFC ( http://tools.ietf.org/html/rfc5246 ) is 4.5 years old?

Wouldn’t it make practical sense for players in the industry, as a “defense in depth” step, to try to implement a version ahead of what’s currently considered “unbroken”?

Neither open source (Firefox) nor closed source (Microsoft, IE) seems to have sufficient incentives to have done that. Which is particularly surprising for the latter, given the “lag factor”: a webmaster hosting data for large corporate enterprise clients (many still internally enforcing use of IE8 for compatibility with various corporate intranet sites or licensed software) can’t really hope to turn off TLS 1.0 for many, many years if IE 10 doesn’t even support TLS 1.2 by default today.

Clive Robinson March 29, 2013 6:11 PM

@ kyhm,

I freely admit I’ve spent much less time on application protocols than L2 and L3 stuff, but I can’t recall ever seeing it done.

Ahh the innocence of youth 🙂

If you go back far enough, to when paper-and-pencil ciphers were the only game in town, there were various things tried to break up predictable behaviour that an analyst could latch onto.

One such thing was “padding with nulls”, and another the delightfully named “Russian Coupling”, where the enciphered message is split in two and the last part is put first with the first part last (provided the split position is orthogonal to any other cipher-related lengths, such as the key length, it made breaking the cipher manually that bit harder).

The problem with both methods was how to communicate the “magic number” that told the recipient either how many nulls or where the cut was for the Russian coupling.

The usual arrangement was to put it in as the last step of encryption at an agreed place (like the day of the month plus the day of the week, etc.).
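A toy sketch of the coupling, with a day-based convention standing in for the “magic number” (purely illustrative):

```python
# "Russian coupling": cut the ciphertext at an agreed position and send
# the second part first; the receiver re-derives the cut and undoes it.
import datetime

def cut_position(length):
    today = datetime.date.today()
    return (today.day + today.weekday()) % length   # agreed "magic number" rule

def couple(ciphertext):
    cut = cut_position(len(ciphertext))
    return ciphertext[cut:] + ciphertext[:cut]

def uncouple(coupled):
    cut = cut_position(len(coupled))
    return coupled[-cut:] + coupled[:-cut] if cut else coupled

ct = b"WKHTXLFNEURZQIRA"          # stand-in ciphertext
assert uncouple(couple(ct)) == ct
```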

If you are interested in paper and pencil or hand ciphers you might find this of interest,

http://www.umich.edu/~umich/fm-34-40-2/

neill April 1, 2013 2:44 PM

can we just insert an additional meaningless, random-length random string into either the headers or the cookie?

or maybe just delete all cookies, e.g. daily/hourly, since the attack takes quite a while?

kyhm April 2, 2013 1:50 AM

@Clive

Ahh the innocence of youth 🙂

I guess I can claim the latter, somewhat, but the former…

Maybe I didn’t really convey what I was thinking.

On the one hand, we have attacks which exploit knowledge of where fixed parts of the plaintext are in the ciphertext, or knowledge that multiple messages have the same bits of cleartext in the same places. While this attack doesn’t sound too terrible, I like to think that the ARP-replay attack was what finally killed WEP.

On the other hand, padding a message to remove structure clearly goes back a ways, however it’s done, and like many things, computers can do it quickly and accurately to a much stronger degree than we humans can. While there’s some additional computation or bandwidth cost, it seems pretty cheap, and backwards compatibility is probably the bigger obstacle.

What I’m curious about is the trade-off: is there simply little benefit to randomizing the location of plaintext, in terms of frustrating attacks that depend on knowledge of its position or repeatability? It seems similar to what most OSes’ memory-management code does now to slow down code injection. Is the cost of shipping extra bits or skipping them not worth this possible benefit, given that programmers still have a strong performance bias? Or has it been done in protocols I’m ignorant of but found not to be worthwhile, even as a defense-in-depth measure?

Paeniteo April 2, 2013 5:45 AM

@neill: “can we just insert an additional meaningless, random-length random string into either the headers or the cookie?”

Yes, we could.
For example, HTTP as a protocol would allow for something like an “X-TLS-Hardening: herebesomegarbage” header in the requests and responses.
The spec says that unknown headers should simply be ignored (that’s where this whole X-… stuff comes from), so that should not cause problems. However, I cannot vouch for all implementations, be they browsers, servers, proxies, filters, …
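A sketch of what that could look like on the client side (the header name is the hypothetical one from above, not any standard, and requests is just for illustration):

```python
# Pad each request with a throwaway header of random, variable-length
# filler; "X-TLS-Hardening" is a made-up name, not a standard header.
import os, base64
import requests

filler = base64.b64encode(os.urandom(os.urandom(1)[0] + 1)).decode()
resp = requests.get("https://example.com/",
                    headers={"X-TLS-Hardening": filler})
```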

neill April 3, 2013 7:22 PM

@Paeniteo:
guess we could make it that the garbage is a random part of the later-to-follow message (or body)
that way you could compress the whole string very efficiently, and would not need to waste time (or entropy) computing random bytes

Clive Robinson April 3, 2013 7:49 PM

@ Neill,

Err a minor “nit pick” on your idea,

guess we could make it that the garbage is a random part of the later-to-follow message (or body)

Contains that little weasel word “random” which negates the

… and would not need to waste time (or entropy) computing random bytes

Whilst I understand your idea, you still need a little “real entropy” to select your “random part” from the message to follow.

It would not just require “real entropy” to select; the bit you select would also have to be genuinely unpredictable, and that is actually quite a hard problem when the message to follow is likely to be fairly predictable in many cases.

neill April 5, 2013 5:38 AM

@clive:
appreciate your input

my idea was to prolong the already long time for this attack by “just a bit” –

long time x 256 = very long time

which would require only 8 or 9 bits of added “randomness”

but I guess you could hide it e.g. as “unprintable chars” – a comment, tags, etc. – or print a character and then a backspace (do they still exist?)

Chris April 15, 2013 9:15 AM

2^24? OMG! As soon as I’ve logged in to facebook for the 16 millionth time, they might decrypt a few bytes of my cookies. SHRIEK!! We need to change all the algorithms NOW!!!!!!

Oh – wait up – I’ll have to log in to facebook 459 times every hour for the entirety of my life for that to happen…

Seth April 15, 2013 10:04 PM

In the 1970’s, I recommended compressing before encrypting. That ought to alleviate this problem as well.

Bear May 28, 2015 4:47 PM

The problem with compression before encryption is that your opponent can then tell what degree of compressible redundancy is in the stuff you’re transmitting.

If he knows, for example, that some higher-level protocol is breaking what you’re transmitting into 32K packets, then by the actual packet lengths he can tell whether you’re transmitting text, JPGs, MP3s, or HTML files.

And it gets worse if your opponent can control some part of what is being encrypted. For example, if your attacker is providing advertising that your server is embedding in a web page, then he can tell when something in your cookies matches something in his advertising by how much the page compresses. Add a comment string in the code that he can vary at will, and he’ll be reading people’s cookies within one day if they keep going to pages that contain his ad.
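That last scenario is easy to demonstrate in miniature; this is the idea behind the later CRIME/BREACH attacks. A toy sketch, with a made-up secret:

```python
# When attacker-supplied text is compressed together with a secret, the
# output tends to shrink whenever the guess matches part of the secret,
# so the length alone leaks information.
import zlib

secret = b"Cookie: session=7f3a9"              # hypothetical page secret

def page_length(attacker_text):
    return len(zlib.compress(secret + attacker_text))

known = b"Cookie: session="
for c in "0123456789abcdef":
    print(c, page_length(known + c.encode()))  # a matching digit tends to compress smallest
```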
