Schneier on Security
A blog covering security and security technology.
July 16, 2010
Skype's Cryptography Reverse-Engineered
Someone claims to have reverse-engineered Skype's proprietary encryption protocols, and has published pieces of it.
If the crypto is good, this is less of a big deal than you might think. Good cryptography is designed to be made public; it's only for business reasons that it remains secret.
Posted on July 16, 2010 at 12:08 PM
• 25 Comments
It's a modified version of RC4, right?
Then it's likely to suffer the same fundamental flaws, including the weak key schedule. So, as long as they learned their lessons from the WEP cracking and didn't simply concatenate the key and nonce together, it might be okay.
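For reference, a minimal sketch of textbook RC4 and the WEP-style key construction the comment above warns about. This is plain RC4, not Skype's modified variant, whose details are not given in this thread; `wep_style_key` is a hypothetical helper name used only for illustration.

```python
def rc4_keystream(key: bytes, n: int) -> bytes:
    """Return n bytes of textbook RC4 keystream for the given key."""
    if not key:
        raise ValueError("key must be non-empty")
    # Key-scheduling algorithm (KSA): shuffle S under control of the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA): emit one byte per step.
    out = bytearray()
    i = j = 0
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def rc4(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    keystream = rc4_keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))


def wep_style_key(iv: bytes, secret: bytes) -> bytes:
    # The construction to avoid: the public per-packet IV is simply
    # prepended to the long-term secret to form the RC4 key.
    return iv + secret
```

With a key built this way, the first bytes of every RC4 key are sent in the clear alongside the packet, which is what the known attacks on WEP exploited through the weak key schedule.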
Encryption that is hidden is rarely good.
Given the quality of their desktop app, I wouldn't like to rely on them following best practices.
As is normal for me, it's not the actual crypto that worries me; it's the nitty-gritty implementation details.
After all, it is not unknown for initial "negotiation" or handshake protocols to drop transparently to the lowest common denominator, which in some past systems has been plain text...
The devil is always in the details, and rarely do you get to see those details, as it requires "full Open Source disclosure" to achieve...
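To make the downgrade worry concrete, here is a minimal sketch of a negotiation step that refuses to fall below a configured floor instead of silently dropping to the lowest common denominator. The suite names and the `negotiate` function are hypothetical; nothing here describes Skype's actual handshake.

```python
# Hypothetical suite names, ranked from weakest (0) to strongest.
STRENGTH = {"plaintext": 0, "rc4-obfuscation": 1, "aes-256": 2}


def negotiate(ours, theirs, floor="aes-256"):
    """Pick the strongest mutually supported suite at or above the floor."""
    common = [s for s in ours if s in theirs and s in STRENGTH]
    acceptable = [s for s in common if STRENGTH[s] >= STRENGTH[floor]]
    if not acceptable:
        # Refuse loudly rather than transparently dropping to something weaker.
        raise ConnectionError("no mutually supported suite at or above " + floor)
    return max(acceptable, key=STRENGTH.__getitem__)
```

Setting `floor="plaintext"` reproduces the behaviour described above: a peer (or someone tampering with the offer) that lists only `plaintext` gets exactly that, with no visible error.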
is it any good? what can be said about it?
whether or not it's a big deal depends on what you wish to do with it. oss client, perhaps.
The Skype algorithm was reverse engineered by some guys at MSR about 2 years ago. They reported the fact to Skype at the time and also presented their findings at an FTE-only event. I don't know whether the algorithm has been improved since then. It was definitely "security by obscurity" rather than a real crypto solution, albeit an extremely elaborate one.
Maybe this will help answer the question of whether Skype has built in any backdoors, to help out people like the NSA and GCHQ who want to intercept conversations.
Of course, if the NSA wants to listen to your calls, they can probably just slip a rootkit into your machine that captures the audio and forwards it to them covertly, regardless of what sort of encryption is used in transit.
I'm still waiting to see if someone will analyze the key schedule for the RC4 cipher. I thought I heard that it was customized, so hopefully they improved it.
But improving cryptography is hard, so it will be interesting to see if they did it correctly.
Elaborate were all the means they used to hide the true algorithm.
If the presentation by Philippe Biondi and Fabrice Desclaux from BlackHat 2006 (Silver Needle in the Skype) is accurate, then this isn't the encryption used in the Skype protocol, but merely the seed expansion function to provide a key stream used for obfuscation (slide 42). They used Skype itself as an oracle, and were able to read the de-obfuscated packets, but I think that encryption (based on AES if Tom Berson's report is to be believed) is used *inside* the obfuscated stream (see slide 50's mention of almost everything being ciphered and the Enc block's position in the diagram).
It will be interesting to see their talk at CCC in December.
The Skype *protocol* was reverse engineered some time ago and is discussed in many places, including two reasonably detailed articles on Wikipedia. This publication is about additional detail in a particular layer, called the obfuscation layer.
This publication does *not* affect voice and video data, which are encrypted under AES-256 using a unique session key set up by asymmetric cryptography. This part of the protocol has been analysed and, to date, seems sound. What is affected here is the "obfuscation layer". Other than voice and video data, *all* other data exchanged between Skype servers and peers is obfuscated by RC4. It is obfuscation rather than encryption because the key is derived from externally observable data, but the derivation uses an unknown algorithm. What seems to have been published here -- and it remains to be verified -- is the key derivation algorithm for the obfuscation layer.
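A minimal sketch of why this counts as obfuscation rather than encryption. The `derive_key` function below is a stand-in (a SHA-256 hash over made-up packet fields), not the published Skype derivation, and the field names are assumptions; the point is only that every input to the key is visible on the wire.

```python
import hashlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4; the same call obfuscates and de-obfuscates."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def derive_key(src_ip: str, dst_ip: str, packet_id: int) -> bytes:
    # Stand-in for the secret derivation algorithm. All inputs are
    # observable by anyone who can see the packet.
    material = f"{src_ip}|{dst_ip}|{packet_id}".encode()
    return hashlib.sha256(material).digest()


def obfuscate(src_ip: str, dst_ip: str, packet_id: int, payload: bytes) -> bytes:
    return rc4(derive_key(src_ip, dst_ip, packet_id), payload)
```

Once the derivation is known, an eavesdropper can call `obfuscate` on a captured packet with the same observable fields and recover the payload; no secret key is involved at any point.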
What this means is that it may now be easier for an attacker to interfere with call control. I would speculate that the worst case might be an attack which simultaneously causes many calls to disconnect. Possibly -- and this is rampant speculation here; I don't know anything like enough about the protocol to know if this is possible -- an attacker might be able to re-route call set-ups to pass through his supernode, so he can tell when and how often a pair of clients connect. A sort of Skype pen register, if you will. However, he won't be able to intercept the calls themselves, and wouldn't be able to listen in even if he could.
good explanation, however the implied call security, due to AES encryption, assumes that the attacker would even bother trying to brute force AES. I've always assumed a side channel attack or an attack on the key exchange protocol.
I've always assumed that a simple trick like intentionally jittering the packet launch and encoding the voice information in the packet launch jitter would work well to defeat Skype encryption. I assume this launch jitter can be achieved with a simple worm.
The trick would be to maintain the jitter throughout the routing network.
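A toy model of the idea, to show why surviving the routing network is the hard part. The gap values, function names, and the uniform jitter model are all assumptions made for illustration, and the sketch works on lists of numbers rather than real packets.

```python
import random

SHORT_GAP = 0.020  # seconds between packets for a 0 bit (hypothetical value)
LONG_GAP = 0.030   # seconds between packets for a 1 bit (hypothetical value)


def encode_gaps(bits):
    """Sender side: map each covert bit to an inter-packet gap."""
    return [LONG_GAP if bit else SHORT_GAP for bit in bits]


def add_network_jitter(gaps, max_jitter, seed=0):
    """Model the network perturbing each gap by up to +/- max_jitter seconds."""
    rng = random.Random(seed)
    return [gap + rng.uniform(-max_jitter, max_jitter) for gap in gaps]


def decode_gaps(gaps):
    """Observer side: threshold each gap halfway between the nominal values."""
    threshold = (SHORT_GAP + LONG_GAP) / 2
    return [1 if gap > threshold else 0 for gap in gaps]
```

In this model the payload encryption is irrelevant, since the bits ride on timing alone. Decoding stays reliable only while the network jitter is smaller than half the difference between the two gaps (5 ms here); beyond that, bits start to flip.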
Anyone can join the Skype p2p network. But everyone in a p2p network has to abide by the protocols used. If there is a rogue client, it can cause all kinds of mischief that is hard to protect against. Spam will be the most visible one. You could make the network more resistant by checking all the other nodes you connect to, but this is not easy.
It is much simpler to obfuscate the protocol, so third-party apps cannot join the p2p network, and only well-behaved apps can access it via the API.
Again, protecting a p2p network against deliberate abuse has very little to do with encryption, and much more with network protocols.
"[Against AES] I've always assumed a side channel attack or an attack on the key exchange protocol"
When it comes to AES and side channels on modern CPUs, it is effectively broken.
There are several ways by which it can be attacked and the solution is the old TEMPEST "clock the inputs and clock the outputs".
You can read a bit more "general purpose" reading on it in the IEE "crypto corner" article by Nate Taylor at,
Skype's using the Win CryptoApi, no big deal here, already known. Move along.
Thanks for linking the article, this is exactly what I'm talking about.
The advantage of side channel attacks is that they can also defeat all the firewalls and other impediments that the system / user might create, to protect the information stream. As far as the system is concerned there is only one process communicating with the external world, of course, unbeknown to the user, their desired comms channel contains a buried covert channel.
I was actually suggesting a variant of a timing attack, which you allude to by reference to the old Tempest axiom "clock the inputs and clock the outputs".
However what is missing from this axiom is that the clock must be guaranteed stable at 1/10 information density of the application to be protected. For properly protecting voice this implies a very stable source, with vibration and temp protection = very expensive (test instrument quality clock)
This is something that is ALWAYS overlooked by the young guys, especially wrt voice encryption. I've seen someone implement clocked I/O and then allow FIFO fails on the I/O buffer section. He even accused me of "cheating" when I constructed a parallel process to force FIFO fails. Naturally the covert channel was encoded in the FIFO fails. The system met all Tempest requirements but still leaked information.
In the end analysis, the problem of securing voice comms, is the incredibly low information density of real Voice communication. So any time that the communications channel bandwidth / noise ratio substantially exceeds the Shannon link criteria (wrt the real voice information density) you must assume the possibility of a covert sub channel.
Typically voice communications only have between 10bps and 100bps actual information. So all good secure comms systems will try to encode voice with a codec requiring less than 1000bps. Skype does exactly the opposite preferring to get greater voice fidelity by using a very high bit rate encoder. Burying an undetectable 100bps covert channel in a 1000bps (Shannon limited) comms channel is extremely difficult. However burying 100bps in a 64Kbps stream, flowing over a 10Mbps comms channel is absolutely trivial.
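The arithmetic behind that comparison, as a short sketch. It assumes "64Kbps" means 64,000 bits per second and "10Mbps" means 10,000,000 bits per second; the function name is made up for illustration.

```python
COVERT_BPS = 100  # covert channel rate from the comment above

# Carrier rates in bits per second (decimal prefixes assumed).
CARRIERS = {
    "1000 bps low-rate codec": 1_000,
    "64 Kbps voice stream": 64_000,
    "10 Mbps comms channel": 10_000_000,
}


def covert_fraction(covert_bps: float, carrier_bps: float) -> float:
    """Fraction of the carrier's capacity the covert channel would occupy."""
    return covert_bps / carrier_bps


for name, bps in CARRIERS.items():
    print(f"{name}: {covert_fraction(COVERT_BPS, bps):.3%} of capacity")
```

The covert channel would need 10% of the low-rate codec's bits, but only about 0.16% of the 64 Kbps stream and 0.001% of the 10 Mbps link, which is why it is so much easier to hide in the latter two.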
Of course, I can always base my voice security decision on the wondrous mathematics of modern cryptography, and assume that the NSA (GCHQ) know less about covert comms channels than I do......
"However what is missing from this axiom is that the clock must be guaranteed stable at 1/10 information density of the application to be properly protected."
True(ish) for a side channel that leaks data accidentally (but I would put the divide rate higher, and as we are talking voice and possibly just phoneme envelopes at 3 or 4 a second, we are looking at milli-Hz stability).
With a covert channel it depends on the information density that the attacker is trying to leak, and that could be just a master key at one or two bits a minute (the majority of voice calls don't need to be monitored in real time, more than 99% of them not even within an hour...). Usually the routing information is more urgent than the content.
All of which means the old axiom of clock the inputs and clock the outputs only works so far.
Another two axioms that help are "fail hard" and "fail long". That is, on any timing error assume it's a covert channel and close both the communications signal and control channels. Don't ever try to correct any timing error, as this, as you noted, gives rise to an external attack.
Importantly, don't try to re-establish comms for a long period, which needs to be at least semi-random in duration. This back-off duration should also be known to both ends, to remove another covert channel issue.
Now I know that Skype is not going to do either of those, simply because it negatively impacts the user's view of the quality of the service.
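One way to read "fail hard" and "fail long" in code, as a rough sketch. The class, the parameter values, and the use of HMAC-SHA256 over a failure counter to get a back-off that both ends can compute are all assumptions made for illustration, not a description of any deployed system.

```python
import hashlib
import hmac


def backoff_seconds(shared_key: bytes, failure_count: int,
                    minimum: int = 600, spread: int = 3000) -> int:
    """Semi-random back-off that both ends can compute from shared state."""
    tag = hmac.new(shared_key, failure_count.to_bytes(8, "big"),
                   hashlib.sha256).digest()
    return minimum + int.from_bytes(tag[:4], "big") % spread


class ClockedLink:
    """Accept packets only on the expected clock; never resynchronise."""

    def __init__(self, shared_key: bytes, period: float, tolerance: float):
        self.shared_key = shared_key
        self.period = period
        self.tolerance = tolerance
        self.failures = 0
        self.is_open = True
        self.retry_not_before = 0.0
        self.last_arrival = None

    def on_packet(self, arrival_time: float) -> bool:
        if not self.is_open:
            return False
        if self.last_arrival is not None:
            error = abs((arrival_time - self.last_arrival) - self.period)
            if error > self.tolerance:
                # Fail hard: treat the timing error as a covert channel and
                # close, instead of trying to correct for it.
                self.is_open = False
                self.failures += 1
                # Fail long: stay closed for a duration known to both ends.
                self.retry_not_before = arrival_time + backoff_seconds(
                    self.shared_key, self.failures)
                return False
        self.last_arrival = arrival_time
        return True
```

Because the back-off comes from a key and a counter that both ends already hold, neither side's choice of waiting time can itself carry information.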
Another EmSec axiom people really, really should remember is "data should be constant in all channels".
For instance, if you use compression it gives away the envelope of voice comms, which some older musicians may remember is all a vocoder uses to make a violin not just weep but talk as well ;)
Therefore you have to rate stuff back to a higher data rate, and again you have to be very careful how you do that, simply because it's fraught with side channels, not just in the comms signal channel but in the control channel as well (no point taking the envelope out of the signal channel if your "data stuffing" "signalling method" reproduces it in the comms control channel).
Which for side channels implies as you say,
"For properly protecting voice this implies a very stable source, with vibration and temp protection = very expensive (test instrument quality clock)"
Which is effectively impractical to do even with modern mobile phone Xtals.
However it's worse: it cannot be done for covert channels because, as I said, they might leak data at millibit rates.
The solution to this is to deliberately introduce "jitter" into the system: effectively you spread-spectrum modulate the clocking-in and clocking-out clock at a rate just below the bottom of the audio band. The spreading code, however, needs to be highly nonlinear and known to both ends of the channel. This then effectively encrypts any low-bandwidth timing-based covert channel to an external entity. It also has other advantages, as the CCITT have shown for years with "channel whitening" using short LFSRs to spread the energy across the channel bandwidth.
Oh and this clocking in clocking out and spreader needs to be fully independent of anything else...
It also has another use in that it shows if either party is moving and if somebody is trying to time remodulate the signal post output clocking etc ...
Oh and of course there are other wrinkles that need to be ironed out as I often say "the devil is in the details" >;}
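A rough sketch of the shared-jitter idea from this comment, under loud assumptions: HMAC-SHA256 over a packet counter stands in for the "highly nonlinear" spreading code, the function names are invented, and the model covers only the send schedule, not a real clock or the "just below the audio band" modulation rate.

```python
import hashlib
import hmac


def jitter_offset(shared_key: bytes, packet_index: int, max_jitter: float) -> float:
    """Keyed offset in [0, max_jitter) that both ends can compute."""
    tag = hmac.new(shared_key, packet_index.to_bytes(8, "big"),
                   hashlib.sha256).digest()
    return int.from_bytes(tag[:4], "big") / 2**32 * max_jitter


def send_times(shared_key: bytes, period: float, count: int, max_jitter: float):
    """Sender side: nominal clock plus the shared pseudo-random jitter."""
    return [i * period + jitter_offset(shared_key, i, max_jitter)
            for i in range(count)]


def residual_errors(shared_key: bytes, period: float, observed, max_jitter: float):
    """Receiver side: subtract the expected schedule from the observed times."""
    return [t - (i * period + jitter_offset(shared_key, i, max_jitter))
            for i, t in enumerate(observed)]
```

A receiver holding the key sees residuals of zero on a clean link, so any leftover timing error stands out, for example from someone re-modulating the stream after the output clock. An observer without the key sees only the jittered times.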
> whether or not it's a big deal depends on
> what you wish to do with it. oss client, perhaps.
One that actually works, perhaps, in the sense of "playing nice" on a desktop where the user wants to be able to play sound from other applications too? Maybe even one that works with esd *and* Pulse? Maybe even one that respects the system-wide color settings?
I know, I know, that's crazy talk.
Schneier.com is a personal website. Opinions expressed are not necessarily those of Co3 Systems, Inc.