Schneier on Security
A blog covering security and security technology.
September 13, 2005
Snooping on Text by Listening to the Keyboard
Fascinating research out of Berkeley. Ed Felten has a good summary:
Li Zhuang, Feng Zhou, and Doug Tygar have an interesting new paper showing that if you have an audio recording of somebody typing on an ordinary computer keyboard for fifteen minutes or so, you can figure out everything they typed. The idea is that different keys tend to make slightly different sounds, and although you don't know in advance which keys make which sounds, you can use machine learning to figure that out, assuming that the person is mostly typing English text. (Presumably it would work for other languages too.)
Read the rest.
The paper is on the Web. Here's the abstract:
We examine the problem of keyboard acoustic emanations. We present a novel attack taking as input a 10-minute sound recording of a user typing English text using a keyboard, and then recovering up to 96% of typed characters. There is no need for a labeled training recording. Moreover the recognizer bootstrapped this way can even recognize random text such as passwords: In our experiments, 90% of 5-character random passwords using only letters can be generated in fewer than 20 attempts by an adversary; 80% of 10-character passwords can be generated in fewer than 75 attempts. Our attack uses the statistical constraints of the underlying content, English language, to reconstruct text from sound recordings without any labeled training data. The attack uses a combination of standard machine learning and speech recognition techniques, including cepstrum features, Hidden Markov Models, linear classification, and feedback-based incremental learning.
Posted on September 13, 2005 at 8:13 AM
• 72 Comments
Great, another good reason to use PasswordSafe instead of typing in the passwords yourself.
I remember hearing something similar back in the "old days", where you could "wiretap" an IBM Selectric typewriter by putting a monitor on the power line.
Yes, the Soviets did that to U.S. typewriters in the Moscow Embassy. I'm sure we did it back to them.
It's just another Tempest attack ;)
Same Bandwidth / Energy limitation.
In the first Quantum Cryptography device the polarizing cell made so much noise that you could tell what the polarization was.
Also, shortly after the Second World War it was reported that the US government broke the UK automated One Time Pad system (Telekrypton/Rockex) used from New York to the UK during the war.
There are differing reports. The first indicated it was done by looking at slight timing differences caused by the output relay, visible with an oscilloscope across the line. The second indicated that they were in an adjacent room (in the Rockefeller Center) and used an early equivalent of a spike mike to listen to the unit. Either way, the Rockex was replaced with the Rockex II in 1944.
Shortly after that the Rockex II was replaced with a modified version of the Canadian RM26 for F&CO communications.
Wow, this is scary. I know it is just another tempest attack, but, I wonder what the countermeasure would be. This would break even people who type in a different keyboard layout since it doesn't care what layout you use.
Want to get a new keyboard every 10 minutes, anyone?
Tim, just change keyboard layouts in software every ten minutes. The increased security is worth the intense confusion you'll experience :-)
Of course, "they" will figure it out eventually, but you might make it an hour instead of 10 minutes. Heh.
The only simple way I can think of to defend against this is to use a ZX-81 style membrane keyboard, or perhaps one of those projected ones that have been developed for PDAs.
If each key makes a characteristic sound, then the map from text to keysounds is simply a substitution cipher. We've been able to break those for a very long time.
So why is so much machinery needed?
I suspect that the keysounds aren't quite characteristic. There's a range of sounds that a particular key might make and the sound ranges for keys overlap.
So this is not exactly parallel to TEMPEST.
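To make the substitution-cipher comparison concrete, here is a minimal Python sketch of the classic first step: rank the observed symbols by how often they occur and line them up against an approximate English letter-frequency ordering. It assumes each keypress has already been reduced to a clean, unambiguous symbol, which is exactly the assumption the comment above doubts. The names `guess_mapping` and `decode` are made up for illustration, and nothing here is taken from the paper.

```python
from collections import Counter

# English letters from most to least frequent (a commonly quoted approximate ordering).
ENGLISH_BY_FREQUENCY = "etaoinshrdlcumwfgypbvkjxqz"


def guess_mapping(symbols):
    """Guess a symbol -> letter mapping by matching frequency ranks."""
    counts = Counter(symbols)
    # Sort by descending count; break ties by symbol value so the result is deterministic.
    ranked = sorted(counts, key=lambda s: (-counts[s], s))
    # Only the 26 most frequent symbols receive a letter.
    return {sym: ENGLISH_BY_FREQUENCY[i] for i, sym in enumerate(ranked[:26])}


def decode(symbols, mapping):
    """Apply the guessed mapping; unknown symbols become '?'."""
    return "".join(mapping.get(s, "?") for s in symbols)


if __name__ == "__main__":
    observed = [3, 3, 3, 1, 1, 2]  # toy "key sound" labels
    mapping = guess_mapping(observed)
    print(decode(observed, mapping))
```

A pure rank match like this is only a starting guess even for a clean cipher; practical solvers refine it with digraph and word statistics. When the sounds of different keys overlap, the symbols themselves are uncertain, which is one plausible reason more machinery is needed.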
"So why is so much machinery needed?"
Read the paper; it goes into all the technical details. It's really impressive research, I think.
"Wow, this is scary. I know it is just another tempest attack, but, I wonder what the countermeasure would be."
In many different areas, surveillance is more and more a "my technology is better than yours, so I win" game. If I can hide a microphone or a camera in your private space, I win. If you can detect my equipment, you win.
You could seed the data with non-English input to render frequency analysis more difficult. Perhaps have a window pop up every minute or so prompting the user to copy a string containing a high number of Qs and Zs.
Don't worry about hardware keystroke loggers plugged into your keyboard any more, just watch out for microphones? Imagine an unpatched copy of Skype or some other VOIP software running on your machine and an appropriate worm could lead to someone listening in to your keyboard via your own sound card and capturing the traffic over the web... worms of the future!
I imagine any suitable acoustic white noise or background noise that would baffle the recording mechanism would be a suitable countermeasure. I guess if you're really worried, play some Spinal Tap at 11.
I guess we should start using voice recognition instead of keyboard input. Oh, nevermind...
If my keyboard emitted ten to twenty keystroke sounds for every keystroke I actually made, the mapping would be far more difficult, especially if I touchtype: every keystroke sound, real or bogus, would be overlapped by a dozen or so other keystroke sounds. Add to this ten or twenty sounds of each of the shift keys per real stroke, and the mapping would be very difficult.
The moment I touch the keys, even without pressing for positive contact, the bogus strokes should start. When I take my fingers off the keys, the bogus strokes should stop.
This is a really interesting development, which doesn't seem to have any sensible countermeasures (apart from locating the equipment, of course). Sadly, a compromised PC (as described above) may remain undetectable, unless the network traffic it created could be monitored.
Well, it shows yet again why authentication keyfob/credit cards like RSA's SecurID are a good idea - passwords alone don't cut it.
It also means that absolutely paranoid data security types will not only remove camera cellphones from visitors, but also any sound recording equipment. Perhaps we will all end up literally "going naked into conference chambers".
Given the use of 'chip and pin' keypads to authorise credit card transactions, does this now mean we have to look for microphones before punching in our PINs? Doesn't this make PINs less secure than good old handwritten signatures?
Or you could use one of the Tempest shielding techniques. Put your keyboard in close proximity to a whole pile of other keyboards... Now I finally have a use for that room full of an infinite number of monkeys with keyboards!
I wonder how well this works with random background noise being played over the sound of the typing. I've heard of office equipment that plays random babble back so that you can have a private conversation in a large office (it randomises speech that it picks up and plays it back, so it becomes meaningless speech like sounds). The same technique could possibly hide the sound of typing.
No other way to ask this - can you provide any links or references re the US/UK Telekrypton/Rockex item. It sounds quite interesting.
- smith at canada com
Now, I'll go out and get 1000000 monkeys to create typing noise, and maybe someday they'll also present me that script for Hamlet they wrote.
This reminds me of work done by Adi Shamir & Eran Tromer that analyzes the sounds made by a CPU to find the RSA private key. (http://www.wisdom.weizmann.ac.il/~tromer/acoustic/)
Obvious countermeasure: a white noise generator as loud or louder than the keyboard.
Less obvious, but less intrusive: the same, but just surrounding the keyboard via directed (interference-generated) sound.
Ah, at last a good excuse to crank the amp round to eleven ;)
If y'all think this is bad, just wait until they get floating spy nanobots. In 15 years privacy will no longer exist. Hell, they could make nanobots that float into your mouth, work their way into your brain, and read your thoughts directly.
be careful you don't improve the signal to noise ratio of the keyboard sounds by stochastic resonance.
Interestingly, I imagine it would change based on choice of editor, too.
I'd be hitting escape and probably i and c a disproportionate amount of the time. I expect someone using emacs would hit ctrl an awful lot...
Any assumptions that expected frequency of a symbol in the text correlates to the expected frequency in the keystrokes would likely be faulty.
Just put an analog of the keyboard on the screen and click on each letter with the mouse. It'll slow you down something awful, but every click will sound exactly the same ... ;-)
Programmers using any editor would type a different frequency of punctuation depending on the language they're using. If you know which language it is, that may not be as much of an issue, because you can just adjust the algorithm.
Oh, BTW, this attack wouldn't work on me, since I use a TouchStream keyboard. Hehehe. It's two huge touch-pads with keyboard markings. (You use it as a mouse too, and different combinations of finger movements can be associated with hotkeys.) The company that makes them is out of business, though.
It seems the counter measure is simple: make sure the variance in sound on any individual key is greater than the variance between keys. It doesn't really matter whether this is done by introducing more noise in an individual keypress, reducing differences between keys, or both. This is, of course, in addition to more general counter measures like site security.
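The variance argument above can be sketched numerically. The Python below assumes, purely for illustration, that each keypress has been reduced to a single acoustic feature value; real features would be vectors such as the cepstrum features mentioned in the abstract. The function name `separability` and the sample numbers are hypothetical.

```python
from statistics import mean, pvariance


def separability(samples_by_key):
    """Ratio of between-key variance to average within-key variance.

    A ratio well above 1 suggests keys are easy to tell apart;
    the proposed countermeasure aims to push it below 1.
    Assumes at least one key has non-zero spread (otherwise this divides by zero).
    """
    key_means = [mean(values) for values in samples_by_key.values()]
    between = pvariance(key_means)
    within = mean(pvariance(values) for values in samples_by_key.values())
    return between / within


if __name__ == "__main__":
    quiet_keyboard = {"a": [1.0, 3.0], "b": [5.0, 7.0]}    # keys clearly distinct
    noisy_keyboard = {"a": [-2.0, 6.0], "b": [2.0, 10.0]}  # same means, more spread
    print(separability(quiet_keyboard), separability(noisy_keyboard))
```

In this toy data both keyboards have the same per-key averages; only the per-keypress spread differs, which is the quantity the comment proposes to inflate.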
It probably is not at this level of sophistication, but even TouchStream could be vulnerable because theoretically, I don't think we really hit the same with every finger.
I've always reviled the old DEC keyboards that had an electronic beep for each keypress.
But I wonder whether having a constant simultaneous sound would interfere with their key recognition?
Or would they be able to just filter it out?
For very secure environments, this is not a big deal since you already must have taken care of the "sound problem." After all, people will be talking about the secrets you are trying to protect in secure spaces. You need to protect that first. So when is this a big deal? In an open office environment where users might wish to steal each other's passwords? Public kiosks? In either case, lower-tech attacks such as simple shoulder surfing are probably more of a threat.
You could bug the home office of a system administrator, and if the admin were ever to log in remotely to his office system with a high-permission account, you would then have the password to that account.
About a year ago, I wrote about a similar technique for listening to keystrokes.
But even a low-tech version can be entertaining: try guessing a co-worker's activity just by listening to them type. Even without sophisticated equipment, you can usually determine someone's activity by listening to the sound and rhythm of their keystrokes (IM has lots of short sentences punctuated by ENTER, urls usually begin with 'www', etc.).
Fortunately I have a TouchStream. A pity FingerWorks is no longer selling them.
"Given the use of 'chip and pin' keypads to authorise credit card transactions, does this now mean we have to look for microphones before punching in our PINs? Doesn't this make PINs less secure than good old handwritten signatures?"
That's a hell of a PIN that takes you 10 minutes to key in.
Good point. However, I scan-read the pre-print and didn't notice it being typist dependent. The problem space is smaller with the keypad - far fewer keys so the audio-processing attack may still be viable even for different users. Perhaps you need only 10 minutes of the sound of key depressions, and not 10 minutes of *one person's* key depressions.
On the other hand, using an infra-red camera to image the keypad after use would also work, if someone were that desperate. I believe that is a technique that has been used successfully on door-entry systems. However, I've digressed off-topic. Sorry.
"The problem space is smaller with the keypad - far fewer keys so the audio-processing attack may still be viable even for different users. Perhaps you need only 10 minutes of the sound of key depressions, and not 10 minutes of *one persons* key depressions"
The technique is based on the assumption that the text being typed is in the English language. Having not read the article, I assume that the attack is based on knowing the letter/digraph/trigraph/etc distributions of English.
Presumably, PINs are sufficiently random that it is impossible to determine which keys are which numbers simply by listening to a random sample of people entering their PINs, without knowing their numbers in the first place.
One thing to note is that you would need to train the machine with labeled input rather than unlabeled input, because PINs should not exhibit the same non-uniform character frequencies as English text. PINs should be random... if not, then we already have a problem.
I wonder whether chorded keyboards would be a defense against this? You have fewer distinct sounds to recognize, but you also have to recognize the combinations in which the keys are pressed.
Thinking of it, I don't really understand why this is so interesting - wireless keyboards are abundant and it gives far better results to sniff that traffic and crack the encryption. So for the average identity theft and fraud, why go through all the trouble to get noisy sound recordings and analyse the clicks?
AFAIK these keyboards use some 128 channels, enough that there should be one free, to send the signals. While intended for short range communication, as always short range is not very short - I recall a story from Norway where the construction allowed a signal to travel 300m and interfere with another's identical keyboard.
Since most PIN keypads are in publicly accessible places, the attacker could themselves surreptitiously gather the data, explicitly labeling each key.
A sound bug would probably be much more useful, if the accuracy of the decoded sound is good enough, than a visual/infrared bug, as placement requirements would be much looser, since it can be stuck beneath the pad or under counters or ledges or underneath floor-standing ATMs.
In the way we rotate passwords every 30-90 days, perhaps we should rotate keyboards every few minutes. Though typing is still harder to decipher than using NaturallySpeaking!
It seems to me that the way to thwart this attack is to use some kind of letter mapping for typing that produces English text from text with a uniform white distribution.
This may be achievable in practice by using a software program that allows some kind of condensed shorthand for each word, with auto-completion.
"Just put an analog of the keyboard on the screen and click on each letter with the mouse. It'll slow you down something awful, but every click will sound exactly the same ... ;-)"
That wouldn't work. Simply time the difference between the clicks, and if the mouse is noisy, you may be able to get directions from variations in the noise on the pad.
You might be able to get around that by cycling the "keyboard" layout regularly. What a usability nightmare.
> Programmers using any editor would type a
> different frequency of punctuation depending on the
> language they're using. If you know which language
> it is, that may not be as much of an issue, because
> you can just adjust the algorithm.
You wouldn't even need to adjust the algorithm, just use the same algorithm on a slightly different set of data.
You could establish a set of probable keypress frequencies by editor and probable frequencies by language (natural, programming, markup, what have you), and composite them. Run the algorithm on each pair simultaneously, and one of them is likely enough to find you an answer in a reasonable time, which is what it's all about. The more you know about the target, the more you can narrow this search space - if they don't speak Spanish, they're probably not typing in Spanish, primarily.
It actually sounds like a fun puzzle. Too bad really any actual application of it would likely be unethical.
GM: "Just put an analog of the keyboard on the screen and click on each letter with the mouse. It'll slow you down something awful, but every click will sound exactly the same ... ;-)"
me: I'll just bet mice make different types of noise as they roll in different directions. Certainly, if you have a ball and rollers, moving in different directions would use the different rollers to different extents.
An optical mouse should sound the same no matter what direction it's sliding, though.
I love it! Maybe you could post an audio blog based on your keyboard "acoustic emanations". Could be fun to try and decipher. Seems like the "touch-peak" of the enter and backspace might be the first keys to unlock the puzzle, since they would tell you when something was real or just random, but the report says they did not include special keys as part of their attack. Instead they say "a bit of human aid could be useful...assuming this is possible, the classifier can be trained to recognize [special keys] accurately."
Jamie's post indicates it is not only possible, but a natural consequence for people who type in close proximity.
I left spaces out of my pgp passphrase because the spacebar sounds so different, I figured a listener could discover the length of every word and get a big headstart on decoding the entire phrase. Now this comes along. I may have to switch to a passphrase of just one repeated character to foil this one ;)
Roy Owens' suggestion of multiple keyboards in the room is flawed even if all of the keyboards are of the same make and model: the different locations of the keyboards would probably be sufficient to differentiate them acoustically, especially if an array microphone was used. Similarly, spoofed keyclicks emitted from the computer's speakers would be differentiated, the audio characteristics being so different, allowing that data to be filtered. Depending on the quality of the microphone, it might be possible to filter such noise by the nature of its pulsed, digital waveform.
I suggest a system of introducing user entropy in every keypress. You could probably do it with some natty keyboard design, where different velocities of key depression produce wildly varying click pitches. Having a resonant case might further obfuscate the "signal". Cheap; no electronic modification needed. Just my two bobs' worth.
I have referred in the past to a password system that uses photographs.
Basically you enter your user name, then the computer randomly selects nine photos from its database.
You then press the number of the face that is known to you, or click on it with your mouse.
This is repeated until the desired level of confidence is built up (after 5 screens an attacker has about a 1:59000 chance of guessing right).
The advantage of the system is that the right photo in each screen can be in one of nine places, so anybody scanning your keyboard or mouse is going to get a different number each time. Also, if each user has, say, 10 photos that are unique to them, then you have a reasonable choice, so even shoulder surfing is not going to work very well.
I have seen a prototype of the system developed as a graduate project; however, I have not seen a commercial version (which is a shame).
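The 1:59000 figure follows from simple counting, assuming exactly one of the nine photos on each screen is correct and the screens are independent. A short Python check (the function name `guess_odds` is made up for illustration):

```python
def guess_odds(photos_per_screen, screens):
    """Number of equally likely click sequences a blind guesser must choose from."""
    return photos_per_screen ** screens


if __name__ == "__main__":
    # Nine photos per screen, five screens in a row.
    print(guess_odds(9, 5))
```

With nine photos and five screens this gives 59049 possible sequences, so a blind guess succeeds about once in 59,000 tries.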
There is not much on the web about Rockex (there is a reference in UK Cabinet Office briefing papers). I was told about it when digging into Tempest-related stuff several years ago; however, it has also since appeared in print:
The Secret Wireless War
The story of MI6 Communications 1939-1945
Geoffrey Pidgeon (Special Comms Units 1 & 11/12 and DWS from 1942-1947)
Published by UPSO (http://www.upso.co.uk)
British Security Co-ordination-
The Secret History of British Intelligence in the Americas, 1940-1945.
St Ermin's Press, London 1998.
ISBN 0 316 64464 1.
The Rockex was an improvement on the RM26 and was invented by,
Lt. Col. Benjamin deForest Bayly
Originally from Moose Jaw, Saskatchewan he later became,
Professor (of Engineering) "Pat" Bayly at the University of Toronto.
Later still he became the first mayor of the Town of Ajax, he died aged 93 back in 97.
The Rockex was used until the 1980's by the UK Foreign and Commonwealth Office (F&CO) for diplomatic and other high level communications. Apparently it was also used on the Washington-Moscow Hotline (not a phone as shown in the movies but a humble teleprinter link).
Oddly the Rockex appears to still be classified in the UK despite a couple of examples in museums in Canada.
In one the full "secret" manuals are also on display.
I have also been told there is now one on display at the Royal Signals Museum in Dorset (UK).
Essentially the Rockex was a One Time Pad system: you fed in a tape with your traffic, which was encrypted with the OTP tape, which was also automatically cut in half on some models to prevent reuse. As you can see from the photograph it was not a simple machine; like a teleprinter it was extremely noisy (you would not want to sit in the same room for any length of time). Being mainly electromechanical, it had quite a large set of moving contacts that were operated by either cams or relays. It is quite easy to imagine that any one of these would not be perfectly aligned and would therefore slightly modify the line keying characteristics in sympathy with either the traffic or OTP tapes.
Fascinating paper, thanks for the link.
Good pointers. They remind me of the "light" keyboard:
But I'm still stuck on the fact that the authors concede human-ear training is required to deal with any special characters including the backspace key.
the ultimate countermeasure is still 5-10 years away. during the dotcom heyday, i used to do a comic riff on how anybody could go into a venture capital office and come out with a huge check. the company in the riff was called "mindclick" and it had a technology for controlling a keyboard and mouse just by thinking about it or maybe wagging your finger just like ray walston used to do on the old "my favorite martian" tv show. damn, i must be getting old! anyway, this technology is eminently doable, biofeedback can help train people to light up different areas of their brains, and a special hat could pick up these signals and bluetooth them to your box.
Any solution that reduces the feedback I get (sound) from pressing keys also reduces usability....
Does anyone know of GPL'ed software that would allow you to use several mics to separate sound sources based on location? If so it could be combined with this research to allow someone to track separate logs of several people in a single room. It could also be used to defeat any background-noise-producing systems.
The day I have to have a 5 pin DIN connector fitted to the back of my head to use a computer is the day I take up watching weeds grow...:(
Seriously though, BIO-Feedback systems certainly have progressed to the point where people can control wheel chairs. There was also a demonstration by a UK Uni (Reading I think) of a virtual keyboard that you typed on in thin air; unlike the usual light based system, this was based around bio-feedback cuffs worn around the upper arm.
As a guess you probably will not see a 108 key keyboard virtualised in this way in the real world; it is more likely to be something similar to the five button MicroWriter (if people remember those).
How about random keyboard replacement. Have a large store of keyboards and someone to go swap them out every couple of hours.
This has been looked at before. I'm sure techniques have improved over the years but it's not new. Those rocker style keypads make a lot of noise.
One of the tempest keyboards I've worked with used a metal comb below each key that blocked 7 light channels. Each key had different notches on the comb so it blocked different light channels. This was to stop RF emissions but the keys made no real sound when pressed. But what little noise it made at the rubber stop could be fixed by using hydraulics for the key dampening. Simply increase the pressure at the end of the stroke rather than having the key hit a stop.
you won't need to have hardware attached to your cranium, just a little dish on top of your hair/hairpiece which picks up the vibes from within and transmits them to your microprocessor so fast it will make the first time you ever saw broadband seem like a herd of turtles stampeding through chunky peanut butter.
It's cheap to prevent this attack: use a touch-pad, or an optical one.
And maybe there is an easier and more precise way:
Use three mics instead of one and apply triangulation instead of machine learning.
Whilst looking for keyboard sounds that I can edit to hide my keyboard taps, I suddenly realised most people use a mouse most of the time, with short amounts of keyboard use interspersed. If it takes 10 minutes of typing for 96% discovery then it might take several hours if not days to discover the full layout, if they can at all.
Granted there will be users who type more than I do, but most of my typing is really code for software in multiple languages.
"Just put an analog of the keyboard on the screen... every click will sound exactly the same ... ;-)"
Dasher is a reasonably fast text (for me a quarter typing speed) input program that requires barely any clicking at all:
you suggest the idea of using pictures of people who are well known to you as part of a pin.
What about the case of a bitter ex-girlfriend, who will know the identity of all the faces on the system. In this case, how is it possible to change your pin?!
what if my keyboard is jammed up my pooper and im using a garden rake to type?
The countermeasure is to put a loudspeaker into the keyboard, working together with a microphone that listens to the sound of each keystroke and plays back the same sound as an inverted wave to cancel the real sound. This technique is used in airplanes to reduce the noise of the jets, and in some headsets to reduce the noise around the user. It may make it difficult for the bad guys to listen to your keyboard, making it necessary to use a Boomer Microphone.
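The inverted-wave idea can be illustrated with a few lines of Python. This is a toy model under strong assumptions: the "keystroke" is a pure tone, the anti-noise is a perfectly inverted copy, and both arrive at the listener with no attenuation. The function names and the 1 kHz / 44.1 kHz figures are hypothetical choices for the sketch, not measurements.

```python
import math


def tone(n_samples, freq=1000.0, rate=44100.0):
    """A stand-in for a keystroke sound: n_samples of a sine tone."""
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(n_samples)]


def anti_noise(samples):
    """The cancelling signal: the same waveform with inverted sign."""
    return [-s for s in samples]


def residual_peak(samples, cancel):
    """Largest absolute amplitude left after the two signals add together."""
    return max(abs(a + b) for a, b in zip(samples, cancel))


if __name__ == "__main__":
    click = tone(100)
    perfect = anti_noise(click)
    # Same anti-noise, but arriving one sample late.
    late = [0.0] + perfect[:-1]
    print(residual_peak(click, perfect), residual_peak(click, late))
```

Perfect alignment leaves nothing behind, while even a one-sample delay leaves a residue. That sensitivity to timing and position is why cancellation that works at a headset wearer's ear may not work at an arbitrary microphone elsewhere in the room.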
For those worried about an immediate, very cheap countermeasure, I find that the audible volume of my key clicks is greatly reduced by typing slower (to the extent that at 1 key/sec, in a quiet room, no sound is audible at all at 250 mm). This is obviously no good for regular typing but may help at least for password entry. It's necessary to consciously _release_ the key slower as well. You may need to turn down the key auto-repeat rate.
I also find the volume to be reduced -- though not as much -- by picking up the keyboard and holding it in one hand whilst typing. That probably also alters the case resonance, causing learning data accumulated during normal typing to be of less use in identifying passwords. Applying force on the case at various random points doesn't have any obvious effect on volume, but sometimes distinctly alters the sound -- this is something that could be easily altered from time to time to confuse a learning algorithm.
In a third experiment, tying a rubber band around a key made some keys totally silent, but had no effect on others.
These little experiments were done entirely by ear, as I don't have a microphone here; it would be interesting to see these tests, and other similar tests, done more objectively. It would also be interesting to see the effects of sound dampening foams applied inside the case, in contact with the case and chassis.
Nobody's mentioned it -- perhaps for good reason --, but what about graphic tablet-type handwriting recognition? I don't have PC experience with this, not even with handwriting recognition per se, but I've become quite adept with the gestural alphabet technique of the Palm OS Graffiti app -- probably at around half my typing speed. Aside from the radical changes to keyboard hardware design mentioned/proposed here, "graffiti"-type input seems a good alternative to me. It's very silent (or low-noise), so you "only" have to worry about the physical integrity of the equipment and the connection (and visual surveillance too, of course, but that's always the case).
Does anyone remember an article about similar technology for hard drives? Something to the effect that by listening to the sound of the drive, a person could capture the datastream? Have a cite? Or did I just hallucinate this?
"Does anyone remember an article about similar technology for hard drives? Something to the effect that by listening to the sound of the drive, a person could capture the datastream? Have a cite? Or did I just halucinate this?"
I think you were hallucinating. The signal-to-noise ratio seems much too low for this to be possible.
But if there is research, I'd be very interested.
Hey - I am really glad to discover this. great job!