Headphones as Microphones

Surprising no one who has been following this sort of thing, headphones can be used as microphones.

Posted on November 23, 2016 at 6:56 AM • 40 Comments

Comments

Woo • November 23, 2016 8:28 AM

I never liked this "SmartPlug" thing in the Realtek (and many other) chipsets, mostly for the trouble it causes with the attached audio gear, and now I've got one more reason to hate it: security concerns as well.
(Though I didn't find any details on how they get around the fact that the Realtek mixer will show a notification window when port assignments change, or that the "normal" microphone port will stop working because only one port can ever be configured as an input channel, plus other side effects of tinkering with that config.)

Clive Robinson • November 23, 2016 8:52 AM

@ Bruce,

    Surprising no one who has been following this sort of thing, headphones can be used as microphones.

Or anyone who has read this blog; I think I've mentioned the "transducer" issue of motors=generators, speakers=microphones etc. a half dozen times or so.

Also back in the early days of BadBIOS, RobertT and myself discussed a number of issues about AC97 and RealTek devices, as they were more standard on PCs than even CPUs...

They are and probably will remain the number one target of choice for the discerning snooper, as just about "every PC has one", even those with other Pro Audio devices. So if you are hiding your snoopware in the guts of a hard disk controller, faking yourself as a Boot ROM for the BIOS to load every time (something Microsoft still support even after Lenovo got caught red-handed), then those oh-so-standard RealTek devices are the things to subvert...

Oh, the Israeli Uni which keeps turning out these hacks... Well, so far all of the issues they exploit have been discussed years before on this blog. You should feel proud of the free "Research Ideas" you have in effect given them ;-)

hawk • November 23, 2016 9:00 AM

When I was 5 years old, it intrigued me when I spoke into the speaker on the TV, whether or not someone could hear me at the other end. This led to peering into the back of the TV through holes and cracks, to see what all those glowing tubes of glass were.

Bill Brown • November 23, 2016 9:41 AM

Can this be done if the headphones are being used for playing audio continuously?

Tatütata • November 23, 2016 10:17 AM

What's new under the sun?

When I was in grade school about 40 years ago, classrooms were equipped with an absolute novelty: intercoms. The field stations used the loudspeaker both as a sound reproducing device and a microphone, and had a switch for T/R selection and calling, but otherwise didn't contain any electronics. It was common knowledge (or rumored) that the principal listened in on the classes, but I kind of remember that there was a countermeasure.

I doubt that the headphone vulnerability is much of a threat. A quick fix could be a piece of software running in the background which periodically reconfigures the sound chip as an output device, and plays some kind of low-level noise. I don't think you could both record and play back. The disappearance of the noise would serve as a mine canary.

If the sound chip's configuration registers are readable, or the API supplies that information, a visual alarm could also be given.
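As a rough illustration of the "readable configuration" idea, the sketch below scans a text dump of an HD Audio codec's pin state and flags any pin whose default role is an output but whose pin control currently enables input. It assumes a dump in the style Linux exposes under /proc/asound/card0/codec#0; the sample text, the node numbers and the function name `find_retasked_pins` are made up for illustration, and a real monitor would need to be checked against the actual hardware and driver.

```python
import re

# Default-device names that mark a pin as an output in the codec's pin defaults.
OUTPUT_DEVICES = ("HP Out", "Line Out", "Speaker")


def find_retasked_pins(codec_dump):
    """Return node IDs whose default role is output but whose pin control enables input."""
    suspicious = []
    node = None        # node ID of the pin complex currently being read
    is_output = False  # whether its "Pin Default" line names an output device
    for line in codec_dump.splitlines():
        text = line.strip()
        header = re.match(r"Node (0x[0-9a-fA-F]+) \[(.+?)\]", text)
        if header:
            # Only pin complexes carry jack configuration; skip other widgets.
            node = header.group(1) if header.group(2) == "Pin Complex" else None
            is_output = False
            continue
        if node is None:
            continue
        if text.startswith("Pin Default"):
            is_output = any(name in text for name in OUTPUT_DEVICES)
        elif text.startswith("Pin-ctls:"):
            parts = text.split(":", 2)
            flags = parts[2].split() if len(parts) > 2 else []
            if is_output and "IN" in flags:
                suspicious.append(node)
    return suspicious


# Illustrative sample only (assumed format, invented values).
SAMPLE = """\
Node 0x14 [Pin Complex] wcaps 0x40058d: Stereo Amp-Out
  Pin Default 0x90170110: [Fixed] Speaker at Int N/A
  Pin-ctls: 0x40: OUT
Node 0x21 [Pin Complex] wcaps 0x40058d: Stereo Amp-Out
  Pin Default 0x0421101f: [Jack] HP Out at Ext Front
  Pin-ctls: 0x24: IN VREF_80
"""

print(find_retasked_pins(SAMPLE))  # ['0x21']
```

A background task could run such a check periodically and raise the visual alarm when the list is non-empty. Malware with enough privilege could of course also tamper with the monitor.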

Clive Robinson • November 23, 2016 10:36 AM

@ Bill Brown,

    Can this be done if the headphones are being used for playing audio continuously?

That is a compound question which would give a false impression if answered with a simple answer.

So no short simple answer, but a longer one :-S

The ear buds/phones or speakers are in general of the "moving coil" type; not all are, though, so this answer is about moving coil devices. They consist of a magnet in which a coil sits, and the coil is attached to a diaphragm of some kind. This is the same as a DC motor/generator: as the coil moves through the magnetic field a current is generated, and likewise if a current is put through the coil the diaphragm will move. The current in the coil is related to the displacement of the coil. Thus it is the sum of the current put into the coil and the current generated by the movement of the coil. As the current generated by the coil is in the opposite direction to the current that generates the movement, the coil appears to have a resistance greater than the actual coil impedance. This generated current therefore has a voltage associated with it, which is often called "the back electromotive force" or "back EMF".

As I've mentioned before, if you have the moving coil transducer connected to a two-port to one-port hybrid circuit, often called a circulator, you can drive it with music from one port, and the back EMF will be on the other port.

So if the isolation is good enough then yes, you can play music through the ear pieces and pick up other sounds on the other port. The Russians used this technique back in the cold war, when people were under the mistaken idea that having the radio on in the car allowed you to have a private conversation the KGB could not listen in to...

Thus having shown it is possible to do it with the ear buds, the next question is can the AC97 chip set do this hybrid function, and the answer is a probable yes. Whilst the chip may not have a hardware hybrid, the software can do the same thing, if you can configure the audio port to be both input and output, which I have a feeling you might be able to do because of the "crossbar" like nature of the chip internals (and if the actual chip logic allows it). The only way to know for certain is to get down and dirty at the chip command level and experiment.

tomb • November 23, 2016 11:15 AM

The last 3 pairs of headphones I bought advertised an internal microphone as a selling point. Not to say it's the same thing described here. If you're not looking for that feature, odds are you've forgotten you already have it without any clever driver hacks.

Gunter Königsmann • November 23, 2016 11:18 AM

What I wonder is: currently my PC's audio input seems to go up to 21 kHz and then abruptly cuts off. But it is a delta-sigma converter, meaning it consists of a circuit that generates an endless bitstream of perhaps a few megabits per second that tries to mimic the analogue input signal as closely as it can: 010101... means 0.5, 011011011... means 2/3, 01101101011011011011... means a little bit below 2/3... and this circuit is followed by a circuit that averages the bitstream into the sample values the sound card outputs. If this filter were configurable to pass audio signals up to 40 kHz... and if your laptop speakers, like mine, are small enough to be most efficient in the ultrasonic range... wouldn't that sound like a potential side channel? A friend of mine received weather maps using a 33 MHz PC and his sound card back in 1997, so the computing power needed for this should be available by now.
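The bit patterns above can be reproduced with a toy first-order delta-sigma modulator followed by a block average. This is only a sketch of the principle: real converters use higher-order loops and proper decimation filters, and the function names `modulate` and `decimate` are invented here. Exact fractions are used so the patterns come out without floating-point noise.

```python
from fractions import Fraction


def modulate(level, n_bits):
    """First-order delta-sigma: emit a 1 whenever the running sum reaches 1."""
    acc = Fraction(0)
    bits = []
    for _ in range(n_bits):
        acc += level          # integrate the input level (between 0 and 1)
        if acc >= 1:
            bits.append(1)
            acc -= 1          # feed the emitted 1 back into the integrator
        else:
            bits.append(0)
    return bits


def decimate(bits, block):
    """Average each full block of bits into one sample (a crude boxcar filter)."""
    return [Fraction(sum(bits[i:i + block]), block)
            for i in range(0, len(bits) - block + 1, block)]


print("".join(map(str, modulate(Fraction(1, 2), 6))))   # 010101
print("".join(map(str, modulate(Fraction(2, 3), 9))))   # 011011011
print(decimate(modulate(Fraction(2, 3), 12), 6))        # [Fraction(2, 3), Fraction(2, 3)]
```

The shorter the averaging block, the higher the output sample rate and usable bandwidth, at the cost of resolution. That trade-off is what a configurable filter would expose.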

albert • November 23, 2016 12:04 PM

@Clive,
Couldn't you subtract the signal from the back EMF, and process the remainder? To add insult to injury, make the sound card DSP do it.

IIRC, there was Windows™ software that allowed you to re-patch audio connections in the chip graphically, by drag 'n drop. Actually quite cool (and quite complicated). You don't 're-purpose' the output, you simply patch another 'input' line to the speaker/phone output line.

As mentioned in the paper (arxiv.org/abs/1611.07350), a simple buffer amplifier on the headphone side would mitigate the threat. Internal speakers would be harder to deal with, and since it appears that no one uses desktops anymore, xPad users are, as always, effed.

. .. . .. --- ....

They Are In The Walls • November 23, 2016 12:43 PM

Speakers can be used as microphones too. Passive speakers work well as microphones; I'm sure the "they" have figured out how to trick the audio interface into switching the "SPEAKER" output to be a "MIC" input. Probably some type of impedance test.

mesrik • November 23, 2016 2:24 PM

Hello,

Basically any surface* that vibrates as the surrounding air pressure changes, and whose vibrations can be sampled at a high enough frequency to recover sound, is a good candidate for use as a microphone. That includes many parts of a computer, phone or any active equipment. Some good candidates would be the chassis (motion sensor/gyro), possibly the display, the touchpad and, on desktops, a graphics drawing pad. All have hard surfaces, and those definitely receive some vibrations from sounds around where they are located. The question is not whether those surfaces vibrate; the challenge is whether there is some way to sample those vibrations.

*) Remember the room listening device where a laser reflection from a window can be used to listen in on a room, and The Thing, which the KGB used to listen in on the US ambassador's residence in Moscow.


That leaves me wondering, from the article and some of the comments above, why some find it surprising that tiny speakers can be used as mics. Speakers have basically the same mechanical and electrical construction as dynamic mics, and small earpieces are very similar, with a thin rigid film, a wire coil and a tiny magnet.

However, it is a bit surprising that the audio chip used, which is supposed to drive output, can be reversed and used for input too. Genuinely, I don't know what uses the designers of that chip were tasked with, or otherwise prepared it for.

:-) riku

mesrik • November 23, 2016 2:33 PM

Hello,

Sorry to reply to myself right away, but it occurred to me right after pressing send. Could it be that the chip was also designed to read input so that it could be used for noise cancelling?

Or is it some kind of general-purpose DA/AD device?

Otherwise I can't understand why an output device would be able to read input if the device was meant to drive output only.

:-) riku

Clive Robinson • November 23, 2016 3:30 PM

@ Albert,

    Couldn't you subtract the signal from the back EMF, and process the remainder? To add insult to injury, make the sound card DSP do it.

As "Mr Punch" says "That's the way you do it boys and girls"

It's what I was implying when I said,

    Whilst the chip may not have a hardware hybrid, the software can do the same thing...
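A minimal sketch of that software "hybrid", under a strong simplifying assumption: the port sees the known drive signal scaled by a single unknown gain, plus a faint ambient signal. Real coils have frequency-dependent impedance, so a practical version would need an adaptive filter rather than one gain. The names `estimate_gain` and `remove_drive` and all the numbers below are illustrative.

```python
import math


def estimate_gain(measured, drive):
    """Least-squares gain g minimising sum((measured - g * drive) ** 2)."""
    return sum(m * d for m, d in zip(measured, drive)) / sum(d * d for d in drive)


def remove_drive(measured, drive):
    """Subtract the scaled drive signal, leaving whatever else moved the coil."""
    g = estimate_gain(measured, drive)
    return [m - g * d for m, d in zip(measured, drive)]


N = 1000
drive = [math.sin(2 * math.pi * 5 * n / N) for n in range(N)]           # "music" being played
ambient = [0.01 * math.sin(2 * math.pi * 3 * n / N) for n in range(N)]  # faint room sound
measured = [0.8 * d + a for d, a in zip(drive, ambient)]                # what the port would see

recovered = remove_drive(measured, drive)
worst_error = max(abs(r - a) for r, a in zip(recovered, ambient))
print(worst_error < 1e-9)  # True
```

The recovery is this clean only because the two test tones are uncorrelated over the window. With real music and speech the gain estimate would be noisier, and the residual would contain leftover drive signal.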

One of the troubles with trying to make my posts shorter these days --due to past readers' comments-- is that I have to leave out some of what I might think are not key details. Mind you, I guess Bruce could collate them along with all the other interesting bits and put them in a book called "Security tales from the Schneier Blog" or some such. From other comments I think it might be popular.

Clive Robinson • November 23, 2016 3:39 PM

@ mesrik,

    Otherwise I can't understand why output device would be able to read input if the device was meant to drive output only?

For almost the identical task for which such "feedback" is used with motors: to make the output load-independent. In this case, rather than controlling speed, it would control the linearity and frequency response of the signal.

prolix • November 23, 2016 4:16 PM

@Clive Robinson

    trying to make my posts shorter these days --due to past readers comments

My, oh my! You're trolling us, aren't you Clive? I doubt there are any regulars here who believe such self-editing is called for.

Wael • November 23, 2016 4:55 PM

That mentality saddens me. Why would researchers boast about their ability to use technology to "eavesdrop on someone" by using the headphone as a microphone? It's not impressive, not in the least. Their research time would be better spent on positive things, but it's not their fault. It's the fault of society as a whole; we glorify the hacker (or cracker if you prefer) type. Spies are viewed as "heroes", if they are on the right side of the fence, of course.

On top of that, they have conferences such as Black Hat to showcase their methods of stealing, denying service, and spying on others!!! Why is that ok? What if pimps, whores, convicted felons, looters, drug dealers, rapists, and murderers decided to convene once a year to share their adventures and advances in techniques? Now thaaaat would rock. What if they published their methods in a journal? Lol.

Want to impress me? Build a proof of concept project that intercepts a phone user's voice not by measuring and sampling vibrations, but by analyzing the vapor output from his mouth and condensation patterns on the capacitive screen to reconstruct the conversation. Oh, the receiving part? Use AI or ML/DL or whatever other buzz-word that appeals to you to assemble the entire two way conversation. Now thaaaat would be impressive.

Alva • November 23, 2016 5:03 PM

    Surprising no one who has been following this sort of thing, headphones can be used as microphones.
    Or anyone who has read this blog, I think I've mentioned the "transducer" issue of motors=generators speakers=microphones etc a half dozen times or so.
That's really just half of what's happening here. The other half—that your audio output plug can be switched to an input with nothing but software—is more surprising. Not shocking to anyone who's read the muxing and GPIO sections of an SoC datasheet, but not immediately obvious. I just watched Snowden's Vice video, where he explains how to desolder a phone's cameras and microphones, and he doesn't mention anything about speakers. (And then there are the really crazy things that people could try using to capture voice data, like the capacitors that someone mentioned and maybe accelerometers.)

NoOne • November 23, 2016 5:11 PM

    Otherwise I can't understand why output device would be able to read input if the device was meant to drive output only?
These chips can flexibly reroute inputs and outputs and have line-in and microphone in.

albert • November 23, 2016 5:23 PM

@Clive,
I apologize. I guess I'd better read your posts more carefully:)

"... I have to leave out some of what I might think are not key details..."

I thought about something similar when I posted my comment. I used the term 'patch' (which in early analog recording, meant rerouting the signal via a patch bay, derived from the telephone switchboards). I don't know how common this term is today, but I thought about explaining it, then I thought about folks telling me I think they're idiots, then I thought, WTH, I'll live with the fallout:)

For God's sake, don't edit your posts. The complainers can go back to Twitter, or whatever the online version of Cliffs Notes or Readers Digest is.
.

@mesrik,
Modern audio chips have lots of capabilities, which for most folks, are never used. Even if the internal signal lines aren't 'wired' to anything outside, the 'crossbar' capabilities Clive mentioned can reconfigure the signal paths -inside- the chip.
. .. . .. --- ....

Morgado Js • November 23, 2016 6:23 PM

As stated before, this is not news; it has been in use for a few years now.
Worse, the same principle (via headphones) is used to "inject" malware into the devices.

LOVINT Burial • November 23, 2016 8:31 PM

The other half—that your audio output plug can be switched to an input with nothing but software—is more surprising. Not shocking to anyone who's read the muxing and GPIO sections of an SoC datasheet, but not immediately obvious. I just watched Snowden's Vice video, where he explains how to desolder a phone's cameras and microphones, and he doesn't mention anything about speakers. (And then there are the really crazy things that people could try using to capture voice data, like the capacitors that someone mentioned and maybe accelerometers.)

Snowden probably didn't talk much about his 'privacy blanket' I'm guessing. Security folks have to walk a fine line trying to be educational without being binned under 'tinfoil hatter'.

If I had to imagine justifiable Snowden-logic for such a decision, it would be: "If I can get people to understand the relevance of doing the most obvious couple of things, then they'll figure the rest out on their own". Seriously, if that Vice video you mentioned led to devices that simplified that task to the point they were accessible to the masses, one could presume the additional speaker issue would effectively be a follow-up no-op.

Clive Robinson • November 24, 2016 2:44 AM

@ Wael,

    ... but by analyzing the vapor output from his mouth and condensation patterns on the capacitive screen to reconstruct the conversation.

And the "channel bandwidth" of that would be about 0.25Hz thus the information bandwidth would be?...

Wael • November 24, 2016 2:58 AM

@Clive Robinson,

    And the "channel bandwidth" of that would be about 0.25Hz

How so? But at any rate (pun intended), use the Shannon-Hartley theorem:
C = B log2(1 + S/N) ;)
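To put a number on it, here is the formula evaluated for the 0.25 Hz bandwidth mentioned above. The signal-to-noise ratio of 1000 (30 dB) is purely an assumed figure for illustration; nobody in the thread gives one, and the function name is invented.

```python
import math


def capacity_bits_per_second(bandwidth_hz, snr_linear):
    """Shannon-Hartley channel capacity: C = B * log2(1 + S/N)."""
    return bandwidth_hz * math.log2(1 + snr_linear)


# 0.25 Hz comes from the thread; the S/N of 1000 is an assumption.
print(round(capacity_bits_per_second(0.25, 1000), 2))  # 2.49
```

Even with that generous signal-to-noise ratio the channel carries only about 2.5 bits per second, so bandwidth, not noise, is the limiting factor here.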

Wesley Parish • November 24, 2016 3:03 AM

One: it's not news. A pair of tweeters, a couple of midrange and a bass speaker may not be optimized for picking up sound, but they'll do it anyway.

Two: it wouldn't surprise me in the least to find that some AD circuitry could be re-used as DA. Far from optimized, of course.

    Their malware uses a little-known feature of RealTek audio codec chips to silently “retask” the computer’s output channel as an input channel, allowing the malware to record audio even when the headphones remain connected into an output-only jack and don’t even have a microphone channel on their plug.

Is it that different from using an input signal on a radio to filter the input signal? As in the superheterodyne receivers, which contained a radio signal amplifier to filter out the extraneous signal.

Clive Robinson • November 24, 2016 4:37 AM

@ Alva, LoveInt...,

    I just watched Snowden's Vice video, where he explains how to desolder a phone's cameras and microphones, and he doesn't mention anything about speakers.

There is an issue that Snowden and others who have been given a "security clearance" have to be mindful of, which is not "revealing secrets" of a technical nature that they have been "read into", even though they are obvious from the laws of physics etc.

The classic example is the subset of "Methods and sources" that TEMPEST / EmSec is. As I've repeatedly said, it is all about "Energy and Bandwidth"; from that and the open knowledge of the laws of physics you can work out its entirety as a logical consequence. However, when you go on one of the courses they tell you some of those consequences but rate them as secret or above, even though they might have been described in the public domain already. Thus you get the "Catch-22" problem of not being able to discuss an open subject without the fear of getting dragged into what is in effect a kangaroo secret court and having your future destroyed. As we know, the IC is extraordinarily vengeful against the lower ranks for even an accidental infraction of the labyrinthine rules, some of which are secret themselves (the Patriot Act has the same hidden rules for ordinary non-IC individuals like university researchers). That is, the IC is run like a venal despotic tyranny, mainly to hide their myriad of failings, corruption, nepotism, infighting and breaking of common laws, as much if not more so than a religious cult or crime family.

Like a street gang, cult or crime family, you are "in for life", getting out with a life is difficult for those deemed to have transgressed.

From Snowden's point of view he has not revealed "methods and sources", just "proof of malfeasance" and fundamental breaches of US law by the IC against its citizens.

Thus when talking about security precautions, he in effect "self censors" to that which is "already widely known", so that the sort of bombastic charges of revealing "methods and sources" that IC shills might cry out can be seen by all for what they are: nonsense propaganda, to divert attention from the malfeasance those in public office commit or allow to be committed by turning a blind eye etc.

This is in part to protect himself from precipitous behaviour by elements of the IC or their wannabes, and in part because he does want to be free of the vengeful tyranny, even if he never does return to US soil.

If you want confirmation of the mentality of the IC tyranny look at the way the US patent system works in respect of what the existing Military Industrial Complex and Intelligence Community wants, it might shock you. It's why I advise people that want patents in these "national security problem areas" to go to somewhere like Switzerland first, their attitude is much more "business is business" in these areas, and once the process is started the secrecy strictures are gone.

Oh and one final piece of advice, keep out of the IC and MIC they want rather more than just your soul for a few crumbs in return, with the promise of a long future pension that can be snatched away if you don't genuflect as they think you should, or the seniors just take a dislike to you for any reason. You'd be safer signing a pact with the devil in blood...

Clive Robinson • November 24, 2016 6:19 AM

@ Wael,

    How so?

Breathe on a mirror of a temperature sufficient to "show your breath" and wait to see how long it takes to appear then evaporate.

The issue is in part due to how much water vapour in your breath needs to come in contact with a touch screen to cause the capacitive sensor to provide sufficient detectability and thus also how long it takes to evaporate back below that threshold.

Obviously the smaller the sensor the higher the bandwidth, but with the area of the average touch screen and the materials it's made from 0.25Hz would be a bit on the generous side as far as I can see (on a cold morning like this morning in London ;-)

Christos Dimitrakakis • November 24, 2016 6:37 AM

True. But although it's possible to pick up a signal, the software would need to be able to access the headphone as a microphone, somehow.
And wouldn't the headphones be behind a D/A converter anyway? The output can't drive the input of the D/A converter.

Wael • November 24, 2016 6:39 AM

@Clive Robinson,

    on a cold morning like this morning in London

I see! London fog! But one can still measure the exhaled humidity content before it condenses on the screen. And there's a reason I said it would be an impressive feat ;)

Mirror, mirror on the wall. Who's the fairest of them all?

•Crack•

albert • November 24, 2016 2:09 PM

@Christos Dimitrakakis,

Typically, 'phone outputs have buffer amplifiers after the D/A converter for signal boosting, impedance conversion, etc. The connection processing system resides -inside- the chip, so it can easily re-route the interconnections -internally-. Crossbar switches (in general) allow any single line to be connected to any other single line or multiple lines. 'Inputs' and 'outputs' need not be distinguished.
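A toy model may make the crossbar point concrete: routes are just entries in a table, and no line is inherently an input or an output. The line names and the `Crossbar` class are hypothetical and do not describe the actual routing inside any Realtek part.

```python
class Crossbar:
    """Toy routing table: any line may feed any other line; roles are not fixed."""

    def __init__(self, lines):
        self.lines = set(lines)
        self.routes = {}  # destination line -> set of source lines

    def connect(self, source, destination):
        if source not in self.lines or destination not in self.lines:
            raise ValueError("unknown line")
        self.routes.setdefault(destination, set()).add(source)

    def sources_of(self, destination):
        return sorted(self.routes.get(destination, set()))


xbar = Crossbar(["dac", "adc", "headphone_jack", "mic_jack"])
xbar.connect("dac", "headphone_jack")  # normal playback path
xbar.connect("headphone_jack", "adc")  # the same jack also feeding the recorder

print(xbar.sources_of("headphone_jack"))  # ['dac']
print(xbar.sources_of("adc"))             # ['headphone_jack']
```

In this model, "retasking" a jack is nothing more than adding one table entry, which is why it can be done purely in software.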

. .. . .. --- ....

P2P • November 25, 2016 12:37 AM

Domestic spying laws are only vaguely related to "headphones as microphones" but there is some relevance. I think that this is important enough to everyone on earth to not count as spam even if copy&pasted everywhere where infosec is discussed.

"I guess the better way to disguise track is to tunnel it through HTTPS (make it look legit) and within the HTTPS tunnel, you do your own E2EE protocols."
This is why everyone with a computer or smartphone should connect to Tor (it has obfs4, which is indistinguishable from HTTPS). Even its default looks like HTTPS, albeit with some unusual options in the initial handshake.
I2P and Freenet are similar to Tor with one tradeoff: they are less centralized (no reliance on authorities) but more vulnerable to Sybil attacks, and everyone's a (fairly safe, non-exit) relay.

Running a (nonexit) relay for one or both of those will help liberty and freedom for everyone everywhere. Make sure your laptop or phone is plugged in if you do, and make sure you set it not to use more bandwidth than you can afford. In totalitarian dictatorships you might be persecuted for running relays!

The more people in these networks the better for freedom and democracy, even if you aren't a relay.

In addition to or instead of the above, please join mesh networks such as Serval and/or Rumble. These will keep working even after a massive terrorist attack that takes out the whole Internet. If enough people run these, our critical infrastructure becomes more resilient to single points of failure.

Republican/Democratic/Labour/Tory makes no difference; Bush attacked the Internet for 8 years and so did Obama.

Impeach Trump and vote for a libertarian replacement.
Abraham Lincoln was libertarian, as were George Washington and Benjamin Franklin.
The draft of the Bill of Rights was submitted anonymously.
Protect the Fourth Amendment, the First Amendment, and human dignity in general; call your state's federal House representatives' Washington DC offices and tell them you support H.R. 6341 "Review the Rule Act".
The illegal changes to Rule 41 that were snuck in using a protocol designed for small, minor errata will actually make sweeping changes and ruin a great country.

Please post this everywhere and send it to everyone who you have any faith in as a human being. This world can still be saved. It's not too late.

Leila • November 26, 2016 6:11 AM

My PC asks every time what kind of device I've plugged in: Aux, Mic or headphones. If I plug in headphones and mark them as a mic, it works like a charm. Also, it is funny to cover your camera, since your screen shows what you are working on.

mesrik • November 26, 2016 7:47 AM

@Leila

Well, just think how scantily clad, on a bad hair day etc. people surf while at home, and how common it is to pick one's nose, among other things, when people think they are not being observed. Perhaps you would reconsider if it were you who was hit and blackmailed by someone demanding money not to publish those photos or videos on some popular site and embarrass you, wouldn't you?

Regarding your privacy, sometimes it doesn't just matter what's on your screen, it's also how you appear in front of that screen that interests those who try to compromise your computer.

mesrik • November 26, 2016 8:24 AM

@Clive & all that commented me

I didn't have time to come back and respond to your comments during the week, but now, with a bit more time, I gave it a second thought.

First, I'm quite convinced that there's a reason for that feature in that Realtek chip. It didn't happen just by accident or mistake. That's because if the same thing had been implemented using discrete components, reversing output to input almost certainly wouldn't be possible. Without knowing the chip internals any further than the story described, there are a few possible reasons, as some of the comments above suggest, but how plausible are those possibilities?

Feedback and detecting impedance: well, that wouldn't be possible simultaneously while output is being driven. An audio line is a bit like an RF antenna: if you try to monitor (listen on) the same band while you transmit (drive output), you will blow out / block your input (receiver). That's the reason why two channels are commonly used. AFAIK the same applies to audio frequencies (driving a speaker) too; even though those are much lower in Hz, having meaningful input while output is driven is not possible. If impedance adjusting/fitting is being done, it's probably done shortly before output is switched on.

I've been thinking about what other reasons would be plausible too. As someone suggested, it may be a more general-use chip, a kind of GPIO. That's quite plausible if it makes the chip cheaper to design and manufacture, or if those extra features make it more popular with its intended audience while the manufacturing cost does not rise too much for it to be profitable. In short, I'm suggesting that a more general-purpose chip would be better for the maker to keep in stock, and also for the chip user (the device designer), than a specific chip that is a tiny bit cheaper to manufacture but which, if it needed to be changed on an already-made PC board afterwards, would make for a much more expensive operation.

Yes, quite a long time ago I built electronics as a hobby, and almost as soon as one started using microchips (that's what they were called back then) instead of bare discrete components, it was clear that those chips had more features/circuits, which were commonly left unused but were added during manufacturing because it was cheaper to make one slightly more versatile chip than a specific one for each task. This speaks strongly for the logic that the chip's ability to reverse output to input may very well be this kind of side effect and nothing more sinister. So that's what I think may be the most plausible reason for being able to switch output to input on that chip.

:-) riku

MarkH • November 27, 2016 1:47 PM

In my long-ago youth (maybe Clive will appreciate this), I was very interested in sound recording and reproduction.

Well aware of reciprocity in "dynamic" speakers, I tried using a speaker element of 12 or 15 cm as a microphone out of curiosity. I knew that it was too large to be sensitive to really high frequencies, and certainly wasn't optimized as a microphone, so I didn't expect much.

I was startled to discover what an outstanding microphone it was: extremely sensitive (giving much more output than my purpose-built microphones, due to its large piston area), and excellent sound quality. Its high-frequency losses would have been in the range of a few kHz, so ordinary household sounds (and particularly the human voice) came through with perfect clarity.

I understood at that moment that ordinary speakers were the spy's dream microphone.

Clive Robinson • November 27, 2016 4:55 PM

@ MarkH,

    maybe Clive will appreciate this

+2

I just wish others would grok the bidirectional nature of many transducers. I know it's difficult to initially get your head around, but when you do the world looks different and "thinking hinky" starts to feel good B-)

Ask @Figureitout about "his LED moment" ;-)


Photo of Bruce Schneier by Per Ervland.

Schneier on Security is a personal website. Opinions expressed are not necessarily those of IBM Resilient.