Jamming Speech with Recorded Speech

This is cool:

The idea is simple. Psychologists have known for some years that it is almost impossible to speak when your words are replayed to you with a delay of a fraction of a second.

Kurihara and Tsukada have simply built a handheld device consisting of a microphone and a speaker that does just that: it records a person’s voice and replays it to them with a delay of about 0.2 seconds. The microphone and speaker are directional so the device can be aimed at a speaker from a distance, like a gun.

In tests, Kurihara and Tsukada say their speech jamming gun works well: “The system can disturb remote people’s speech without any physical discomfort.”
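
The delayed-replay step described above can be sketched as a simple delay line. This is only an illustration, not Kurihara and Tsukada's implementation: the function name `delayed_feedback`, the 8 kHz sample rate, and the use of a plain list of samples are all assumptions. Only the delay of roughly 0.2 seconds comes from the description.

```python
def delayed_feedback(samples, sample_rate, delay_s=0.2):
    """Return what the speaker would hear back: the input shifted later by delay_s.

    Hypothetical helper for illustration; a real device would stream audio
    through a ring buffer instead of processing a whole list at once.
    """
    delay_samples = round(delay_s * sample_rate)
    # Silence for the first delay_s seconds, then the recorded signal.
    return [0.0] * delay_samples + list(samples)


if __name__ == "__main__":
    sample_rate = 8000  # assumed sample rate in Hz
    voice = [0.1, 0.5, -0.3, 0.2]  # stand-in for microphone samples
    heard = delayed_feedback(voice, sample_rate)
    print(len(heard) - len(voice))  # number of samples of delay
```

A streaming version would keep only the most recent `delay_samples` values in a fixed-size buffer, but the offline form makes the time shift easier to see.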

Tags: jamming

Posted on March 12, 2012 at 6:35 AM • 39 Comments

Comments

aikimark • March 12, 2012 7:15 AM

coming soon, to a campaign stop near you.

Y.T. • March 12, 2012 7:21 AM

Finally, it will be possible to truly enforce free speech zones!

Unrestricted free speech is so 19th century!

grumpy • March 12, 2012 7:29 AM

Shiny! Want! In local trains there are designated “quiet zones” where conversations are unwanted but many passengers, often adults who should know better, ignore that because their call is of course very important. I would be sooo happy to educate them with this tool. My favourite alternatives (a stern look and/or a two-by-four) now look so outdated. One must move with the times. I wonder if Amazon carry these…

Ghetto • March 12, 2012 7:34 AM

I could beat it. I’ve done plenty of public speaking, sometimes with a bad echo. It’s disorienting a little, but with some practice, it’s masterable. Oddly, the most often I get that kind of feedback is on conference calls, typically with some cell phone echo or such.

barfa • March 12, 2012 7:43 AM

Isn’t this thwarted easily by earplugs? Either foam plugs, though that will make it harder to modulate the volume of your own voice, or in-ear headphones of the type used by musicians, where the artist gets to hear their own voice or instrument controlled by the mixer.

Brett • March 12, 2012 7:47 AM

Where can I get one? As long as it doesn’t hurt the wife . . . .

Dave • March 12, 2012 8:31 AM

I can foresee many practical applications – for example, surgically implant the device into Justin Bieber’s larynx. Very promising.

Scott H • March 12, 2012 8:32 AM

@Ghetto: Indeed. I worked in a broadcast radio station in the late 70s and early 80s. As a running bet among the DJs (this was before radio started to think that calling yourself a personality would help you develop one), we’d set up a tape delay in the headphones. First one to stumble on the air bought the next round after work.

Occasionally, we’d do that to the new guys without warning. They’d just take the headphones off if they were smart.

Eric • March 12, 2012 8:35 AM

It is just like announcing at a football game when the speakers are at the end of the field and you are in the middle.

Dan • March 12, 2012 8:51 AM

This sounds like great fun to take to a political rally.

McCoy Pauley • March 12, 2012 9:06 AM

When are these scientists going to invent something really useful? I’d like to see a device that causes politicians’ heads to explode every time they say something like, “We’re all in this together.”

Clive Robinson • March 12, 2012 9:21 AM

Hmm,

Psychologists have known for some years that it is almost impossible to speak when your words are replayed to you with a delay of a fraction of a second

Man, psychologists are so behind the times; telecoms engineers have known this since before the invention of the telephone (think telegraph).

The telephone solution is “isolators”, “echo cancellation” and “local side tone”.

And contrary to what the psychologists think, it is not “almost impossible to speak when your words are replayed to you with a delay of a fraction of a second”; it’s easily doable, and the result is usually that you just slow your speaking down. Which is exactly what people on long distance phone calls used to do when echo cancellation broke down.

The nasty trick, if you really want to do it, is to remove vowel sounds and replace them with a consonant sound from earlier speech, or swap consonant sounds around. This hits the brain in a different place and can cause seizures in some people.

Mark • March 12, 2012 9:26 AM

It’s nearly as good as a Point of View ray!

Now all we need is for someone to market a pocket version that clips to the back of a smart phone. I wonder if the original works something like that?

Adam • March 12, 2012 9:47 AM

The Bluetooth in my car has this unfortunate habit. The caller can hear themselves speak with a small delay, making it almost impossible to hold a conversation.

Josh • March 12, 2012 9:50 AM

I strongly disagree that it is impossible to speak when hearing oneself on a short delay. I am a police officer and, when broadcasting using my car’s radio, I will hear myself with just such a delay via an ear-piece from my mobile radio. It was distracting at first, but now I essentially ignore it or actually listen to it to make sure my transmission was sent out clearly.

NobodySpecial • March 12, 2012 9:53 AM

@aikimark + @dan
Presumably it only works if the speaker is listening to what they say – so politicians would be immune.

Madincroydon • March 12, 2012 9:53 AM

I’m sure there’ll be an app for it. Perhaps ‘Little Sir(i) Echo’.

Mikael • March 12, 2012 9:55 AM

Hearing yourself is uncomfortable at first, but having used voicecomms for games for years, I frequently hear myself via other people’s speakers when they’re transmitting at the same time as I’m talking.

It’s still annoying, and I swear at them, but I’ve trained myself to keep talking. But, yes, it took a while to get used to.

Clive Robinson • March 12, 2012 10:11 AM

@ McCoy Pauley,

I’d like to see a device that causes politicians’ heads to explode every time they say something…

Depends on what you mean by “explode”; how about a lethal stroke?

There are various ways to use sound to incapacitate people temporarily or permanently and induce unconsciousness or death and some experimental weapons developed under “nonlethal weapons” programs.

Whilst the human (or animal) can only hear a limited frequency range, it can be affected by both subsonic (infrasound) and ultrasonic energy, and importantly one can act as a carrier to the other.

Now the important thing to remember is “resonant frequency” and what happens to a resonator when you hit it at a critical frequency.

All parts of your body have their own resonant frequencies. As some “rock concert” attendees know, a good heavy bass line can cure constipation, as a bass line around 80-150 Hz causes parts of the gut to resonate. A silly but fun trick is to carefully fill a balloon with a diet (soda) fizzy drink, suspend it in front of a “bass bin”, and cause it to vibrate: the gas comes out of the fluid, the balloon expands and, if you get it all right, explodes, creating a mess just like a Mint-Mentos/diet cola fountain. It can be improved by adding certain things like rock salt.

The human heart has a resonant frequency, but thankfully it tends to be fairly well absorbed by the fat, lungs and other structures around it.

Which brings up the question of using off-resonant carriers, to which the surrounding structures are transparent, to carry the resonating frequency in.

This can be done with ultrasonics, whereby you select two ultrasonic frequencies that the surrounding structures are transparent to and that have a difference frequency at the resonant frequency of the organ you wish to shake up. Providing you can find a way for the two carriers to “interfere” or “mix” with each other on or in a substance with a nonlinear response, the difference frequency will be generated at the point where the two carriers meet.

Again, thankfully, the body does not have much susceptibility to this except at places like joints etc.

However, you also need to consider not just direct mechanical effects but neurological effects. There are several structures in the body where interference effects can be used to excite nerve endings; some very low frequencies (6-8 Hz) are known to have undesirable effects in the brain, the best known being flashing lights triggering seizures etc.

As far as I’m aware all mammals are susceptible to this, and if the carrier frequency is selected with care, with good directivity, and the two sources are spaced correctly, then you could pick out a single individual in a crowd from a sizeable distance.
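
The difference-frequency claim in the comment above (two carriers mixing in a nonlinear medium) can be checked numerically. The sketch below is only an illustration under stated assumptions: the carrier frequencies (40,000 Hz and 40,007 Hz), the sample rate, and the square-law nonlinearity are all made up for the example, and the helper names `tone_amplitude` and `two_carriers` are hypothetical. It relies on the identity 2·cos(a)·cos(b) = cos(a−b) + cos(a+b): squaring the sum of two cosines produces a term at the difference frequency, whereas the plain linear sum contains none.

```python
import cmath
import math


def tone_amplitude(signal, freq_hz, sample_rate):
    """Estimate the amplitude of the freq_hz component by correlating with a complex tone."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / sample_rate)
              for i, x in enumerate(signal))
    return 2.0 * abs(acc) / n


def two_carriers(f1_hz, f2_hz, sample_rate, duration_s=1.0):
    """Sum of two unit-amplitude cosine carriers (a purely linear mix)."""
    n = int(sample_rate * duration_s)
    return [math.cos(2 * math.pi * f1_hz * i / sample_rate)
            + math.cos(2 * math.pi * f2_hz * i / sample_rate)
            for i in range(n)]


if __name__ == "__main__":
    fs = 200_000             # assumed sample rate in Hz
    f1, f2 = 40_000, 40_007  # assumed carriers; their difference is 7 Hz
    linear = two_carriers(f1, f2, fs)
    squared = [x * x for x in linear]  # assumed square-law nonlinearity
    print(tone_amplitude(linear, f2 - f1, fs))   # linear mix: no 7 Hz content
    print(tone_amplitude(squared, f2 - f1, fs))  # nonlinear mix: 7 Hz appears
```

With these inputs the 7 Hz amplitude of the linear sum should come out close to zero and that of the squared signal close to 1. This says nothing about whether tissue behaves like a square-law mixer; it only shows why a nonlinearity is required for the difference tone to exist at all.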

Dena Shunra • March 12, 2012 10:13 AM

As a simultaneous interpreter, regularly working with one language going in my ears, the other out my mouth, I can confirm that getting over that sort of thing may be one of the hardest parts of the training.
It ends up feeling like disengaging one’s brain – and for me, and all other interpreters I’ve talked to about this, it means that we can interpret for quite a while and not remember a word we said.
It’s also an exhausting effort, which is why most conferences work with booth-duos, where each interpreter takes a turn doing 15-30 minute chunks while the other recovers.

janwo • March 12, 2012 10:30 AM

@ Josh: The timing is important. The (un)desired effect only occurs with a delay of ca. 200 milliseconds. 100 milliseconds more or less, and the effect is gone.

Mark • March 12, 2012 10:42 AM

@Clive
“The nasty trick, if you really want to do it, is to remove vowel sounds and replace them with a consonant sound from earlier speech, or swap consonant sounds around. This hits the brain in a different place and can cause seizures in some people.”

Can you point me to any articles describing this?

Clive Robinson • March 12, 2012 11:28 AM

@ Mark,

Can you point me to any articles describing this?

Not sure what’s around; it’s fairly recent (since 2001). It appears to be a subset of either “reading epilepsy” or “musicogenic epilepsy”.

Reading epilepsy is by general consensus a “too broad term”, and people have been using various triggers to induce episodes to isolate what is verbal and non-verbal or thinking-induced.

I’ll have a hunt around and see what I can find that’s not behind pay walls.

But in the meantime have a read up on the various triggers and how they work.

Okian Warrior • March 12, 2012 11:32 AM

some very low frequencies (6-8 Hz) are known to have undesirable effects in the brain, the best known being flashing lights triggering seizures etc.

This happens when the incident stimulus is in phase with one of the frequencies the brain uses. It’s the neural version of a resonant amplifier – pushing on the swingset at the right frequency makes it swing higher and higher.

The brain has evolved a mechanism to detect and avoid resonant amplification, so that when repetitive stimulus is presented the internal frequency changes to compensate. It’s the same feedback mechanism that prevents epileptic seizures in most people.

So yes, flashing lights and other repeated stimulus can make some people go into seizures, and it can give you a headache because you’re forcing the brain to work at a different frequency, but most people will still be able to function.

The “electrosleep” system exploited this mechanism. Presenting stimulus while measuring the patient’s EEG signal allows you to follow and compensate for the change in brain frequency. These systems will allow you to induce various effects in one patient.

James Sutherland • March 12, 2012 12:16 PM

“The nasty trick, if you really want to do it, is to remove vowel sounds and replace them with a consonant sound from earlier speech, or swap consonant sounds around. This hits the brain in a different place and can cause seizures in some people.”

Can I be the only one to think immediately of applications against telemarketers? (The ones exploiting the TPS “market research” loophole to do “research” amounting to “would you be interested in our product?” frankly deserve this…)

Aaron Binns • March 12, 2012 3:10 PM

Some discussion on the same at Language Log: http://languagelog.ldc.upenn.edu/nll/?p=3814

Brandon • March 12, 2012 4:01 PM

@McCoy Pauley

When are these scientists going to invent something really useful? I’d like to see a device that causes politicians’ heads to explode every time they say something like, “We’re all in this together.”

Or, “Too big to fail …”

kashmarek • March 12, 2012 4:44 PM

I want something that works in a similar fashion when the words “enhance your user experience” come up.

Phone slave • March 12, 2012 7:05 PM

As someone who answers the incoming phone for a business, I know all about this. Some phone systems (especially speaker phones) will have an echo. It’s taken quite a bit of time but I’ve learned to partially tune this sort of interference out.

I’ve failed the Turing test because of how I answer the phone: people think I’m a machine and wait quietly for a person to pick up. I hang up.

Chris J • March 12, 2012 7:25 PM

This happens rather frequently in live news. There is just under a second of encode/decode and transmission time during a satellite live shot. The person in the field is supposed to get a ‘mix-minus’, which is the show audio with their own voice ‘minus’d’ out. If they get a full mix, they hear their voice delayed by just under a second; you can usually tell because they stammer a bit and then pull out their earpiece. I have come across a few field reporters who are adept at ignoring this distraction.
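
The ‘mix-minus’ idea described above can be sketched in a few lines: the field reporter's return feed is the sum of every source except their own microphone. This is a toy illustration, not how a broadcast console is actually built; the function name `mix_minus`, the source names, and the sample values are all hypothetical.

```python
def mix_minus(sources, exclude):
    """Sum every source except `exclude`, sample by sample.

    `sources` maps a (hypothetical) source name to a list of samples;
    all lists are assumed to be the same length.
    """
    kept = [samples for name, samples in sources.items() if name != exclude]
    return [sum(frame) for frame in zip(*kept)]


if __name__ == "__main__":
    sources = {
        "anchor": [0.5, 0.25, 0.0],
        "music": [0.25, 0.25, 0.25],
        "reporter": [0.0, 0.5, 0.5],
    }
    # The reporter's earpiece feed: everything except the reporter's own mic.
    print(mix_minus(sources, exclude="reporter"))  # [0.75, 0.5, 0.25]
```

Building the feed from the other sources, rather than subtracting the reporter's voice from the finished mix, sidesteps the problem that the returning mix is delayed relative to the live microphone.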

ike • March 12, 2012 10:44 PM

Is there an app for that?

Figureitout • March 13, 2012 12:22 AM

@ kashmarek

Haha yeah, movie theaters may be well on their way to having automatic-scanning/detecting voice jammers…all you jackasses who answer phones in movies better think twice lol

This is a little off topic because these are “weapons” and not just “jammers” per se. I’m sure plenty of people here know about the LRAD, and some of the weapons Clive was describing that can ruin your day by making a boom-boom in your pants.

At first, you don’t really think of “sound” as something that can physically harm you, yeah it can hurt your ears or make you deaf, but rupture your lungs? Kill you?

Correct me if I’m wrong, but isn’t one of the most destructive “things” in a nuclear blast, the extremely loud sound waves created by it?

Also, there appears to be some work on a basketball-sized “diffraction-less acoustic bullet” in the U.S. and Russia that can be turned from non-lethal to lethal levels. There was still some skepticism with this research though. This article seems pretty informative: http://scienceandglobalsecurity.org/archive/sgs09altmann.pdf

You know, sure, there may be some significant scientific discovery that comes out of this research, but don’t we already have enough ways to kill and in effect silence someone…

Vles • March 13, 2012 2:50 AM

I bet you can somehow use this phenomenon (delay) in CAPTCHAs to distinguish between computers and humans…

Jonadab • March 13, 2012 7:06 AM

It’s disorienting a little, but with some practice, it’s masterable.

I can confirm this. It takes effort and time to train yourself to be able to continue speaking under these kinds of conditions, but you can train yourself to the point where you can keep going just as if it weren’t happening at all. It’s most difficult (and therefore requires the most practice) if you need to compose what you are saying on the fly. Getting to the point where you can deliver a memorized spiel through the echo is somewhat easier (but still requires some practice, at least for most people).

Mark • March 13, 2012 7:08 AM

@Clive

I didn’t find any references to what you were talking about specifically. But if reading or even thinking can trigger seizures in people with epilepsy it does sound plausible.

Jonadab • March 13, 2012 7:13 AM

I’d like to see a device that causes politicians’ heads to explode every time they say something like, “We’re all in this together.”

“I feel your pain.”

Unfortunately, designing such a device would be what computer scientists call “AI-complete”, i.e., nobody has any idea how to design software smart enough to figure it out. The problem is that you have to understand what the words being uttered actually mean (or, indeed, whether they mean anything at all), which ultimately requires an understanding of the world around you that nobody knows how to design into software.

Also, even if such a device were made, using it would almost certainly be illegal, due to some bizarre feature of the law wherein politicians are granted rights just as if they were human — a terrible miscarriage of justice, to be sure, but that’s the kind of world we live in.

LinkTheValiant • March 13, 2012 8:41 AM

My personal assumption would be that this effect is partially caused by our tendency not to speak when someone else is speaking. Hearing oneself on a slight delay would trigger one’s reaction of “shut up when someone’s talking to you”. (It might also explain why politicians would be unbothered by it.)

As Mr. Robinson points out, the simplest way for a non-trained person to defeat this is speak more slowly. (This is something all good public speakers have to learn anyway.)

me • March 15, 2012 12:13 AM

I’ve met a couple of people on whom such devices would have no success whatsoever, because it’s not so much that they love to HEAR themselves talk; they just love to know they are in the act of talking and that nothing and no one is going to stop them.

Sometimes they come in the form of in-laws, and other times long-winded, distant relatives. Often they are bosses or marketers. I’m sure you can think of a few more, and once you’ve met them, you know there’s no tool on the face of the Earth that can save you once they’ve sucked you up into the vortex that is continually circling around their heads.

Jan • March 15, 2012 2:20 PM

I hacked together a mini android app that demonstrates the effect.

Schneier on Security
