The Future of Faking Audio and Video

This Verge article isn’t great, but we are certainly moving into a future where audio and video will be easy to fake, and easier to fake undetectably. This is going to make propaganda easier, with all of the ill effects we’ve already seen turned up to eleven.

I don’t have a good solution for this.

Tags: audio, forgery, propaganda, videos

Posted on December 22, 2016 at 3:35 PM • 40 Comments

Comments

Mace Moneta • December 22, 2016 3:51 PM

The death of evidence. Witness testimony is completely unreliable. Video and audio are (increasingly) easily faked. Biometrics can be planted and faked (fingerprints, hair, DNA).

On what basis will we convict people of crimes, beyond a reasonable doubt?

Pete Prunskunas • December 22, 2016 4:05 PM

Mix that with Adobe’s Project Voco, which edits voice recordings, and video becomes unreliable. Propaganda is only the first step. Just as unsophisticated law enforcement agencies fall for swatting calls, people will be arrested based on faux video. The future is Richard Jewell.

tz • December 22, 2016 4:06 PM

You can embed a digital signature in sub-audible signals as a watermark.

I think the larger problem will be when multiple media sources are simultaneously hacked, but quietly until something like an assassination, nuclear explosion, or something else is posted.

r • December 22, 2016 4:22 PM

I don’t know if I believe the whole ‘undetectably’ thing just yet, we have AI coming online that is capable of finding patterns humans can’t for the meantime.

Maybe ‘imperceptible’ to humans is more apt for the mean.

Haelwenn (lanodan) Monnier • December 22, 2016 4:23 PM

Well… if the original source is encrypted/signed I think it changes nothing.
But it’s not like we are used to signed data (outside software, emails and DRM).
Also testimony with a human saying what happened is still used… even if people can lie. So I think it changes nothing for jurisdictions.

r • December 22, 2016 4:32 PM

But!

In all fairness, do we value a computer’s opinion above and beyond our own even if it’s repeatable?

Angelica Zapata • December 22, 2016 4:32 PM

A certification program for journalists and photographers. Who then use digital signatures. Reputation and content integrity.

Dan • December 22, 2016 4:49 PM

Perhaps a return to film?
Sure, you can doctor and edit a single frame, but an entire newsreel?

It’d be great if folks required multiple sources to believe a story, but human nature being what it is, that just won’t ever happen. Thanks confirmation bias…

The news and media in general are already unreliable if for no other reason than the race for ratings and endemic incompetence. It’s already a chore to dig up the real news, the real stories, and the reliable evidence. A future where the audio and video is easily faked (especially at the point where that easy faking comes to anyone with a mid to high range home computer) becomes a future of endless photoshop battles; where reality is no longer relevant to what you see on TV or download from the internet. The only way to be sure will be to walk out your front door and look for yourself.

The truly terrifying future is when Virtual Reality becomes indistinguishable from the real deal. Then we’re all doomed. If we cannot distinguish what is the real world then what is “real”?

The issue comes down to trust. How can we trust what we’re seeing or hearing if those things can be so easily faked? Especially in an environment where everyone wants to put their own spin on things.

It’s why I suggest film. It seems a more trustworthy source. That said, perhaps in the future we’ll have to create “trust scores” or even “trust enforcement” of the media. An independent, government-funded, non-partisan organization that fact checks what goes on the air. If you want to be official “news” you have to submit to scrutiny. Mistakes can be forgiven. Purposeful falsehoods will result in fines, loss of airtime (forced to go off-air during prime time), or other penalties. If you want to hide behind “freedom of speech” that’s fine, but you don’t get to call yourself “News”. You can call yourself “information distribution” or some other euphemism, but you can’t be the “news”. In this way perhaps we could re-establish trust in the news, and once again have an effective fourth estate.

goodness I ramble. Sorry.

The Gost • December 22, 2016 5:00 PM

The truly terrifying future is when Virtual Reality becomes indistinguishable from the real deal. Then we’re all doomed. If we cannot distinguish what is the real world then what is “real”?

Some philosophers claim this is already true today.

J.Rex • December 22, 2016 6:00 PM

“I don’t have a good solution for this.”
You have no solution? Over 65 percent of Americans believe a guy named Noah built an Ark and Trump will be the guy that is going to sink it. I know where I’d start…

albert • December 22, 2016 6:10 PM

I’m not sure any of this qualifies as ‘artificial intelligence’, but I guess they need to find -something- they can call AI. It’s just automated image editing; that’s all. Snake oil for the tech-obsessed.

Once you digitize anything, you place it at the mercy of the technology that created it.

[Think about this: a technology that can process the image in the camera, before it even gets to the NV memory. Uncle Harry was a mean old bastard, but he’s smiling in all the family pics and vids]

There can no longer be any real provenance for any digitized information, regardless of the character of the human doing the initial acquisition. Truth or falsity is a matter of trust. Folks with integrity are rare birds indeed; seldom found in government or industry.

The system has produced a population of cynics who refuse to trust ‘anything’ produced in media, by ‘anyone’, especially folks who are known to be grinders of various types of axes, all the MSM, corporations, most Gov’t entities.

This is dangerous for public institutions, especially governments (they may be starting to realize that now). When propaganda fails, governments can back off and open up, or circle the wagons and make a militocracy. Ignorance is no longer bliss, and Diogenes will need more than a lantern to find an honest man.

Thus ends my catechism.

. .. . .. — ….

EvilKiru • December 22, 2016 6:16 PM

@Dan: There is nothing inherent in the properties of film that makes film any more trustworthy than digital video. Faking film simply adds an extra digitizing step at the start and an extra output to film step at the end of the digital faking process.

TJ • December 22, 2016 6:17 PM

Raising discontent against [anything here]? Here is 4k color video of you molesting an eight year old to go with our circumstantial evidence..

Only second to my other favorite future-nuance: How will you say no when everything is secure and dictated purely by capital and/or majority-rule?

Tovaritch • December 22, 2016 8:41 PM

You read it here first, 50 years ago:
https://en.wikipedia.org/wiki/The_Moon_Is_a_Harsh_Mistress

TJ • December 22, 2016 9:56 PM

@Tovaritch: It’s in pretty much every future-fiction piece (especially cyberpunk, but I favor Asimov).. Everything is secure and controlled and people are put into some tight sandbox without an exit, doing whatever some form of plutocracy wants..

I guess it’s kind of a good thing most security researchers and IT vendors are too lazy or capitalist to fix things like memory corruption and password policies..

John • December 22, 2016 10:46 PM

Researchers at Stanford animate the facial expressions of a target video by a source actor and re-render the manipulated output video in a seamless photo-realistic fashion. The authors show how disturbingly easy it is to take a surrogate actor and, in real time using everyday available tools, reenact their face and create the illusion that someone else is speaking.

http://www.zerohedge.com/news/2016-04-09/stunning-video-reveals-why-you-shouldnt-trust-anything-you-see-television
Paper: http://www.graphics.stanford.edu/~niessner/papers/2016/1facetoface/thies2016face.pdf

C3PO • December 23, 2016 1:44 AM

i know it’s easy to jump to doom and gloom outcomes
but how sure are you this can hold up in court?
a lot of things discussed on this site with legal implications can be countered when one knows something about law
the person making a claim has to prove their claim,
the defendant does not have to prove their defence.
so, for the burden of proof to rest with the prosecution they have to categorically prove it was NOT fraudulently obtained/faked. one only has to introduce reasonable doubt to get that piece of evidence thrown out.

with traffic camera images, people have stood in court and said ‘ you say that photo shows I was speeding. But you can prove that image was not tampered with before it was tendered for evidence? After all, all traffic cameras everywhere use MD5 and we know that was broken years ago..’ and

‘ disgruntled employees hacked the traffic camera images database. can you prove this image was not altered by an employee?’

et cetera

C3PO • December 23, 2016 1:47 AM

to add to the arguments debating the provenance of video, let’s not forget that there is no proof Saddam or Bin Laden were either captured – OR that the people supposedly captured on film were those original individuals to begin with. Apparently people have identified about 4 different bin ladens on films in MSM over the years.
c’mon, this is the country that makes star wars movies. Anything is possible

Jens Oliver Meiert • December 23, 2016 2:18 AM

I believe having this brought up by experts is helpful and important, for it seems we judge and sometimes convict people based on what becomes easier and easier to fake and forge. There lies great irony ahead when we gather more and more data, but can use less and less of it because we cannot trust it.

(Cf., I’m feeling free, https://meiert.com/en/blog/20141029/electronic-evidence/.)

JG4 • December 23, 2016 6:25 AM

JG4 • August 21, 2016 2:07 PM
https://www.schneier.com/blog/archives/2016/08/friday_squid_bl_540.html#c6732186
…
I don’t think that I mentioned this threat model before, and I don’t recall seeing it articulated. “They” have sufficient information to reliably duplicate your voice, speech patterns, email “return address,” writing style and vocabulary. I haven’t yet seen (or don’t recall seeing) the argument made that one of the most compelling reasons for using robust encryption is to prevent people/groups/agencies from impersonating you, in particular to disrupt your network. The threat model was at least implied by the post a couple/few months ago of a newsclip where Stanford published video of real-time transfer of facial expressions to a computer-model animated mannequin.

Peter • December 23, 2016 9:02 AM

Source and reputation.

Any information is only as valuable as the source standing behind it. We will no longer be able to validate information on its own, only based on the source.

AJWM • December 23, 2016 11:35 AM

@Dan

Perhaps a return to film?
Sure, you can doctor and edit a single frame, but an entire newsreel?

Look up “kinescope”. Now imagine a kinescopic recording of an ultra-high-def display, with the focus tweaked just enough to remove any obvious pixelation. (Won’t take much, UHD resolution is already roughly at the typical grain size for 35mm film.)

And this is ignoring completely the roughly hundred years of motion-picture industry film (vs digital) tricks.

albert • December 23, 2016 12:14 PM

@EvilKiru, @Dan,
Film has a random grain structure*; digital does not. There will always be a matrix of perfect squares in the digital image. That’s why computer analysis of digital images is so effective. It’s quite simple to deduce a film copy from the original film, unlike digital copies. The digital image would need extremely high resolution; enough to mask out the matrix. -Starting- with a film image is the way to go. If you don’t have the original film, then it’s a crap shoot, even if you could determine that it was made from a digital ‘original’.

This is an era where unarmed folks are shot in the back by police (‘street justice’), recorded on video, and convictions can’t be obtained. I tend to wonder why there’s an argument for ‘true evidence’ at all.

The digital giveth, and the digital taketh away.

*in movie film, the grain structure is different in -every- frame.

. .. . .. — ….

Lawrence D’Oliveiro • December 23, 2016 12:49 PM

One approach is to collect additional circumstantial data (e.g. time and GPS coordinates) along with the video and audio in order to strengthen its use as evidence. This is the approach taken by the CameraV app from the Guardian Project.
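As a rough sketch of the idea of binding circumstantial data to a recording: the snippet below stores a hash of the media together with a capture time and coordinates, then authenticates the bundle. It is not CameraV's actual format; the function names, field names, and `device_key` are assumptions made up for illustration. An HMAC is used only because it is in the Python standard library; it is a symmetric stand-in, and a real system would more likely use an asymmetric digital signature.

```python
import hashlib
import hmac
import json


def make_manifest(media: bytes, captured_at: str, lat: float, lon: float,
                  device_key: bytes) -> dict:
    """Bundle a hash of the media with capture metadata and authenticate it.

    All field names here are hypothetical, not taken from any real app.
    """
    claims = {
        "media_sha256": hashlib.sha256(media).hexdigest(),
        "captured_at": captured_at,
        "lat": lat,
        "lon": lon,
    }
    # Serialize deterministically so the same claims always give the same tag.
    payload = json.dumps(claims, sort_keys=True).encode()
    tag = hmac.new(device_key, payload, hashlib.sha256).hexdigest()
    return {"claims": claims, "tag": tag}


def verify_manifest(media: bytes, manifest: dict, device_key: bytes) -> bool:
    """Check that neither the media nor its metadata changed after capture."""
    claims = manifest["claims"]
    payload = json.dumps(claims, sort_keys=True).encode()
    expected = hmac.new(device_key, payload, hashlib.sha256).hexdigest()
    tag_ok = hmac.compare_digest(expected, manifest["tag"])
    media_ok = claims["media_sha256"] == hashlib.sha256(media).hexdigest()
    return tag_ok and media_ok
```

Editing either the clip or the recorded time and place makes verification fail, but only for someone who does not hold the key. This says nothing about whether the scene in front of the lens was staged.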

A Nonny Bunny • December 23, 2016 3:21 PM

@C3PO

one only has to introduce reasonable doubt to get that piece of evidence thrown out.

If it’s that simple, then why is eye-witness testimony even still admissible in court? Research consistently shows the mind easily creates false memories and changes real ones to fit new evidence.

with traffic camera images, people have stood in court and said ‘ you say that photo shows I was speeding. But you can prove that image was not tampered with before it was tendered for evidence? After all, all traffic cameras everywhere use MD5 and we know that was broken years ago..’

And did saying that work?
I’d hazard to guess not, or not often. Because it’s more reasonable to assume the photo is real than that someone bothered to fake it, because while it’s possible, it’s not trivial and who has motivation to do so?

It’s like if they find you hunched over a bloody corpse while holding the knife that killed her. You can argue that it’s possible she was killed by an assailant that left no trace and you only pulled out the knife when you found her. But honestly, you’re screwed. Even if it’s true, probably.

Charlie Todd • December 23, 2016 4:09 PM

Ditto on watermarks. Each camera firmware should tamper with raw pixels to encode manufacturer and serial number of the focal plane using watermarks or steganography. Add relative time into it and you can detect cuts or insertions.
Sounds like a great opportunity for a research paper that challenges my math beyond my current skill set.
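A toy sketch of the "relative time" part of this idea, under heavy assumptions: each frame is a flat list of 8-bit pixel values, and a frame counter is written into the least significant bits of the first 16 pixels. The names `embed_counter`, `extract_counter`, and `find_cuts` are hypothetical. Plain LSB marking like this is fragile (recompression destroys it) and trivially forgeable, so it only illustrates how a gap in the embedded sequence reveals a cut or insertion, not a robust watermark.

```python
from typing import List

COUNTER_BITS = 16  # assumed width of the embedded frame counter


def embed_counter(frame: List[int], counter: int) -> List[int]:
    """Return a copy of `frame` with `counter` stored in the low bits
    of its first COUNTER_BITS pixels."""
    if len(frame) < COUNTER_BITS:
        raise ValueError("frame too small to hold the counter")
    marked = list(frame)
    for i in range(COUNTER_BITS):
        bit = (counter >> i) & 1
        marked[i] = (marked[i] & ~1) | bit  # clear the low bit, then set it
    return marked


def extract_counter(frame: List[int]) -> int:
    """Read the counter back out of the low bits."""
    return sum((frame[i] & 1) << i for i in range(COUNTER_BITS))


def find_cuts(frames: List[List[int]]) -> List[int]:
    """Return the indexes of frames whose counter does not follow the
    previous frame's counter by exactly one."""
    counters = [extract_counter(f) for f in frames]
    modulus = 1 << COUNTER_BITS
    return [
        i for i in range(1, len(counters))
        if counters[i] != (counters[i - 1] + 1) % modulus
    ]
```

For example, marking five frames with counters 0 through 4 and then deleting the third leaves counters 0, 1, 3, 4, and `find_cuts` flags the position where the sequence jumps.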

albert • December 23, 2016 7:31 PM

@Charlie Todd, et al,

Reminds me of the yellow steganographic ‘watermarks’ printed on every sheet in laser color printers (so that’s why my yellow toner is always low). But again, it’s effective because it’s outside the digital realm. You could (theoretically) alter the watermark of your printer to match that of someone else, but if you altered it randomly, you’d have the only printer that didn’t match the manufacturer’s database…. 🙂

Within the digital realm, I don’t see how we can have total security. Even reasonable security seems unlikely. Perhaps temporary reasonable security is the best we can do.

The legal world is fraught with uncertainty. Over reliance on digital technology isn’t going to help.

. .. . .. — ….

HashTag Don't Believe Everything You Read Or See • December 23, 2016 8:13 PM

I wonder if this is related to the page 24 ‘magic’ techniques of the Snowden JTRIG document? Do YOU believe in magic?

George Dalton • December 23, 2016 9:55 PM

Perhaps one countermeasure might be to put hashes of videos into a blockchain. That won’t guarantee the video is original, but it will at least establish a sort of “chain of custody” where we can say that on [date] this version of the video existed. Maybe we can get to the point where the camera itself handles the work of hashing and submitting to the blockchain in near realtime.

That would at least cast doubt on faked videos because they would have to explain why the blockchain timestamp is so much later than the date of the event.
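A minimal sketch of the timestamping idea, assuming a toy append-only hash chain in place of a real blockchain. The class and method names are hypothetical, and nothing here handles distribution or consensus. Each entry commits to the video hash, a timestamp, and the previous entry, so rewriting an old record breaks every link after it.

```python
import hashlib
import json
import time
from typing import Optional

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class TimestampLedger:
    """Toy append-only hash chain; a stand-in for a real blockchain."""

    def __init__(self) -> None:
        self.entries = []

    def record(self, video: bytes, timestamp: Optional[float] = None) -> dict:
        """Append an entry committing to the video hash and a timestamp."""
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else GENESIS
        body = {
            "video_hash": sha256_hex(video),
            "timestamp": time.time() if timestamp is None else timestamp,
            "prev_hash": prev_hash,
        }
        entry_hash = sha256_hex(json.dumps(body, sort_keys=True).encode())
        entry = {**body, "entry_hash": entry_hash}
        self.entries.append(entry)
        return entry

    def verify_chain(self) -> bool:
        """Recompute every link; any edited entry makes this return False."""
        prev_hash = GENESIS
        for entry in self.entries:
            body = {k: entry[k] for k in ("video_hash", "timestamp", "prev_hash")}
            if entry["prev_hash"] != prev_hash:
                return False
            if sha256_hex(json.dumps(body, sort_keys=True).encode()) != entry["entry_hash"]:
                return False
            prev_hash = entry["entry_hash"]
        return True

    def first_seen(self, video: bytes) -> Optional[float]:
        """Earliest recorded timestamp for this exact video, if any."""
        digest = sha256_hex(video)
        for entry in self.entries:
            if entry["video_hash"] == digest:
                return entry["timestamp"]
        return None
```

A doctored version of a clip has a different hash, so `first_seen` can only report when that version was registered, which is the "why is the timestamp so much later than the event" question raised above.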

Drone • December 24, 2016 2:04 AM

What difference does it make if you can “undetectably” fake video and/or audio? By the time the fake content is “out there”, it’s too late! The attention-span of our brainwashed self-absorbed population today is too short to pay attention to whether something is eventually proven false.

Drone • December 24, 2016 2:33 AM

Here are a couple of current articles related to the subject of fake content.

(Note: I don’t normally cite NPR because they are often biased politically, but these articles seem fairly straight to me.)

We Tracked Down A Fake-News Creator In The Suburbs. Here’s What We Learned

http://www.npr.org/sections/alltechconsidered/2016/11/23/503146770/npr-finds-the-head-of-a-covert-fake-news-operation-in-the-suburbs

Excerpting:

Q. What can be done about fake news?

A. Some of this has to fall on the readers themselves. The consumers of content have to be better at identifying this stuff. We have a whole nation of media-illiterate people. Really, there needs to be something done.

On why his fake content is so popular:

“The people wanted to hear this,” he says.

Fake Or Real? How To Self-Check The News And Get The Facts

http://www.npr.org/sections/alltechconsidered/2016/12/05/503581220/fake-or-real-how-to-self-check-the-news-and-get-the-facts

Curious • December 24, 2016 2:58 AM

Slightly off topic perhaps, though I think if people could simply be smarter about things, then at least they’d be better off anyway, fake content in news or not. Because, together with any audio and video, there will be proposed narrative(s) that ought to be properly understood and clarified, and so being dumb or gullible about things paradoxically makes you believe unreal things to be real, and real things to be unreal.

Some bullet points for improving one’s language:

• The use of language often ends up having more than one meaning, try to avoid that.

• Try to avoid being vague by being intentionally and willfully poignant, but not for being ambiguous like in pt 1.

• Be direct, not indirect, be specific and use whole sentences.

• No such thing as a priori knowledge. And can you think a thought? I don’t think so, and neither should you. Cue post modern philosophy.

• Representatives for organizations tend to want to deceive by coming up with words and phrases that act as a substitute, for words they dislike, this is called ironic distancing.

• Circular reasoning is bad. Relying on your ignorance, only for you to remain ignorant.

• “Begging the question” doesn’t really mean that a question ought to be raised. It is a fallacy with language and meaning, in which it is said that one assumes the conclusion of an argument. I.e., making an assumption which leads to the conclusion in an argument, like circular reasoning. A deceit of sorts.

• Individual human beings can be said to have needs, things do not. Also nobody “needs to shut up”.

• A quotation and a paraphrase are two different things, don’t confuse them for one another.

• ‘Problems’ are “problems”, specific things are not problems, neither are specific events as such, all things are just references, not problems.

• The word “absolutely” is fairly useless, relying on your ignorance and idiocy, and will only make you seem like a liar.

• Use a dictionary not only for spelling correctly, but also for learning more about words, even knowing about a word’s etymology will enrich your understanding of language I’d argue.

• A lot of English words apparently originate from Greek and Latin. Often found in the prefix, midfix and suffix of words (e.g., words starting with inter-, an-, in-, com-, sub- as prefix). Such doesn’t necessarily mean anything in particular, but might offer a clue to the general meaning of any word.

• Learn the difference between simply making a point and simply providing an explanation. Being poignant and being clear are two different motivations.

• Honesty is not enough (as if your word was enough), and is meaningless if you are not being sincere (making yourself relevant). And ofc, being sincere is not possible if you are not being honest.

• Argumentation theory is ugly. Making an argument isn’t really about providing you with facts and reason, unfortunately, it is simply about making you believe in things which may or may not be well described by people you don’t even know. How would you know something to be reasonable and relevant, but not meaningless drivel. You will have to decide for yourself unfortunately.

Curious • December 24, 2016 3:03 AM

To add to what I wrote:

As for pt 10, I guess I should have been more clear and instead have written the following:

“The word “absolutely” is fairly useless, relying on your ignorance and idiocy, and will only make you seem like a POTENTIAL liar.” (My emphasis here.)

Trung Doan • December 24, 2016 4:20 AM

This tech may help activists in Vietnam. Currently, the voice of an activist, say, interviewing workers about the state-run union’s collusion with employers can be identified and used by secret police.

albert • December 24, 2016 4:15 PM

@Curious,
OT, indeed, but well worth a read.
. .. . .. — ….

Jeroen • December 24, 2016 5:55 PM

“One source ain’t a source”

Marcelo Menegali • December 25, 2016 10:23 AM

There is no way to avoid this other than tackling the hard task of educating the population about machine learning technology (which makes this possible) and its power.

Drone • December 27, 2016 3:11 AM

@Jeroen, You said: “One source ain’t a source”.

A single source may certainly be a definitive source…

Take for example a paper where the Author cites a publication containing an accepted mathematical proof, or one that objectively details a historical fact. In these cases citing a second source is not only a waste of time – it may be difficult, if not impossible.

MikeA • December 27, 2016 11:22 AM

Then there’s the “CSI Effect”. When the majority of the populace have gotten their “science education” (if any) from CSI or MacGyver, it can be darn difficult to convince a jury of things that are, in fact, true/false in the real world, because they saw a false/true example on TV. On TV, DNA evidence is 100% complete and accurate (and gotten in a few minutes), and one can always “enhance” a 320×240 CCTV image enough to get a perfect rendering of the suspect’s face, and the license plate on the middle car of 10 in the picture.

And to stray into another peeve, 911 (999) always answers promptly, and you can always get a clear signal, even in the desert around Las Vegas (any of them)

Plus the hired-gun “expert witnesses” who will say (from my personal experience) just about anything for pay. How does a juror tell which one is lying most?

Trumpomatica (a Finnish cello metal band) • December 28, 2016 2:39 PM

BTW this technology can also be used for things like:

black-flag propaganda (useful excuses for blackmailing and starting wars)
persistent presidency (e.g. the US President can continue to stay in power after assassination)

Schneier on Security
