Schneier on Security
A blog covering security and security technology.
December 12, 2012
Detecting Edited Audio
Interesting development in forensic analysis:
Comparing the unique pattern of the frequencies on an audio recording with a database that has been logging these changes for 24 hours a day, 365 days a year provides a digital watermark: a date and time stamp on the recording.
Philip Harrison, from JP French Associates, another forensic audio laboratory that has been logging the hum for several years, says: "Even if [the hum] is picked up at a very low level that you cannot hear, we can extract this information."
It is a technique known as Electric Network Frequency (ENF) analysis, and it is helping forensic scientists to separate genuine, unedited recordings from those that have been tampered with.
Dr Harrison said: "We can extract [the hum] and compare it with the database - if it is a continuous recording, it will all match up nicely.
"If we've got some breaks in the recording, if it's been stopped and started, the profiles won't match or there will be a section missing. Or if it has come from two different recordings looking as if it is one, we'll have two different profiles within that one recording."
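The matching step described here can be sketched in a few lines of Python. This is only an illustration of the idea, not JP French Associates' actual pipeline, and all the names and parameters below are made up for the example: estimate the hum frequency once per second from the spectral peak near the nominal grid frequency, then slide that track along a hypothetical logged database to find the best-matching date/time offset.

```python
import numpy as np

def enf_track(audio, fs, nominal=50.0, win_s=1.0, pad=64):
    """Estimate the mains-hum frequency once per window by locating the
    spectral peak within +/-1 Hz of the nominal grid frequency."""
    n = int(fs * win_s)
    window = np.hanning(n)
    freqs = np.fft.rfftfreq(pad * n, 1.0 / fs)   # zero-pad for finer resolution
    band = (freqs > nominal - 1.0) & (freqs < nominal + 1.0)
    track = []
    for start in range(0, len(audio) - n + 1, n):
        spec = np.abs(np.fft.rfft(audio[start:start + n] * window, pad * n))
        track.append(freqs[band][np.argmax(spec[band])])
    return np.array(track)

def best_offset(track, database):
    """Slide the recording's ENF track along the logged grid-frequency
    database (one entry per window); the offset with the smallest
    mean-square mismatch dates the recording."""
    errs = [np.mean((database[i:i + len(track)] - track) ** 2)
            for i in range(len(database) - len(track) + 1)]
    return int(np.argmin(errs))
```

A continuous recording matches at a single offset; a spliced one, as Dr Harrison says, shows different offsets (or no good match) in different sections.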
Posted on December 12, 2012 at 12:59 PM
Most video cameras and audio recorders have some kind of 50/60 Hz hum filter to prevent problems. That would seemingly destroy this source of information, but there are probably harmonics of those frequencies everywhere, and any synchronous motors audible in the background will also produce a harmonic.
The really elegant aspect of this is that there are millions of latent recordings of the power signal that span as far back as there are recordings of high enough quality. That means that you could potentially take news footage from the 70's or even 80's and correlate the hum with other videos from that time.
How long until YouTube builds a map of these signatures and automatically time/date stamps and roughly locates the video?
There is plenty of room for logic bombs in this analysis method. I'm certain that I could edit audio and pass through your screening process. I'm sure you get a lot of false positives as well, and that some edited audio still doesn't flag. I hope more audiophiles get hold of this info for debate, because it's comical.
Surely the hum varies in different locations, based on distance to various generating stations, and perhaps also transformers. The variations in the hum would be a consequence of sub-stations turning on/off and changing their output levels, plus electrical storms in the area, noise from the local smelter starting up a new run, ...
Do they have a log of the hum characteristics for every neighbourhood around the world? "That video which is claimed to have been recorded in Timbuktu does not match the hum database for London." "Well, it wouldn't, would it?"
This seems to be based on a contiguous recording to obtain a large enough sample to work with. Do recording studios record 'whole' songs without any type of breaks or change by the audio engineers for levels of sound in the recording? I seem to remember reading about this sort of thing with regard to cell phone (umm...personal tracking device) conversations to detect the environment surrounding the device. Really? That is a lot of high quality data to record for what seems to be very low value in the long run. Typically, such data is eventually lost, destroyed, or just forgotten. Keeping fingerprints, DNA, and criminal records with accuracy, integrity and availability seems almost too hard already.
By the way, how about a text message that is converted via text-to-speech? What kind of "audio-print" will they get from that? A different one for each playback I would guess.
If you substituted one frequency hum on top of another that would mask the former surely. Or notch filter the audio. It's the same reason people still get busted later for crimes, because they took their phone with them.
A good signal to noise ratio on a recording can make separating this background very difficult. Multi-tracking includes several conflicting signals. Minimal editing to clean up flaws by removing pauses or a speaker saying umm may give a false reading that the editing was done to alter the content.
For those who want to experiment there are various ways of generating an AC signal from another AC signal where the frequencies are not the same.
First of all there is the good old-fashioned rotary generator. An AC motor connected to the input supply drives a shaft connected to a generator which is wound to produce a different frequency. You used to find small versions of these used by ground/maintenance personnel for powering up avionics equipment running at 400Hz from either 50Hz (Europe) or 60Hz (US) mains supply.
However, the variation in frequency of the AC supply would, unless flywheels etc. were used, be seen on the output generator.
A second way to do it is to convert AC to DC and back to AC again. There are various ways to do this, and it is quite often done in in-line UPS systems to prevent phase jumps when switching from mains feed to battery feed, which can cause a few nasty problems that SMPSUs really don't like.
To get the efficiency and purity required you can use a design I came up with quite a few years ago, which is to use Walsh sequences via class-D drivers onto a multiple-primary transformer. The output is a quite pure sine wave with next to no harmonic content until after the 16th harmonic, which can be efficiently removed with a low-power transformer.
A third way to do it is with one of a variety of Variable Frequency Transformers.
The easiest of which to understand is the phase quadrature system. To all intents and purposes a frequency difference is the equivalent of a continuously changing phase difference.
The easiest way to efficiently produce a phase difference is to add two sine waves in phase quadrature (i.e. one sin, one cos); depending on the amplitude ratio between the two, the resulting sum has a phase somewhere between them.
In a three-phase system you can have two types of transformer: parallel or shunt. A parallel transformer has the same output phase as the input (i.e. either 0 or 180 degrees); a shunt transformer produces a phase-quadrature (90 or -90/270 degree) output. If you take the (variable-tap) output of a shunt transformer and feed it into a parallel transformer wired to add the waveforms, then by changing the output taps you can change the output phase of the system.
Now if you design the output of the shunt transformer to be continuously variable (like that of a Variac) and drive the taps with a variable-speed motor, you can continuously change the amplitude at the shunt transformer output, which will cause a continuous phase change at the output of the parallel transformer, which equates to a different frequency.
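The equivalence underlying this, that a steadily advancing phase is the same thing as a frequency offset, is easy to verify numerically. A minimal sketch, with made-up numbers; the angle-addition identity sin(a+b) = sin(a)cos(b) + cos(a)sin(b) is doing all the work:

```python
import numpy as np

fs = 1000
t = np.arange(4 * fs) / fs            # 4 s of samples
f_in, delta = 50.0, 1.5               # shift a 50 Hz "mains" tone up by 1.5 Hz

# Scale the quadrature pair (sin, cos) by cos/sin of a steadily
# advancing phase and sum: this applies the phase, sample by sample.
phase = 2 * np.pi * delta * t
out = (np.cos(phase) * np.sin(2 * np.pi * f_in * t)
       + np.sin(phase) * np.cos(2 * np.pi * f_in * t))
# By the angle-addition identity, out == sin(2*pi*(f_in + delta)*t)

spec = np.abs(np.fft.rfft(out * np.hanning(len(out))))
peak = np.fft.rfftfreq(len(out), 1.0 / fs)[np.argmax(spec)]   # -> 51.5 Hz
```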
As the transformers can be made 99.9% efficient or better, the overall system will be over 99.8% efficient in power conversion. Which is kind of handy when you are talking of systems working in the 25MW and up power range, as 50KW or less of heat is still quite a bit to get rid of (about what you would use to heat a six-and-a-half-thousand-square-foot home or office in a NYC winter).
Now obviously you could use one of the above methods to power the lights and equipment in a home garage recording studio, or a house or apartment block if you so desired, and tailor the frequency changes to match those of some point in the past you have recorded somewhere. (You could easily do it with a gumstick computer and an SD card for storage, taking a reading every half second or so; it would only need 0.165Mbyte/day of storage, or around 17 years/GByte.)
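That storage arithmetic checks out, assuming one byte per half-second reading:

```python
# One 1-byte grid-frequency reading every half second, as suggested above:
readings_per_day = 2 * 60 * 60 * 24                # 172,800 readings/day
mib_per_day = readings_per_day / 2**20             # ~0.165 MiB/day
years_per_gib = 2**30 / readings_per_day / 365.25  # ~17 years per GiB
```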
Obviously you would have to plan ahead, but for entirely different reasons I have been recording the UK mains frequency delta for several years.
"Obviously you would have to plan ahead, but for entirely different reasons I have been recording the UK mains frequency delta for several years."
Okay, Clive wins this thread! On to the next one!
Anyone with access to this database or any equivalent database can prove a fake recording is fake.
And anyone with access to such a database will be able to "watermark" a fake recording, so it passes the test.
Since the database is generated from ambient EMF, anyone with an interest in the subject can make their own database.
So anyone who wants to can produce fake recordings which will pass this test, and be "proved" authentic.
This is useful to catch any edited audio document which forgot to adjust this watermark, or to automatically add tags to innocent audio documents.
This cannot be used to authenticate authentic recordings.
My remarks are also valid for face recognition software working on images.
Nice bookmarks, but they lack any title or keyword for each link. I wonder if someone has a public-accessible searchable archive of all linked pages.
Would this work on a recording that has been compressed using a lossy codec like mp3?
@Guy: There are almost always harmonics of 50/60 Hz, yes. Indeed, these are quite a plague in the pro audio world, especially for guitarists. If it were just pure 50/60 Hz, I doubt anyone would care that much, but 100/120 and 200/240 are quite audible and annoying in recordings.
Unfortunately, recordings from previous decades are not eligible for this technique unless somebody recorded the frequency fluctuations in the power-lines to compare them with.
@Glitch Logic: I agree, there appear to be some logic bombs, but not what you seem to be alluding to (unless I misunderstand you). The technique is not as simple as proving that the underlying 60 Hz hum is contiguous; rather, it maps fluctuations in that hum to known fluctuations that happened in the past. The biggest logic bomb I see here is that a hum could be added after editing where no hum existed before (or to mask an existing one). It seems this technique is best suited to proving a recording is inauthentic or has been altered rather than to proving it's valid.
@John Mcdonald: I think you should read the cited article. It explains that the variations they are measuring are the same (at least for the purposes of this technique) within any given power grid. A separate reference recording would be required for each grid.
@OnTheWaterfront: In principle it strikes me that this technique could work on a lossy format, but whether or not it does would depend on many factors, including the quality of the original recording and the quality of the lossy recording.
This sounds a bit like work I wrote about for Scientific American a few years ago: a team of mathematicians and sound engineers were able to reconstruct some live recordings of Woody Guthrie from old wire. One of the keys, ISTR, was using background electronic hum to enable them to correct the distortion in the aging, fragile, damaged physical medium. (Ah, yes: piece is at http://www.scientificamerican.com/article.cfm?...
Given that large-scale grids are ostensibly on their way out and "smart power" is on its way in, this seems like a technique that will be limited in time and space, rather like Sherlock Holmes's encyclopedic catalogs of tobacco ash. I also wonder just how universal the fluctuations are, given that power systems are designed to damp them. Especially in the close neighborhood of generators and large loads you should have (and I mean "should" in the normative sense if the power grid is properly designed) fluctuations that do not quite match those of the grid as a whole.
@tha - December 13, 2012 4:13 AM
Nice bookmarks, but they lack any title or keyword for each link. I wonder if someone has a public-accessible searchable archive of all linked pages.
Indeed. At the top of the page in the CHANGELOG section you'll note:
4. Removed most titles/descriptions of links to reduce clutter & ease use. While this may make it more difficult to find exactly what you're after, it's cleaner, leaner, and easier to navigate through. This is especially true for the guys maintaining this document. There is some labeling/titles here and there, but most of the heavily verbose titles/descriptions of various entries have been removed.
Now scroll down to the bottom and note the previous versions with URLs of the document. In the version just prior to 6000, there were many titles to the links with descriptive information.
(In version 6000) This information was stripped for easier parsing. The idea is: with regular use titles aren't required, the user will become familiar enough with the sites in the document they won't need titles. Finally, if you were to browse the document with titles for each link it becomes a cluttered mess. It's also easier to maintain without the titles which get in the way.
In the next version, the 'Random Links' area will be closed and each link within it moved to newly titled and properly sorted sections. The next version will contain many, many more sections and a ton of new links.
If you wish to discuss the document with the authors, post in the Open Discussion area with the document title here:
The site requires client use of Tor to access.
To underscore what some other commenters have said or implied ... and to contradict a few others:
1) This technique could be used to make an edited recording appear authentic (by removing any original hum, and adding a signal generated from the mains frequency database).
1a) This technique could be used to make an otherwise un-edited (that is, real-time) recording appear to have been recorded at some different date/time, by a variant of the above procedure.
2) This technique could be used to make an authentic recording appear to have been tampered with, by a variant of the above procedure using database segments from several disjoint times.
3) Using a recording (or post-processing) technique that introduces random "speed" variations could readily produce a hum signal that would not correspond to the mains frequency history. For example, an ancient battery-powered reel-to-reel :) Almost all recording today is done by digital systems with excellent time stability.
In other words, it is not a reliable system for either proving or disproving authenticity, unless the provenance of the recorded data is such that the required tampering were impossible -- but in that case, who needs the hum test?
It's easy to see the dangers in relying on a technique for forensic purposes, that can be gamed in so many ways.
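The first scenario above, removing the original hum and adding a synthesized one, can be sketched concretely. This is a toy numpy-only illustration, with the function name and parameters invented for the example: notch out the band around the nominal mains frequency in the FFT domain, then mix in a synthetic hum whose instantaneous frequency follows any profile you like, e.g. values replayed from a logged grid-frequency database.

```python
import numpy as np

def swap_hum(audio, fs, freq_profile, nominal=50.0, width=1.5, level=0.01):
    """Strip the band around the nominal mains frequency with a crude
    FFT-domain notch, then mix in a synthetic hum whose instantaneous
    frequency follows freq_profile (one value per sample)."""
    spec = np.fft.rfft(audio)
    freqs = np.fft.rfftfreq(len(audio), 1.0 / fs)
    spec[np.abs(freqs - nominal) < width] = 0           # notch out original hum
    clean = np.fft.irfft(spec, len(audio))
    phase = 2 * np.pi * np.cumsum(freq_profile) / fs    # integrate f(t) -> phase
    return clean + level * np.sin(phase)
```

Run against an ENF check, the forged recording would carry whatever frequency history was fed in, which is exactly why the technique can disprove authenticity far more safely than it can prove it.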
@MarkH: "2) This technique could be used to make an authentic recording appear to have been tampered with, by a variant of the above procedure using database segments from several disjoint times."
Impossible if the original recording remains available.
I don't think it would be useful to put back all titles. But a separate public searchable database would be good.
I tried the requests > and > on search engines, exploiting the feature "These terms only appear in links pointing to this page", but it did not work. Viduthalai is a word appearing in https://en.wikipedia.org/wiki/Category:Paramilitary_organizations listed in these bookmarks.
PS: https://en.wikipedia.org/wiki/Category:Christianity is also on that list ??
The two request were filtered out in my messages. Please find them in next two lines:
"HUGE Security Resource" Viduthalai
@satellite bingo: "The idea is: with regular use titles aren't required, the user will become familiar enough with the sites in the document they won't need titles."
I usually bookmark a site once I become familiar with it. But I don't expect to become familiar with the 2210 links.
Setting up a search engine, for example www.seeks.fr, based on the 2210 links is a possible solution. A hierarchy à la Yahoo may also work, but seems more complex and difficult to achieve with a satisfactory granularity.
This particular application appears to only be applicable to the UK. From the BBC article:
In the UK, because one national grid supplies the country with electricity, the fluctuations in frequency are the same the country over. So it does not matter if the recording has been made in Aberdeen or Southampton, the comparison will work.
Elsewhere around the world, it is slightly more complicated because some countries can have two or more grids. But in these cases, all it takes is for the hum to be continuously logged on each power system and for a recording to be compared against each of them.
The USA has three primary power grids with DC interconnects, so one can't apply this technique just anywhere in the USA unless the data has been recorded in all three sections.
And any location that is "off-grid" won't be traceable at all. So just do your recording at a solar-powered site, and point it out for plausible deniability.
Here's a site that tracks the Continental 50Hz grid with real-time measurement: http://www.mainsfrequency.com/
Here's a site that tracks the US and other grids: http://fnetpublic.utk.edu/index.html Fascinating analysis, which I think indicates that such a "database" may not be usable forensically in the USA because of the large grid and local variations. For example, see the live frequency gradient map at the above link and across the grid following perturbations -- like a gigawatt nuclear power plant going off-line. See http://fnetpublic.utk.edu/sample_events.html for some interesting examples.
If it comes from the same recording, I can guarantee with 100% confidence that I could fool these guys because I edit down to the Hz, and even match the DC Offsets so that they cannot detect single sample voltage disparity. :)
The only way you would be able to tell the difference would be by variations in unavoidable natural factors--voice fatigue, inconsistency in inflection, or speech tempo.
As long as the background noise is consistent, you can forget about detecting the splice points in a 44.1 kHz recording.
If it comes from a different recording, it's easy to tell the difference anyway. Spectral analysis would tell me in like 3 seconds.
I see there is a lot of interest in understanding how one could remove/mask/forge an ENF timestamp. Researchers at UMD presented their work in this space at this year's CCS:
I specifically asked whether their detection approach would be resistant to lossy compression (e.g. mp3), and the authors' answer was hopeful, i.e. they believe that their approach would still work when the audio in question is recorded using a lossy encoder.
Isn't this how the Watergate tapes were determined to have been tampered with?
We have a tape that was presented to us by a "police officer" indicating that my boyfriend was trying to set me up or was an informant for something I never did. I'm trying to figure out if this tape is legit or if it was edited at all? Is there any advice you can give me on figuring this out? Also, my aunt has made copies of this tape......can the copies be checked for alterations??
Schneier.com is a personal website. Opinions expressed are not necessarily those of Co3 Systems, Inc.