Evidence that the NSA Is Storing Voice Content, Not Just Metadata
Interesting speculation that the NSA is storing everyone’s phone calls, and not just metadata. Definitely worth reading.
I expressed skepticism about this just a month ago. My assumption had always been that everyone’s compressed voice calls is just too much data to move around and store. Now, I don’t know.
There’s a bit of a conspiracy-theory air to all of this speculation, but underestimating what the NSA will do is a mistake. General Alexander has told members of Congress that they can record the contents of phone calls. And they have the technical capability.
Earlier reports have indicated that the NSA has the ability to record nearly all domestic and international phone calls — in case an analyst needed to access the recordings in the future. A Wired magazine article last year disclosed that the NSA has established “listening posts” that allow the agency to collect and sift through billions of phone calls through a massive new data center in Utah, “whether they originate within the country or overseas.” That includes not just metadata, but also the contents of the communications.
William Binney, a former NSA technical director who helped to modernize the agency’s worldwide eavesdropping network, told the Daily Caller this week that the NSA records the phone calls of 500,000 to 1 million people who are on its so-called target list, and perhaps even more. “They look through these phone numbers and they target those and that’s what they record,” Binney said.
Brewster Kahle, a computer engineer who founded the Internet Archive, has vast experience storing large amounts of data. He created a spreadsheet this week estimating that the cost to store all domestic phone calls a year in cloud storage for data-mining purposes would be about $27 million per year, not counting the cost of extra security for a top-secret program and security clearances for the people involved.
I believe that, to the extent that the NSA is analyzing and storing conversations, they’re doing speech-to-text as close to the source as possible and working with that. Even if you have to store the audio for conversations in foreign languages, or for snippets of conversations the conversion software is unsure of, it’s a lot fewer bits to move around and deal with.
And, by the way, I hate the term “metadata.” What’s wrong with “traffic analysis,” which is what we’ve always called that sort of thing?