Is the U.S. Government Recording and Saving All Domestic Telephone Calls?

I have no idea if “former counterterrorism agent for the FBI” Tom Clemente knows what he’s talking about, but that’s certainly what he implies here:

More recently, two sources familiar with the investigation told CNN that Russell had spoken with Tamerlan after his picture appeared on national television April 18.

What exactly the two said remains under investigation, the sources said.

Investigators may be able to recover the conversation, said Tom Clemente, a former counterterrorism agent for the FBI.

“We certainly have ways in national security investigations to find out exactly what was said in that conversation,” he told CNN’s Erin Burnett on Monday, adding that “all of that stuff is being captured as we speak whether we know it or like it or not.”

“It’s not necessarily something that the FBI is going to want to present in court, but it may help lead the investigation and/or lead to questioning of her,” he said.

I’m very skeptical about Clemente’s comments. He left the FBI shortly after 9/11, and he didn’t have any special security clearances. My guess is that he is speaking more about what the NSA and FBI could potentially do, and not about what they are doing right now. And I don’t believe that the NSA could save every domestic phone call, not at this time. Possibly after the Utah data center is finished, but not now. They could be saving all the metadata now, but I’m skeptical about that too.

Other commentary.

EDITED TO ADD (5/7): Interesting comments. I think it’s worth going through the math. There are two possible ways to do this. The first is to collect, compress, transport, and store. The second is to collect, convert to text, transport, and store. So, what data rates, processing requirements, and storage sizes are we talking about?

Posted on May 7, 2013 at 12:57 PM • 84 Comments


Dilbert May 7, 2013 1:06 PM

Bruce, I know Tim personally and I believe he knows exactly what he’s talking about. He shared these same views with me roughly 10-12 years ago. This is likely an extension of the old Echelon program. I doubt they’re storing audio; more likely using voice-recognition and dumping this all to text. Sure, they could be doing the old keyword flags but I doubt that (too much noise). I expect it’s all dumped into massive databases for after-the-fact investigation.

For more on the capabilities, just do a search on “echelon semantic forests”

Miramon May 7, 2013 1:29 PM

I’ve seen a great many technologically and physically impossible claims of this sort over the years from journalists and scholars who really should know better.

This one is on the edge of possibility, anyway. Anyone know how many ordinary landline calls are still based on SONET, IN, and stuff like that? If it’s still most of them, then I think it would be a very difficult feat.

But you could at least imagine some cabal of carrier CEOs assembling in a secure undisclosed location shortly after 9/11 to come up with some kind of massive investment in switching, signalling, and standards to send everything over to some NSA datacenter.

Now that I say it out loud (or in print) it seems even less likely, though. Too many technical people would be involved — hundreds at least — and it wouldn’t have stayed a secret.

Adrian May 7, 2013 1:32 PM

Remember whenever DHS would change the alert color they’d talk about the “level of chatter”? I always assumed that meant they were keyword-scanning vast amounts of communication and watching for frequency and patterns of those keywords.

I, too, am skeptical that any agency is (currently) keeping audio recordings or even auto-generated text transcripts of all phone calls, but I wouldn’t be surprised if a majority of calls are scanned in real time to count instances of keywords. What else would those tap rooms in the telco offices be used for?

As for phone call metadata (who called who, when, and how long), don’t the telcos have all that for billing? Is it hard to believe that the feds have access to that?

bcs May 7, 2013 2:02 PM

It should be possible to back-of-the-envelope the amount of bandwidth (and fiber optic cable) needed to do it. As for running voice-to-text near the call and archiving that, you still need to get the audio to a compute cluster of some kind to do the data-crunching. I wonder which would be harder to hide, a bunch of small clusters or a bunch of fiber trunks?

OTOH, how much cheaper would it be to do just selected (e.g. international) calls?

amanfromMars May 7, 2013 2:06 PM

My guess is that he is speaking more about what the NSA and FBI could do, and not about what they are doing.

Hi, Bruce,

Did you mean to say the probably more truthful …. My guess is that he is speaking more about what the NSA and FBI are trying to do …. with us fully understanding why you chose not to [He who pays the piper, calls the tune and all that sort of stuff]

Every man and his dog know that if something can/could be done, it will be done, and in as quick a time frame as is possible. It is a weakness in humans who think themselves smart, and is something which can be easily exploited, …. and therefore is being exploited.

Indeed, here is a tale of something similar being practised by a western ally and ECHELON partner, albeit they be way eastward? ……..

Anon May 7, 2013 2:38 PM

Echelon + Carnivore + Room 641A + OCR = All your calls are belong to us…

Anon May 7, 2013 2:45 PM

oops.. doing too many things at once…

ECHELON + Carnivore(NarusInsight) + Room641A + Speech to Text = All your calls are belong to us…

wizardpc May 7, 2013 3:18 PM

Wasn’t Speedbump on the watch list?

Recording all calls from folks on the watch list would be a less-spectacular and more believable technical feat.

cdmiller May 7, 2013 3:23 PM

Interesting take given the earlier thoughts on being able to deal with all the data streaming in from a ubiquitous Google glass deployment…

Alex May 7, 2013 3:41 PM

Oh, c’mon now… I think all of us who monitor this blog are aware of what today’s technology is capable of, as well as its limitations.

First, anyone who is familiar with telecom (and the inevitable wiretaps) is aware of how fragmented the industry and network has become. Want to become a CLEC? File your app with the State, set up a few Asterisk servers and you’re now a telephone company. Sure, it’s not difficult to sniff SIP traffic on an isolated network, but it’s far more difficult in the wild where a carrier might be peering with >50 proxies from various companies. All of my company’s communications travel over encrypted links and carry combined voice/data. The vast majority of our WAN is over microwave links and those are also encrypted, beyond the encryption we already use. We’re not being paranoid, just trying to maintain compliance.

Second, assuming you’re targeting someone specific, the fragmentation of the industry and wide scope of technologies used today isn’t going to make it easy. POTS, land lines, mobile phones (CDMA, GSM, multiple carriers), VoIP, Skype, encryption, etc. Just TRY to follow one person. Chances are you’re not going to be able to intercept 100% of what’s said due to the various pathways involved. Even just tracking a mobile phone in the US is an issue due to incompatible technologies and roaming agreements. Unless you’re going to tail the suspect with your own portable mobile tower like the FBI has used on occasion and FIXED locations, this isn’t much of an option. If you’re being non-specific about your subject/target, then you’re only going to get bits and pieces of a person’s conversations and things taken out of context results in very bad data.

Third, the volume of data. I know there are some very efficient audio codecs available today, especially compared to the aging G.711 workhorse. Just looking at LTE cell towers, from what I gather they’re running ~300Mbps to each of those. Where are you going to put all of that? Even with compression, where would you possibly store it?

Third, part 2, turning large volumes of data into meaning. Ever try sniffing a 100Mbps pipe running full tilt? Even if you’re just looking at data traffic and have an idea of what you’re looking for, the sheer volume of data’s a nightmare to go through and find specific things. As part of my job, I’m routinely going back through old hard drives, old servers, and images thereof, trying to put together patterns, timelines, and extract meaningful data. Even with the best tools I have, ultimately it requires a human being to determine what’s valuable and what’s useless. And that’s looking at data, not voice.

Fourth, where are the interconnects? Yes, there are interconnects between gov’t and telecoms like Room 641a. These aren’t just conspiracy-nut wet dreams/nightmares. BUT they don’t exist everywhere in the system. My office connection’s fed off one of the backbones, and my backbone provider’s very open and up-front about things. I have full access to the racks and even the network equipment itself. Nothing unusual going on. The people who run that backbone are also strongly against most of the legislation from Washington such as SOPA & the Cybersecurity Act of 2012. If anything they’re a bit fanatical about it. Whoever knew guns were essential datacenter kit?

Fifth, speech to text sucks. Don’t believe me? Try using Siri or Google Now while walking down a busy street or in a car with a window down. Even on a good day these systems struggle to make sense of things. Or for more frustration, try calling a large bank or utility company which asks you “please say billing, new customer, technical support.” It can’t even understand those four choices reliably, why would it possibly be able to understand a full conversation? I’ve been dabbling with speech to text since the 1990s and while it’s gotten quite good, I’ll take a human ear anytime over it. We do use it in our office for various things, but so far it’s not good at anything which isn’t close-mic’d in a quiet environment. Add in mobile phone codec compression and things get even less reliable.

So what ARE they using? In many respects, they’ve already told us how they operate:

So that’s what they’re doing…. and I do have strong reason to believe they’re doing more than what they say they are. BUT we also know what the limits of technology are. Are they listening? Sure. To everyone? No. All the time? No. Do they wish they could? Absolutely, and they’ll keep trying until technology gets to a point where they can.

Anonymoose May 7, 2013 4:45 PM

They could be saving the metadata, but I’m skeptical about that too.

I understand you’re likely skeptical, but encourage you to think of it in this context:

All of the metadata is already stored by every telco and ISP for administrative (billing, marketing) purposes. In places like Europe there’s legal requirements to archive it for years.

That this metadata could be saved and indexed with a government budget is not surprising. The technical capability exists, and the storage is now a commodity item that can be purchased from Amazon.

Neither is it surprising that a provider would roll over rather than face pressure in areas in which they have any difficulty complying.

The equivalent of an archived pen trace is not only valuable for COMINT and network analysis, but also for establishing patterns of identity when crossed with other sources (timestamps are invaluable).

Speaking from past experience, once you reach a certain level of support with vendors, they just send you raw data dumps of networks (IP & Telephony), whether such a thing would or should be legal or not. Mostly it was out of sheer laziness.

I suspect that if I had a gold badge from a three letter agency behind my name, they would have been more than happy to shove a server in a closet as long as it did not interfere with their NOC and I paid for the rack space.

Clive Robinson May 7, 2013 5:17 PM

Perhaps the question that should be asked is,

What do they need to store?

Followed by the question,


That is, how much fidelity is required for each recording: that of basic intelligibility, or that for presentation in a court of law?

For the over one hundred languages spoken on a fairly regular basis the idea of “speech to text” is a bit of a misnomer; it’s more likely to be “speech to phoneme”, along with other bits of information indicating the way the phonemes are said, such as attack and decay times etc.

We know the likes of the NSA were the inventors of many of the basic codecs used to reduce high bandwidth speech several decades ago, so perhaps it would not be too surprising if they had systems capable of analysing the call in progress and using some form of variable rate compression based on the intelligibility of the phonemes, and likewise deducing the language(s) being spoken etc.

We know that plain spoken text can be compressed to as little as 300 bits/sec if your interest is spoken content and basic emotion, not speaker identification or background information/noise. In some cases a good deal less.

We also know that the NSA have in the past been the world leaders in storage of data.

But even with good compression algorithms there is still very definitely a quality-v-storage issue. But there is potentially a way around this, and it’s to make the compression / quality time based. That is, the older the recording the less use it is for legal or other purposes, so the lower the quality of recording required and the more it can be compressed, down to what is essentially just a transcript. The simplest way to do this would be in effect to make all the various quality levels of recordings at the same time and then at certain time intervals just delete the higher quality recordings unless flags etc. had been set.
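The time-based scheme described above can be sketched as a simple retention rule. This is only an illustration: the tier names, bit rates, and retention periods are hypothetical (64,000 bits/sec is the G.711 rate, 300 bits/sec is the figure mentioned above, and the middle tier is invented), and nothing here is a claim about how any real system works.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tier:
    name: str
    bits_per_second: int
    keep_days: Optional[int]  # None means "keep indefinitely"


# Hypothetical tiers, all recorded at the same time as the comment suggests.
TIERS = [
    Tier("full", 64_000, 30),      # G.711-rate audio, assumed kept 30 days
    Tier("speech", 2_400, 365),    # invented low-rate tier, assumed kept 1 year
    Tier("minimal", 300, None),    # near-transcript quality, kept indefinitely
]


def tiers_to_keep(age_days: int, flagged: bool, tiers: List[Tier] = TIERS) -> List[str]:
    """Return the names of the quality tiers still retained for a recording."""
    if flagged:
        # A flag freezes every quality level regardless of age.
        return [t.name for t in tiers]
    return [t.name for t in tiers
            if t.keep_days is None or age_days <= t.keep_days]


if __name__ == "__main__":
    for age in (10, 100, 1000):
        print(age, tiers_to_keep(age, flagged=False))
```

The point of the design is that deletion is the only operation needed later; no re-encoding of old audio is required because every tier already exists from the moment of capture.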

We also have reason to believe that all the above is currently achievable for a very high number of calls.

However, as others above have noted, there are issues around getting at the actual calls.

Let’s assume for argument’s sake that (as in the UK) the bulk of calls are routed not along the old System X / Signaling System Seven but over TCP/IP. This means that a call can be routed around any capacity issues, and a local call of geographically just a mile or so could, unnoticeably to the call participants, be routed up to 100 miles around the network.

This gives an approximate range limit of 50 miles to a “Tap Point”, which for the bulk of US citizens is well within the range to their nearest major city.

Whilst this would on the surface of it be a major problem, and would have been a few years ago, it’s well within the capability of systems you can go and buy from the likes of Cisco.

But whilst this may be possible at the lower levels, what of the exchanges etc. that sit above these layers? How do you get to correlate the actual switched conversation at the CO level with the individual packets on the underlying TCP/IP level? To do this would require specialised equipment which, whilst not impossible to make, would require a considerable number of specialists.

The reason for this is that although protocols are supposedly standard and open, in practice they may not be. That is, whilst an exchange manufacturer will support open standards on its ports, the manufacturer is likely to have “augmented” the standards for when its equipment is talking to its own equipment rather than that from another manufacturer (this has been seen in the past). Further, as has been pointed out in recent times, a lot of the exchange equipment used in the US and most other places around the globe is actually manufactured by just a small handful of manufacturers, at least two of which are Chinese.

So whilst it’s technically possible to do the mass surveillance of phone conversations, I suspect it’s only up to a certain percentage of citizens around the likes of major cities etc.

The real question is, if it is going on, how much manpower is required, and if it is in the hundreds or above, how come nobody has really had a fit of conscience and broken the secrecy?

BT May 7, 2013 5:21 PM

It is in the National Security State’s best interests to imply that it can do more than what it actually admits doing even if it can’t really do what it wants to imply, and I wouldn’t put it past certain TLA’s to feed deliberate misinformation to a former employee in order to make sure it gets into the press.

Regarding data requirements: GSM-encoded voice mail, Asterisk’s documentation informs me, takes approximately 500 megabytes per 2,000 hours. I did a quick back-of-envelope on how many disk drives would be required to record 100 million people’s phone calls for a year, and we’re talking about around 14 million disk drives, or about the output of a nuclear power plant just to power them, nevermind the air conditioning, the servers built around them, and so on and so forth. Not happening.
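A disk-count estimate of this kind depends heavily on the inputs, so it may help to have the arithmetic in a form where they can be changed. The sketch below is not the calculation behind the figure above; the talk time per person (100 hours a year) and the 1 TB drive size are assumptions, and two per-hour rates are offered: the one implied by the quoted 500 MB per 2,000 hours, and the nominal GSM full-rate bit rate of 13 kbit/s.

```python
import math

PEOPLE = 100_000_000              # from the comment
HOURS_PER_PERSON_PER_YEAR = 100   # assumption: roughly 500 minutes a month
DRIVE_BYTES = 10**12              # assumption: 1 TB drives

# Rate implied by the quoted "500 megabytes per 2,000 hours".
QUOTED_BYTES_PER_HOUR = 500 * 10**6 // 2000        # 250,000 bytes/hour
# Nominal GSM full-rate codec: 13 kbit/s.
GSM_FR_BYTES_PER_HOUR = 13_000 * 3600 // 8         # 5,850,000 bytes/hour


def drives_needed(people: int, hours_each: int, bytes_per_hour: int,
                  drive_bytes: int = DRIVE_BYTES) -> int:
    """Whole drives needed to hold one year of calls, with no redundancy."""
    total_bytes = people * hours_each * bytes_per_hour
    return math.ceil(total_bytes / drive_bytes)


if __name__ == "__main__":
    for label, rate in (("quoted rate", QUOTED_BYTES_PER_HOUR),
                        ("GSM full rate", GSM_FR_BYTES_PER_HOUR)):
        n = drives_needed(PEOPLE, HOURS_PER_PERSON_PER_YEAR, rate)
        print(f"{label}: {n:,} drives")
```

Replication, indexing, and servers are left out, so real hardware counts would be some multiple of whatever this returns.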

As far as voice transcription, the NSA doesn’t have any better technology than the private sector nowadays, and someone already mentioned how pathetic Siri etc. are. I don’t doubt that they try to do machine transcriptions and stash them back, including the phoneme-based compression mentioned above, but that’s a Hard Job, and I doubt they’re any better than Siri.

Now, digital text communications… I presume that email and SMS and IM are all being logged and stored somewhere. They are compact and easily stored. Same deal with “pen register” data (who called whom when). But voice? That’s a Hard Job, and it would surprise me if anything other than “flagged” calls are being recorded continuously.

David Golumbia May 7, 2013 5:30 PM

As far as voice transcription, the NSA doesn’t have any better technology than the private sector nowadays

is the source for this (a) you work there or for a contractor and are releasing classified information on a comment on a blog post; or (b) you don’t work there but have access to accurate classified information on an organization so devoted to secrecy that even its congressional oversight committees can barely do their constitutional duties?

Ian Farquhar May 7, 2013 5:49 PM

As I said on Twitter, this isn’t an issue of technology but the economics of budgets being a zero sum game.

Yes, this is probably technically possible for some very substantial cost.

Is the return on the investment of that cost greater than any of the other possible ways of spending that money?

That’s a hard case to make, although economically, it’s very easy to turn events like the Boston Marathon bombing into pretty much any figure you need. Whether that accumulation of costs represents real impact is an open question.

But there’s another cost here too, and that is the risk inherent in extra-legal interception. However in the last decade, public concern over such investments has almost died, and I am unclear why. The spirit of the Church Commission is well and truly dead, so maybe this is not such a huge consideration anymore.

Ultimately, my point is this: I have no doubt that many in government and law enforcement would want this facility. I believe there are many other things they want too. I just doubt that the ROI on this would get it over the budget bottom line, at least in the way it has been described.

paranoia destroys ya May 7, 2013 5:56 PM

I doubt this is anything more than a bluff.

After all, the government hasn’t been able to track and shut down Rachel from Cardholder Services.

Ian Farquhar May 7, 2013 6:01 PM

One further point I would stress about this whole issue.

In my experience, law enforcement and intelligence habitually allow most people to over-estimate their capabilities.

Having targets over-estimate your capabilities can be strategically sensible. At the very least, your opponent will incur extra cost (especially operationally) in higher levels of COMSEC. This can significantly reduce their operational effectiveness. Think especially about long term operations (terrorism), or large numbers of very unsophisticated operators (drug enforcement).

This consequently allows the agency to target communications using higher security levels, and invest extra effort in compromising those communications (eg. trojanization, side-channel attacks, lots of cops turning up at people’s homes).

If what Clemente says is untrue, I would suspect there would be many in the USG happy about what he has said.

name.withheld.for.obvious.reasons May 7, 2013 6:54 PM

I believe a number of switch and network analyzers, in the class of Carnivore, can be found on the vendors’ sites.

The specifications include rate of capture, storage, and DPI capabilities.

An informed explanation of how this is deployed might be useful. The way I think about it is from a distribution fabric that allows aggregation; this can be down from OSI layer two all the way to layer 5. Several years ago (okay, 1999) while at a NAP (interviewed for a network security job) they showed off their logging capabilities. I won’t go into the details, not here, but there was sufficient processing and storage for what could be termed snapshot logging of data (including full conversations). I am sure there is someone on this blog that is a telco operative; ATM is probably the best place to perform traffic collection (in the CO just dup packets).

Ano May 7, 2013 8:06 PM

Some of you are terribly, terribly naive.

Look up Bill Binney and the Utah Data Center.

This CAN be done and IT IS being done. Just because you wish it weren’t so doesn’t make it not so.

whodunit May 7, 2013 8:18 PM

Wikipedia describes the basic requirements of the Communications Assistance for Law Enforcement Act (CALEA) wiretap system if you want to know what can/is being done by the telcos.

The practical reality is that to tap ALL calls, the telco would need to provide connectivity to the CALEA system roughly equal to the total busy hour call capacity in each switch at each location across the country. Since that would effectively double the Telco’s operation cost, I really doubt that it will ever happen — and you will see it in your bill if it does.

Stratego May 7, 2013 10:08 PM

Bruce, all,

If you haven’t seen them I recommend you watch everything that Bill Binney has said publicly over the last several years as well as Thomas Drake (executive at NSA).

William Binney at Defcon:

Thomas Drake on RT:

Laura Poitras interviewing William Binney “The Program”:

Also, I would recommend everyone read: “The Shadow Factory” by James Bamford

I don’t know if the NSA is monitoring every phone call. I know how they work and find it difficult to believe that they are monitoring everything. It’s the US Government after all. However, I certainly know for a FACT that they are illegally monitoring US person’s calls, and dragging in loads of US citizen and US person data, public and private.

NSA has had tons of problems over the last few years simply with data center power. They can’t get enough power from the local utilities, hence Bluffdale, UT and a temporary Austin, TX location.

Either way, it is pure TREASON. I don’t say that lightly. Some may think it’s cliche, but I did take an oath to a piece of paper that nobody seems to give a damn about anymore, and have had friends and family killed over this insanity.

I just hope the rest of those/you that aren’t awake start catching up quick. It’s much worse than you think, less in some ways and much worse in areas you had likely not considered.

If the “elected” and appointed officials of the US .gov from the last 2 to 3 decades had been put to a fair trial like Nuremberg, most wouldn’t be with us any longer. It’s really, really bad. wake up. please. If you’re already awake, well, help spread the truth.

Hopefully we can avert another civil war, but honestly it’s not looking good.

kingsnake May 7, 2013 10:20 PM

Remember: Never put into electronic form that which you do not wish others to know.

Drew May 7, 2013 10:37 PM

OK Stratego, let’s bring the conversation back to reality here.

William Binney, what are his bona fides? Everything that I could find is that he was a mathematician, so it’s likely that he worked in cryptography, and not likely that he worked in SID or IAD where he would have access to the compartments that dealt with SI beyond having SI/TK, which just about EVERYONE in the IC has. Whatever bone he has to pick, let him pick it, but seriously, do the back of the envelope. Oh wait, yeah, NSA has a secret nuclear reactor and acres of hard drives under Meade where they’re recording all of this stuff, and the secret army of millions of analysts to actually analyze the data.

RT!? Seriously, the Kremlin’s propaganda mouthpiece? I actually rank them BELOW FOX news.

So YOU know how NSA works? What are your bona fides?

In fact what makes any one of you think that you are special enough to even MERIT having your conversations recorded.

Never mind, there is a vast global conspiracy of tens of thousands of people and not ONE of them has actually said anything beyond conjecture.

Bruce Schneier May 7, 2013 10:40 PM

Interesting comments. I think it’s worth going through the math. There are two possible ways to do this. The first is to collect, compress, transport, and store. The second is to collect, convert to text, transport, and store. So, what data rates, processing requirements, and storage sizes are we talking about?

Nick P May 7, 2013 10:57 PM

I think the thread is missing a greater point. We already have previous testimony on their capabilities. All of those point toward features that are targeted. They may be targeted for certain people, locations, words, etc. They’ve known since Echelon that the data is a firehose they can’t drink from. Whatever they store, they probably filtered first.

Jason May 7, 2013 11:24 PM

You’re right: to decide whether the government can record everybody’s phones, you need to do the math. But it’s a matter of bucks, not bits.

The NSA has an estimated budget in the ballpark of $10 billion. (Estimated because the actual number is secret.) The budget of Google, Inc. is around $50 billion. Google is the world’s most successful surveillance corporation, but their face and voice recognition systems fall apart at the slightest hint of noise or shadow, they won’t allocate more than 5 gigs of storage to me unless I pay them, and their satellite surveillance of my house is a year old and 1/2 meter resolution.

If you expect the NSA to do better, they must be spending money at least 5 times more efficiently. But no matter how sinister they are, they’re a government agency, so that ain’t bloody likely.

Andy May 8, 2013 1:16 AM

Neil tried to run some numbers. I am not convinced that I’m not missing something, but I think you can make a 1-slot card that installs in each base station, compress voice there using a Speex-quality codec, and commandeer 1 Mbps of uplink to get the data to a fairly small storage facility. If it’s spindles, only a few racks need to be powered up at any time (offline HDD storage is a well known tech) and tape is even denser.

Given that design sketch, I’m pretty sure the feds could record every ILEC landline and major cellular provider call for less than $1 billion up front plus $100 million/year operating costs. As Clemente intimates, the “NATIONAL SECURITY” security-blanket warm fuzzies from having that data for retrospective analysis would be immense.

On the other hand, the “let your adversary overestimate you” argument that Ian raises is a good one.

All we really know is, there’s a lot we don’t know.

BT May 8, 2013 1:36 AM

I’m with Jason here. Look, I have no trouble believing that there are certain technologies where the NSA is ahead of us here in the Silicon Valley, because there are things that we basically don’t care about because you can’t make money doing them (at least not legally). But on the technology that I’m directly involved in, i.e. storage, I’ll bet a dozen herring that they can only handle targeted communications, because we’re the people driving the state of the art, not the TLA’s that keep showing up hat in hand begging us for our technology.

If we don’t have the technology here in the Silicon Valley (and Silicon Valley NorthWet up in Oregon and Washington) to do more, with all the hundreds of billions of dollars and all the smart brains and giant supercomputers that we have (far beyond anything that any government agency has ever had the budget to put together since the heyday of the Manhattan Project), the chances of someone else doing it with a much smaller pool of capital and brainpower are pretty much zero. At least, for the things we’re interested in. And yes, voice (and image) search is a huge area of interest here in the valley right now… I can’t say too much there though (NDA, sigh, which is why I’m a bit vague on some other things too). Let’s just say we’re talking huge investments by the biggest names in the business…

Winter May 8, 2013 2:31 AM

@Clive Robinson
“For the over one hundred languages spoken on a fairly regular basis the idea of “speech to text” is a bit of a misnomer, it’s more likely to be “speech to phoneme” along with other bits of information indicating the way the phonemes are said such as attack and decay times etc.”

No. Phoneme to text is very bad in itself (low recognition rates and phoneme-stream2text is an “interesting” problem), but simply impossible when the language is not known.

Best bet: Language recognition followed by standard Automatic Speech Recognition.

You would have to update your vocabulary constantly (can be done over twitter or something like that). Even the top of the line systems using broadcast quality TV speech will get only 30% or so correct.

State of the art up to 2009 can be seen at NIST:

ASR quality is limited by speech and text corpora. So any secret projects with massive funds can outdo public groups (and these charts) by collecting more speech and text. However, that can be easily modeled by extrapolating a year or so.

Ky Waegel May 8, 2013 3:07 AM

Let me try to put some hard numbers on the storage required. I’ll assume that speech-to-text is not good enough, or that we want to keep the audio around for some other reason.

Assume the NSA is using something like Codec2 at 1400 bits/sec (175 bytes/sec). That’s 10.25 KB/min/person.

Extrapolating this to the entire country (310 million people), we get about 3 terabytes per minute, or about one large hard drive.

Amazon S3 glacier storage is about $0.01/GB/month, so storing that one minute of nationwide recordings would round up to $31/month. At 500 minutes of talking for each person (average) per month, that gives us $15,500/month ($186,000/year) to record the entire country.

The storage side, at least, seems relatively easy.

I know less about phone networks, so I’ll let others discuss the capture side of the equation.

(Someone else feel free to check my math. Peer review is a good thing.)
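In the spirit of that peer review, here is the same arithmetic as a short script. It uses the figures from the comment (1400 bits/sec, 310 million people, 500 minutes a month, $0.01/GB/month) with decimal gigabytes, so the totals come out slightly higher than the rounded ones above. One assumption is added that the comment does not spell out: if nothing is ever deleted, each new month of calls is stacked on top of the earlier ones, so the monthly bill grows over time. The function names are just for illustration.

```python
BITRATE_BPS = 1400                 # Codec2-like rate, from the comment
POPULATION = 310_000_000           # from the comment
MINUTES_PER_PERSON_PER_MONTH = 500 # from the comment
PRICE_PER_GB_MONTH = 0.01          # USD, from the comment


def monthly_volume_gb(bitrate_bps: int, population: int, minutes: int) -> float:
    """Decimal gigabytes of audio produced nationwide in one month."""
    bytes_per_minute = bitrate_bps / 8 * 60
    return bytes_per_minute * population * minutes / 1e9


def cumulative_bill(monthly_gb: float, price: float, months: int) -> float:
    """Total storage cost over `months` months if nothing is deleted.

    In month k the archive holds k months of calls, so the bill for
    that month is k * monthly_gb * price.
    """
    return sum(k * monthly_gb * price for k in range(1, months + 1))


if __name__ == "__main__":
    gb = monthly_volume_gb(BITRATE_BPS, POPULATION, MINUTES_PER_PERSON_PER_MONTH)
    print(f"one month of calls: {gb:,.0f} GB")
    print(f"cost to hold it for one month: ${gb * PRICE_PER_GB_MONTH:,.0f}")
    print(f"first-year bill, keeping everything: "
          f"${cumulative_bill(gb, PRICE_PER_GB_MONTH, 12):,.0f}")
```

Even with accumulation the storage cost stays small next to any plausible agency budget, which supports the comment's conclusion that storage is the easy part.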

Ky Waegel May 8, 2013 3:15 AM

@Andy, I didn’t notice your link showing Neil Kandalgaonkar doing roughly the same calculations. Nice to know I didn’t miss something, though.

Winter May 8, 2013 3:20 AM

CERN is possibly the biggest public project working on this data scale.

CERN runs petabyte data centers

And world-wide LHC-related data is 7.5 – 10 gigabytes per second.

Somehow, I think telephone speech is not a challenge against this capacity.

Telephone speech is 8kHz 8 bit digital audio ~ 8kB/s per conversation. At 10GB/s that is 1,250,000 uncompressed concurrent calls (~1% of the US population calling).

At full speed, that is somewhat less than a petabyte per day, uncompressed speech.

With some data mining techniques you have a few days to whittle down the uninteresting calls and compress the rest to keep 0.1% of the calls for years.

Seems feasible. Useless and expensive, but technically feasible if you can get direct access to the pipes.
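The numbers in this comment can be checked in a few lines. The sketch below uses decimal units and treats 8 kB/s as the cost of one call, as the comment does. The population of 310 million is borrowed from elsewhere in the thread and the two-parties-per-call factor is an assumption, so the percentage is only indicative.

```python
SAMPLE_RATE_HZ = 8000
BYTES_PER_SAMPLE = 1                      # 8-bit samples
PIPE_BYTES_PER_SECOND = 10 * 10**9        # 10 GB/s, the upper LHC figure
US_POPULATION = 310_000_000               # assumption, from another comment
SECONDS_PER_DAY = 86_400

bytes_per_call_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE       # 8,000 B/s
concurrent_calls = PIPE_BYTES_PER_SECOND // bytes_per_call_second
bytes_per_day = PIPE_BYTES_PER_SECOND * SECONDS_PER_DAY

# Assumption: each call involves two people.
share_of_population = 2 * concurrent_calls / US_POPULATION

if __name__ == "__main__":
    print(f"concurrent uncompressed calls: {concurrent_calls:,}")
    print(f"data per day: {bytes_per_day / 10**12:,.0f} TB")
    print(f"share of population on the phone: {share_of_population:.2%}")
```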

Clive Robinson May 8, 2013 3:45 AM

A thought occurred overnight about people not speaking out.

During WWII the Ultra project involved literally tens of thousands of people directly and indirectly, yet it was not until the early 1970s, nearly thirty years later, that the story got out via Frederick Winterbotham’s book.

Now I know things were somewhat different back in WWII but it does indicate it is possible to keep such large scale secrets.

That said, even though, as I and others have indicated, the technology is certainly possible, I don’t think the current phone system is set up in a way to make the process either reliable or efficient.

The phone system can be broken down into two almost entirely separate systems, Network and Billing. The network is designed as a very de-centralised system and billing centralised. Between these two is the glue that holds it together, that is, the DBs containing number conversion and routing information.

So whilst getting mass pen & trace info works because the centralised billing system makes it relatively efficient, the same is not true for the actual call contents: the de-centralised nature of the network makes it considerably less efficient.

The way to make collecting call contents more efficient is to sit on the nodes between COs, and collect non CO local calls. And over time as the network gets upgraded reach out towards those local calls.

Look at it this way, the cost of putting in new cables and links is generally high due to infrastructure costs, but the cost of pulling two cables at the same time is a very small fraction of that infrastructure cost (it’s why there is so much dark fibre currently just sitting there).

So whilst recording every phone call may not currently be possible due to capacity issues and having the network structure against it, those will change over time.

ATN May 8, 2013 4:16 AM


The network is designed as a very de-centralised system…

Still, you can download the day’s calls during the night; no need to do it in real time.

Tom Stone May 8, 2013 7:21 AM


Using ADPCM encoding, you store 4 bits per sample with 8000 samples per second. This works out to 4,000 bytes per second, which would allow storing about 2,900 days (nearly eight years) of continuous speech on a single 1 terabyte hard drive.
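Worked through explicitly, 4 bits per sample at 8000 samples per second is 32,000 bits, or 4,000 bytes, per second. A small sketch, assuming a decimal terabyte (10^12 bytes) and no filesystem overhead:

```python
# ADPCM storage estimate: 4 bits per sample, 8000 samples per second.
BITS_PER_SAMPLE = 4
SAMPLES_PER_SEC = 8000
DRIVE_BYTES = 1e12                   # assumption: 1 TB = 10**12 bytes

bytes_per_sec = BITS_PER_SAMPLE * SAMPLES_PER_SEC / 8
seconds_on_drive = DRIVE_BYTES / bytes_per_sec
days_on_drive = seconds_on_drive / 86_400

print(f"{bytes_per_sec:.0f} bytes/sec")
print(f"{days_on_drive:.0f} days of continuous speech per drive")
```

That is roughly 2,900 days of a single continuous voice channel per drive.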

AtomBoy May 8, 2013 7:52 AM

Wiretapping was ruled as illegal shortly before WWII. Hoover continued it anyway. He even tried to use wiretap evidence in a court case, though it was already ruled as illegal.

They continued it, under presidential “okay” (which was shaky) despite these laws. They were just cautious not to use evidence collected from wiretaps in a court of law.

They very well could be doing this again.

Dirk Praet May 8, 2013 8:27 AM

I’m with @Alex and @Nick P. on this. Even under the assumption that it would be technologically possible to do so, it doesn’t make any sense to indiscriminately and continuously monitor, record and store the domestic calls of about 310 million people.

Perhaps they’re doing it for the 21k people on the no-fly list and the 700k on the Terrorist Identities Datamart Environment. Most probably there are several additional watch lists whose existence and criteria we are unaware of, but taking it any further than that in my opinion is more likely to create more problems than it would solve.

There also is the legal aspect of the matter. In the absence of a warrant for a wiretap that has revealed potentially incriminating information, how can any of it be upheld in a court of law? Unfortunately, I have little doubt that legislation is on its way to tackle exactly this type of problem and is probably even already in place, albeit mostly unknown to the general public due to secretive interpretations of certain sections of it. NDAA and the “2511” immunity letters companies like AT&T received for the warrantless wiretapping under the Bush administration come to mind. And then of course there is also CISPA, which according to many retroactively inscribes into law certain activities the government so far has been carrying out in secret.

Christian David May 8, 2013 9:19 AM

I agree that doing the math is the best way to arrive at an answer. If it is close to feasible, it will be done, or at least attempted.

But a linear extrapolation of past superiority and achievements by the intelligence agencies to the present is problematic, as already pointed out by @BT and @Jason. Much of what used to be core intelligence agency activity has become the focus of mainstream business. In those areas where their interests overlap, the combined financial, technological and human resources of the mainstream industry dominate those of the intelligence agencies. NSA et al. had a head start, but maintaining it must have become difficult.

JeffH May 8, 2013 10:10 AM

So, US Census lists population to monitor at around the 300M mark. US Department of Labor estimates that people spend an average of just under an hour a day on the phone (actually phone + email + social). Info in this blog suggests 500 bits per second to encode the data. Unless my maths is wrong (and as someone said earlier, peer review is good), that makes it about a terabyte a day to deal with, whether that’s process, compress, store, whatever. That is, 300M * 60 seconds * 500 / (8 bits per byte * 1024 * 1024 * 1024 * 1024).

Doesn’t sound that impossible to me from a technical perspective, and if you preserve metadata you can always go back and search your archives later. Why throw the data away if you can data-mine it at your leisure?

As others have said, the bigger issue is getting the data to & from places & ascribing accurate metadata to the data. Where is it going? Where is it coming from? Whose conversation is it anyway?

JeffH May 8, 2013 10:12 AM

Heh – make that 60 terabytes a day. Really should have read that before I pressed post. Kinda missed an important 60 minutes off that equation.
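With the missing factor of 60 restored, the calculation can be sketched as follows. The inputs are the ones given in the comment above (300M people, an hour a day, 500 bits per second), and the variable names are illustrative.

```python
# Daily voice volume with a full hour (3600 seconds) per person per day.
PEOPLE = 300_000_000
SECONDS_ON_PHONE_PER_DAY = 60 * 60
BITS_PER_SEC = 500
TIB = 1024 ** 4                      # binary terabyte, as in the formula above

daily_bytes = PEOPLE * SECONDS_ON_PHONE_PER_DAY * BITS_PER_SEC / 8
daily_tib = daily_bytes / TIB

print(f"{daily_tib:.1f} TiB per day")
```

This lands a little over 61 TiB per day, which matches the corrected "60 terabytes" figure.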

Z.Lozinski May 8, 2013 10:40 AM

Let’s get out the envelope. Looking at the UK – a developed economy of 63M people, because the data is public. The data source is the Communications Market Review 2012, published by Ofcom (the UK telecom regulator).

SMS Metadata only.
The number of SMS sent in the UK 2011 is 151 Billion. To store the metadata, we will need 2 phone numbers and a timestamp. Each phone number is a 15 decimal digit IMSI (International Mobile Subscriber ID). Timestamp is yyyymmdd hh:mm:ss so another 14 decimal digits. Total 44 bytes. (Yes, I know about compression, and packing and I am ignoring it).

44 bytes * 151 Billion texts = 6644 Billion bytes. That is 6.5 Terabytes for one year of SMS metadata.

SMS Metadata plus contents.
Assume each SMS is the maximum 160 characters, and ignoring MMS. 160 bytes of content + 44 bytes of metadata is 204 bytes.

204 bytes * 151 Billion texts = 30804 Billion bytes. That is 31 Terabytes for one year of SMS content and metadata.

Voice Calls Metadata only.
There were 115.9 billion minutes of outgoing fixed voice calls and 122.6 billion minutes of outgoing mobile calls in the UK in 2011. Assume a fixed call is 3 minutes duration and a mobile call is 1m 40s duration. That is 38.7 Billion fixed voice calls, and 73.4 Billion mobile voice calls. You probably want to add duration to the metadata, which is now 50 bytes.

Fixed voice metadata is 50 bytes * 38.7 Billion calls = 1935 Billion bytes. That is 1.9 Terabytes for one year of fixed voice metadata.

Mobile voice metadata is 50 bytes * 73.4 Billion calls = 3670 Billion bytes. That is 3.7 Terabytes for one year of mobile voice metadata.

Voice Calls Metadata plus Content.
The ‘natural’ rate for voice on a fixed telephony network is 64kbit/s using the G.711 codec. (Yes, you can bring this down by an order of magnitude with a different codec). 64kbit/s * 60 sec is 3840 bytes/min. While within a mobile network the 13kbit GSM codec is used, or even the half-rate codecs, but let’s keep with the PCMA standard.

115.9 Billion fixed minutes * 3840 bytes/min = 445056 Billion bytes = 445 Terabytes.

122.6 billion mobile minutes * 3840 bytes/min = 470784 Billion bytes = 471 Terabytes.

Say 1 Petabyte per year.

It looks technically feasible, so it is entirely down to the legal framework, which of course varies by country.

The equivalent calculations for web browsing, instant messaging, facebook posts etc. are left as an exercise for the reader.

1. Ignoring the distinction between trillions, terabytes and tebibytes.
2. Why ignore MMS? Average monthly SMS text messages for 2011 of 199.7, average monthly MMS picture messages 0.8. (Ofcom CMR 2012)
3. Call duration for fixed voice from “Notes on the Network”, Bellcore, 1994.
4. Call Duration for mobile voice:
5. Paradoxically, if you got the metadata from the CSP, in the form of CDRs (Call Detail Records) it would probably be around an order of magnitude larger as CDRs contain all sorts of junk.

Drew May 8, 2013 11:01 AM

I see a lot of reasonable equations as far as storage space goes, but that’s only half of the equation. I don’t have time to tackle it right now, but what about the heat/space/power? AFAIK no TLA has a space station hidden on the dark side of the moon with a fusion reactor that is powering these hundreds of thousands of hard drives.

So here’s your constant: 67 terabytes, and for a Pbyte you’d need 16 of them, so 1.5 racks per Pbyte just for physical space. Now calculate power + HVAC power etc… BBIAW

If anyone wants to do the math while I’m tackling honey dos I won’t complain.

Major Variola May 8, 2013 11:10 AM

Many commenters are incredibly politically correct. It’s causing their maths to be too large.

100 million calls? Less than 10% are Muslim or immigrants. The notion of carefully maintained “Watch lists” based on evidence is quaint. Get serious.
A surveillance state need not be politically correct (or even “legal”) internally.

Recording Vox from all immigrants, and all Muslims, is a much more tractable problem. Sure, you wouldn’t be able to e.g., learn Tim McVeigh’s OK exploits, but that’s not the “demographic of interest”.

Or what the hell, add govt-suspicious folks too, they’re only a small percent and as readily identifiable as Muslims.

Drew May 8, 2013 11:35 AM

But Major, this isn’t about targeted intelligence collection. People like Tom Clemente and William Binney aren’t running around getting press and telling everyone who will listen that the sky is falling and the world is burning because the USIC is doing TARGETED intel collection, they’re running around screaming at anyone who will listen that the USIC is doing BLANKET intel collection.

Please if you’re going to use the title Major please act like you’ve paid attention to the entire conversation (or at least read the initial blog post).

comsec May 8, 2013 11:48 AM

One thing I have noticed is that most ex-NSA and CIA people go into the private data mining and intel industry. There is obviously a demand for mass surveillance, and the WikiLeaks Spy Files pages identify a lot of these gov data mining contractors operating in almost every country.

It doesn’t seem to work though, if they are mining for keywords. How many domestic attacks had warning signs, like Dorner writing his online manifesto days before his killing spree, or the Boston bombers, who were using clear phones?

Another good way to tell what they are up to at the Utah datacenter is to look online at their job postings. If they are looking for hundreds of database jr admins with bioinformatics backgrounds to parse data and keywords then ruh-roh… mass spying has begun

JeffH May 8, 2013 11:48 AM

@Major Variola

You can profile your data mining, sure, but from what those who sound like they know more about telcos than I have said, it doesn’t appear that you can just record the individuals you want to, across all the different ways & routes to communicate, at scale.

If you can’t spot individuals in amongst the weight of traffic in real time, then you have to record everything then later figure out who said what to whom – akin to ‘record them all and let God sort them out’.

One could end up with a distributed infrastructure capable of recording everyone and have it analyze in real time whether the metadata of traffic is of interest and decide whether to record it. If storage is cheap & bandwidth available to route everyone’s recordings to available storage, why go through the hassle & expense of large scale real time analysis?

(I am assuming this sort of thing would have to be distributed, as I’ve not noticed anyone running every single phone & data line in any country through a single centralised point, but maybe the data could be coerced to always route through one or a few specific hops?)

name.withheld.for.obvious.reasons May 8, 2013 11:58 AM

The “National Security Corporate Military Services Complex” has already tipped their hat on this one. Parsing some of the language from reports about the Boston bombing, LEA advocates for suspending the rights of the suspect(s) suggested that it was perfectly fine to interrogate the citizen suspect. The “out” for LEA types was the information collected while captive could not be used in court. Copying all calls seems reasonable in this context? The NSA has had an out for any domestic taps, routing calls through Canada.

Shawn Smith May 8, 2013 12:07 PM

@Z.Lozinski at 10:40am May 8,

64kbits/sec * 60 sec/min = 3840 kbits/min = 480kBytes/min, not 3840 bytes/min. You’re off by two orders of magnitude on the voice content calculation.
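The unit conversion can be sketched step by step. This assumes 1 kbyte = 1000 bytes, consistent with the loose units used in the thread.

```python
# Convert the G.711 rate from kbit/s to bytes per minute.
G711_KBIT_PER_SEC = 64

kbit_per_min = G711_KBIT_PER_SEC * 60    # kilobits in one minute
kbyte_per_min = kbit_per_min / 8         # 8 bits per byte
bytes_per_min = kbyte_per_min * 1000     # assumption: 1 kbyte = 1000 bytes

# How far off the figure of 3840 bytes/min is.
error_factor = bytes_per_min / 3840

print(f"{kbit_per_min} kbit/min = {kbyte_per_min:.0f} kbytes/min")
print(f"off by a factor of {error_factor:.0f}")
```

The factor is 125, which is the "two orders of magnitude" mentioned above.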

Z.Lozinski May 8, 2013 12:20 PM

Interesting, and related development from India, about the Central Monitoring System.

Note the article below doesn’t distinguish between data retention (the metadata about communications) and lawful interception (the content of the communications).

Times of India article:

“Government can now snoop on your SMSs, online chats” – Indu Nandakumar, ET Bureau | May 7, 2013, 05.46 PM IST

BANGALORE/DELHI: The government last month quietly began rolling out a project that gives it access to everything that happens over India’s telecommunications network—online activities, phone calls, text messages and even social media conversations. Called the Central Monitoring System, it will be the single window from where government arms such as the National Investigation Agency or the tax authorities will be able to monitor every byte of communication.

However, Pavan Duggal, a Supreme Court advocate specialising in cyberlaw, said the government has given itself unprecedented powers to monitor private internet records of citizens. “This system is capable of tremendous abuse,” he said. The Central Monitoring System, being set up by the Centre for Development of Telematics, plugs into telecom gear and gives central and state investigative agencies a single point of access to call records, text messages and emails as well as the geographical location of individuals.

Duggal, who closely follows New Delhi’s battle with internet firms, said there hasn’t been much details from the government on what exactly the system intends to monitor and under what conditions.

In December 2012, the then information technology minister Milind Deora told Parliament that the monitoring system, on which the government is spending Rs 400 crore, will “lawfully intercept internet and telephone services”.

AlanS May 8, 2013 12:30 PM

If the only limitations on the surveillance state are technological ones (transport rates, storage, voice to text, etc.) we should be very worried. The real question isn’t whether it’s feasible or, if it is, whether they are doing it but why anyone in their right mind would consider this acceptable in a liberal democracy.

The state is clearly pushing surveillance to the technological limit and existing legal protections appear to be either ineffective or ignored.

FBI says it doesn’t need a warrant to snoop on private email, social network messages

Z.Lozinski May 8, 2013 12:47 PM

@Shawn Smith,
Thank-you, peer review is a wonderful thing, though missing kbits = bytes is just embarrassing. Now how do I edit the post?

Carpe May 8, 2013 12:50 PM

  1. Stop doing math on what it would take to store everything. They are using database scraping of metadata to then focus needed storage. If you want numbers you have to focus on this process.
  2. If you think the NSA is waiting for the Utah datacenter to come up to meet the space requirements needed for this, you haven’t been paying attention.

  3. While the lead held by the NSA/CIA DoST has been compressed from the traditional 20 years, they still have roughly a decade on the private sector, so stop saying $technology sucks ergo they can’t have anything better.

  4. Keep in mind that according to Binney in the 90’s they were looking at 20TB/min.

  5. Don’t forget sats are tapped too, not just major backbone nodes.

  6. Don’t assume just because the people managing a major node say there is no NSA room that there isn’t.

  7. Many with knowledge argue this is the result of incompetence and greed. (revolving door DC style). I argue that this is being done systematically with a larger long term goal in mind, which I will not speculate about here unless asked to.

Alex May 8, 2013 1:06 PM

I’m still of the opinion that getting USABLE data out of recording calls remains the problem.

Going back through personal experiences: In one of our court cases, we had over 1 million documents in our virtual data room. Not pages, but >1M multi-page documents. The actual page count escapes me now, but it was substantial. For all practical purposes, we had every document, every e-mail, and every call log the company had ever done. The real problem was how to go sift through it.

Fortunately, OCR & AI technologies have come a long way and were very useful, however at the end of the day we ended up hiring a certain large offshore company in India and ultimately human brains & hands were what put the proper metadata on the documents and handled data requests. And these were documents, not voice calls.

It’s not to say that in the future this won’t happen, but for now it’s not practical. If it were, I think these Indian companies would have something equivalent to offer their clients.

Z.Lozinski May 8, 2013 1:27 PM

Let’s get out the envelope. Looking at the UK – a developed economy of 63M people, because the data is public. The data source is the Communications Market Review 2012, published by Ofcom (the UK telecom regulator). I’m just calculating data volumes as one input to the debate. You also need to figure what it takes to structure, organise and search this volume of data.

(Revised, thanks to Shawn Smith).

SMS Metadata only.
The number of SMS sent in the UK 2011 is 151 Billion. To store the metadata, we will need 2 phone numbers and a timestamp. Each phone number is a 15 decimal digit IMSI (International Mobile Subscriber ID). Timestamp is yyyymmdd hh:mm:ss so another 14 decimal digits. Total 44 bytes. (Yes, I know about compression, and packing and I am ignoring it).

44 bytes * 151 Billion texts = 6644 Billion bytes. That is 6.5 Terabytes for one year of SMS metadata.

SMS Metadata plus contents.
Assume each SMS is the maximum 160 characters, and ignoring MMS. 160 bytes of content + 44 bytes of metadata is 204 bytes.

204 bytes * 151 Billion texts = 30804 Billion bytes. That is 31 Terabytes for one year of SMS content and metadata.
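The SMS figures can be reproduced with a few lines. This is only a sketch of the arithmetic in this comment: one byte per decimal digit or character, no compression or packing, and names chosen for illustration.

```python
# SMS storage for one year, UK 2011, using the record layout described above.
SMS_PER_YEAR = 151_000_000_000
IMSI_DIGITS = 15                     # one byte per decimal digit
TIMESTAMP_DIGITS = 14                # yyyymmdd hh:mm:ss without separators
MAX_SMS_CHARS = 160

# Two phone numbers plus a timestamp.
metadata_bytes = 2 * IMSI_DIGITS + TIMESTAMP_DIGITS
record_bytes = metadata_bytes + MAX_SMS_CHARS

metadata_total = metadata_bytes * SMS_PER_YEAR
content_total = record_bytes * SMS_PER_YEAR

print(f"metadata only: {metadata_total / 1e12:.2f} x 10^12 bytes")
print(f"metadata + content: {content_total / 1e12:.2f} x 10^12 bytes")
```

The totals are 6644 and 30804 billion bytes; the terabyte figures quoted above differ slightly depending on whether decimal or binary units are used.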

Voice Calls Metadata only.

There were 115.9 billion minutes of outgoing fixed voice calls and 122.6 billion minutes of outgoing mobile calls in the UK in 2011. Assume a fixed call is 3 minutes duration and a mobile call is 1m 40s duration. That is 38.7 Billion fixed voice calls, and 73.4 Billion mobile voice calls. You probably want to add duration (an extra 6 bytes) to the metadata which is now 50 bytes.

Fixed voice metadata is 50 bytes * 38.7 Billion calls = 1935 Billion bytes. That is 1.9 Terabytes for one year of fixed voice metadata.

Mobile voice metadata is 50 bytes * 73.4 Billion calls = 3670 Billion bytes. That is 3.7 Terabytes for one year of mobile voice metadata.

Voice Calls Metadata plus Content.

The ‘natural’ rate for voice on a fixed telephony network is 64kbit/s using the G.711 codec. (Yes, you can bring this down by an order of magnitude with a different codec). 64kbit/s * 60 sec is 3840 kbits/min or 480kbytes/min. While within a mobile network the 13kbit GSM codec is used, or even the half-rate codecs, but let’s keep with the PCMA standard.

115.9 Billion fixed minutes * 480 kbytes/min = 55632 Trillion bytes = 56 Petabytes.

122.6 billion mobile minutes * 480 kbytes/min = 58848 Trillion bytes = 59 Petabytes.

So 115 Petabytes per year. That’s 2 Petabytes per million people.

It might be better to think of this as 115000 Terabytes. This could be implemented with 60000 commodity x86 servers each with 2 x 2 TB disk drives. (This is moving to the infrastructure designs used by the large scale internet providers like Amazon, Facebook and Google). The biggest technical challenge is replacing failed servers, but Akamai cracked that one about 10 years ago: they don’t bother.
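A short sketch of the voice-content totals and the server estimate, using decimal units throughout. The observation that 60000 servers give about twice the raw capacity needed is just what the numbers show; reading that as room for a second copy of the data is an assumption, not something stated in this comment.

```python
# Voice content volume for the UK in 2011 at the G.711 rate.
FIXED_MINUTES = 115.9e9
MOBILE_MINUTES = 122.6e9
BYTES_PER_MIN = 480_000              # 64 kbit/s * 60 s / 8
UK_POPULATION_MILLIONS = 63

fixed_pb = FIXED_MINUTES * BYTES_PER_MIN / 1e15
mobile_pb = MOBILE_MINUTES * BYTES_PER_MIN / 1e15
total_pb = fixed_pb + mobile_pb
pb_per_million_people = total_pb / UK_POPULATION_MILLIONS

# Raw capacity of the proposed server farm.
SERVERS = 60_000
DRIVES_PER_SERVER = 2
TB_PER_DRIVE = 2
raw_pb = SERVERS * DRIVES_PER_SERVER * TB_PER_DRIVE / 1000
headroom = raw_pb / total_pb

print(f"total: {total_pb:.1f} PB/year, {pb_per_million_people:.1f} PB per million people")
print(f"raw capacity: {raw_pb:.0f} PB, {headroom:.1f}x the yearly volume")
```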

It looks technically feasible, so it is entirely down to the legal framework, which of course varies by country.

It looks economically feasible, though the cost is probably over $/€ 1 Billion. Then factor in the clever software to make use of the data.

Operationally it looks like hard work. Usenix and GigaOM have regular reports on running large scale infrastructure, and it isn’t easy.

So, today this is probably only an option for a few countries with a strong technology sector. One of the worries is that as technology improves, this will be within the scope of countries without a technology sector within a decade or so. (Think of all the communications monitoring gear found in Libya after Gaddafi’s overthrow).

The equivalent calculations for web browsing, instant messaging, facebook posts etc. are left as an exercise for the reader.


1. Ignoring the distinction between trillions, terabytes and tebibytes.

2. Why ignore MMS? Average monthly SMS text messages for 2011 of 199.7, average monthly MMS picture messages 0.8. (Ofcom CMR 2012)

3. Call duration for fixed voice from “Bell Operating Company Notes on the LEC Network”, Bellcore, 1994. 

4. Call duration for mobile voice:

5. Paradoxically, if you got the metadata from the CSP, in the form of CDRs (Call Detail Records) it would probably be around an order of magnitude larger as CDRs contain all sorts of junk.

Nick P May 8, 2013 1:31 PM

Alright, I guess I’ll join in on the fun of “what if” analysis. My previous post is about what I think they ARE doing. Here’s my essay on what they MIGHT do.

So, everyone’s looking at the data to be captured. Let’s try another approach. Let’s determine what a massive NSA datacenter can do by looking at existing massive datacenters. Facebook is an obvious example. Funny thing is that they already do what NSA wants to do. Last I checked, they keep people’s metadata and data going years back. Facebook has minute-by-minute status updaters, long conversations w/out SMS msg size limits, pictures, videos, internal data, etc. It’s a bunch of data. They have a huge number of users, many of whom are active. What does it cost to manage that? Well, most yearly capital investments I see on Facebook are half-billion to $1+ billion. The NSA has $7 billion in that one datacenter. I’ll let the rest of you do the math on that one. 😉

The claims of Echelon and ThinThread tell us something else. Their dataprocessing system is multistage. They have computers at the source of collection. They have computers in the middle for additional processing and routing. Then, they have backend computers for storage, automated analysis, and human analysis. This design means they can use special purpose computers at many source points in mobile, fixed line, and ISP infrastructure. Each computer will have hardware accelerated compression, probably link encryption, and preliminary analyses methods. These analyses/filtering choices may be remotely updated. They might also build hubs throughout the US for the middle end. Then, they have massive datacenters. This multistage design can reduce volume in both storage and transport.

One commenter pointed out UK’s total SMS. Extrapolating from that, it would seem that they could store all of US SMS messages. Probably analyse them in reasonable time periods too. Why would I say that? Well, doesn’t Google do exactly that for much larger amounts of text? And we have things like Netezza appliances that give plenty of storage with analytical acceleration at hardware level. No doubt they will use such technology to their advantage. I’ve always assumed SMS would eventually be captured wholesale at least for suspicious people.

That brings me to suspicious people. One commenter mentioned Muslims were a fraction of the population. The govt has plenty of people that worry it. There’s the zealous activists of many flavors, the possible whistleblowers (on inside or outside), journalists, risky ethnic groups, etc. Put all their enemies together and they’re a small fraction of average population. Focusing on them first would solve plenty of storage issues. Plus, who says everyone gets equal storage, treatment or data longevity?

Leads me to prioritization and data lifecycle. I think the NSA might play it smart. They’ve been in the data collection and analysis game longer than most others. They know the issues. I think they will break the datacenter into different pieces much like they did other collection efforts. One group of machines might have fast drives, good analytical capabilities, etc. A larger group will just be storage. They might use separate links for regular data capture traffic and internal traffic. The latter might be used for administrative activity and to move data between different groups of machines. They can use regular databases to keep statistics on people, including ratings of how important they are. I think there will be a rotation period where data can be cycled out if it’s not important, with that storage being used for incoming data. (Like a FIFO queue.) So, for now, they can’t keep everything, but they can gradually increase how much they retain or how useful it is.
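The rotation idea can be illustrated with a toy model. This is purely a sketch of the FIFO-with-priorities concept described above: the `Record` and `RotatingStore` names, the `important` flag standing in for the "ratings", and the abstract size units are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Record:
    subject: str
    size: int          # abstract storage units
    important: bool    # hypothetical stand-in for an importance rating


class RotatingStore:
    """Fixed-capacity store: unimportant data rotates out, oldest first."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.used = 0
        self.rotating = deque()   # FIFO queue of unimportant records
        self.retained = []        # important records, never rotated out

    def add(self, record: Record) -> bool:
        rotating_used = sum(r.size for r in self.rotating)
        # Refuse if the record would not fit even after evicting
        # every rotatable record.
        if self.used - rotating_used + record.size > self.capacity:
            return False
        # Evict the oldest unimportant records until the new one fits.
        while self.used + record.size > self.capacity:
            old = self.rotating.popleft()
            self.used -= old.size
        target = self.retained if record.important else self.rotating
        target.append(record)
        self.used += record.size
        return True


store = RotatingStore(capacity=10)
store.add(Record("a", 4, important=False))
store.add(Record("b", 4, important=True))
store.add(Record("c", 4, important=False))   # forces "a" to rotate out
print([r.subject for r in store.rotating], [r.subject for r in store.retained])
```

In this toy run the oldest unimportant record is dropped to make room for incoming data, while the record flagged as important stays put.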

Simultaneously, they will be using graphs. Intel agencies love graphs. The SRD tool is an example of the kind of analysis I think they’ll be trying to do on a large scale. No doubt they will use one of these famous NoSQL graph databases to try to connect people. They also have years of experience modeling the kinds of connections they’re interested in. Whatever they’ve been doing without a datacenter in this regard they will move into the datacenter. They might even have another dedicated cluster to let human analysts access individuals’ or groups’ profiles for more thorough analysis. Also, to remove possibly erroneous information and fine tune the system. They might even have a group of machines dedicated to permanently storing all collected data and analysts’ reports on people they’ve decided merit that.

We also know they invest heavily in technology that automates analyses. Speech-to-text has issues no doubt. However, they can focus that a bit more too. They can have different engines for different kinds of people and languages. They might choose to use it despite errors that mistranslate or lose words. The reason is that the text form lets them store MUCH more data and they’re used to working with information in pieces. So, why not? I still think they’ll mostly target such processing efforts so they have better intel ROI. They’ll also put effort into keyword analyses. They’ll definitely look at call length, how long people have had the number, how often they switch numbers, what provider, etc. These often indicate someone is trying to keep a low profile and/or determine future collection tradeoffs. One random thing to throw in there: most Middle Eastern immigrants I see in our area talk on the phone a lot, usually to the same people, and rarely text. Such predictability allows for optimizations.

There’s two more options for them to do. The first is obvious and long used: get cooperation from vendors to tap their stuff. We saw that with phone companies. The second is to build or acquire technology stacks that their targets will trust. (Crypto, VOIP, social networks, etc.) I won’t be surprised if I see a cheap mobile operator pop up that focuses on a US govt target group, has a secure VOIP capability, whose executives are stopped at airports a lot, whose homepage is “somewhere outside US jurisdiction,” who the FBI “failed to crack,” (let a few small fish loose…) and whose members keep getting busted in ways that don’t implicate the mobile operator (…to capture big fish).

The trick is never providing their NSA data as evidence. They’d merely be tipping off FBI/DHS to closely look at certain people. Then that group will collect more typical evidence that is used against the targets. This kind of operation might run for a long time if the LEOs properly compartmentalize information on what source of data was in such cases. Such massive communications stacks under NSA control are easier to connect to their backend datacenters. Heck, they might even host them there. Those that are discovered can be abandoned and new efforts created.

That’s pretty much my thoughts on the matter. If I was NSA, I might try to do it as I’ve outlined above.

Carpe May 8, 2013 2:20 PM

@ Nick P

Only correction I have would be regarding the use of the data. I would argue that the goal is retroactive punishment via data that can be used to “walk the cat back”. Essentially, dissent too much too successfully, and be maliciously targeted. Later down the road predictive arrests are also likely.

So I think it’s a bit nastier than you make it sound on that end, but like your analysis otherwise.

Clive Robinson May 8, 2013 4:58 PM

@ Bruce,

There are two possible ways to do this. The first is to collect, compress transport, and store. The second is to collect, convert to text, transport and store.

As you noted above there is more than one way “to skin a cat” on this, and I actually think the number of ways is considerably more than two, and importantly will evolve with time and technology, as you cannot put a system such as this in place overnight nor upgrade it all overnight.

It appears the basic math on storage checks out and appears reasonable. However there is a lot more to it than just the basics of storing the data at an assumed bit rate. As I observed above the data compression / quality can be aged from high quality down to text transcript over time. And further can even be deleted with time and other considerations (you most likely don’t need to keep a recording of a seven year old describing their birthday party to granny for more than a few years and only as a text transcript after a few months). Thus with time primary storage is recovered, in essence the storage would be multilevel and most would end up off line fairly quickly. This is much like the multi layer memory behaviour on computers with CPU register, CPU cache, Main RAM, Hard Drive, backup tape, with access speed varying by 10^16 (ie pico sec down to 30day backup).

Thus we need to consider not just hard drive storage but high speed RAM and tape tractor systems. When you do this other factors arise which have important considerations.

For instance, as has been noted there is a power issue, having lots of hard disks with their platters spinning almost continuously is going to make the lights dim in most major cities and towns if you tried putting them all in one place (which is perhaps unlikely on the “all the eggs in one basket” principle). When you start considering various storage technologies you end up with quite a complex power calculation per bit of storage. My guess is that hard drives will not actually be much more than a (very) high capacity buffer between RAM used for processing / searching and more permanent storage such as tape tractor farms.

But overall power and power per bit are just two of many considerations when you are looking at an operation that big, especially when many things such as the environment are beyond your control.

Almost your first assumption would be “no single thing is 100% available / reliable / upgradeable” thus you would look at distributing it as a first measure (on the “divide and conquer” as well as “eggs in basket” principles).

Mind you, even with careful planning it can still go wrong. For instance Google, which does this to a certain extent, has had and will continue to have occasional outages, and so for that matter will the telephone network. The important thing is “routing around a problem”, which is considerably easier in a distributed de-centralised system.

However, being widely distributed appears to fly in the face of fast data processing for searching etc., which I guess is what Utah is all about.

But… it’s important to remember that searching and collection are very different in their characteristics. Collection is a real time activity which needs nigh on 100% reliability and down times to be less than those of the network being monitored. Searching is far from real time, and down times of an hour or two are acceptable if they occur infrequently.

Thus if it were down to me I certainly would be thinking of separate systems for collection and searching, with a “store and forward” design where the collecting nodes stored in real time and forwarded as and when they could. It would also be helpful if the collection heads could do some pre-processing over and above just compression (more of which later).

This also has the advantage that new technology needs only to go in at a modest rate with a reasonable life time (say five years).

But to be honest I think Utah is almost a “stop gap” measure, because the reality is a de-centralised system is going to be the best option in the long term, partly because as we’ve known for quite some time centralised systems are “targets of opportunity” and partly because technology in the commercial sector is outgrowing user abilities.

And it is this last factor people need to think about carefully, because the ultimately decentralised system would run almost entirely on the end user device or “smart phone”.

People complain about the current voice recognition on their smart phones, but it is noticeably improving and will do so more quickly with time (because users, despite their grumbles, see it as useful technology). One advantage of having it on the end user device is that it can learn the end user and thus get more recognition for less CPU power than a system that cannot learn individual users.

Thus it won’t be too long before your phone can speech to text everything you say. And as was seen with a commercial piece of software (CarrierIQ), the networks will quite happily do an end run around user security and send all SMS and other activities in the clear across the internet, and thus do a big chunk of the NSA’s job for them…

PBIPhotobug May 8, 2013 7:36 PM

@BT “and giant supercomputers that we have (far beyond anything that any government agency has ever had the budget to put together since the heyday of the Manhattan Project)”

I just looked at the top 50 in the Top 500 supercomputer list and didn’t see a single Silicon Valley site. Are you saying you have faster computers than these and aren’t claiming bragging rights?

Bruce Clement May 8, 2013 7:47 PM

Several people have commented on the power requirements of keeping several years’ worth of voice data constantly spinning.

I don’t think this is necessary. If you store the metadata separately, you only need to spin up the drive with the digital speech you want to hear once you have decided that you want to know everything that an individual target said. Presumably you would only spin it up long enough to copy the audio to a work drive.

I would imagine that delays of a few hours to retrieve a person’s “Life log” would be acceptable in most contexts.

Nick P May 8, 2013 9:08 PM

@ Carpe

“Only correction I have would be regarding the use of the data. I would argue that the goal is retroactive punishment via data that can be used to “walk the cat back”. Essentially, dissent too much too successfully, and be maliciously targeted. Later down the road predictive arrests are also likely.”

I agree. I think it will be used for many purposes, including that one. Honestly, though, I don’t know their current MO on that and what exactly they’re tapping. So, I can’t really factor it more specifically into my analyses without venturing into wild guesses. I just left specifics out, as the system I proposed can still do that stuff.

@ Bruce Clement

As suggested in my post, they might also keep metadata of their own about what they’ve collected. If they fill disks until full, then move to the next, they can power down a disk until it’s needed. They’d just need one or two servers keeping track. I think they need that anyway for other reasons, so why not put power reduction strategies in their management systems? One of the main advantages HDs have over RAM is that they can be shut down. Why not use it, eh?
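A minimal sketch of the fill-then-power-down idea, assuming a single tracking server that keeps an index of which disk holds which call. The `DiskPool` class, its fields and the capacity figure are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DiskPool:
    """Tracks which disk holds each call so idle disks can stay powered down."""
    disk_capacity: int                               # bytes per disk (hypothetical)
    current_disk: int = 0
    used_on_current: int = 0
    index: dict = field(default_factory=dict)        # call_id -> disk number
    spinning: set = field(default_factory=lambda: {0})

    def store(self, call_id: str, size: int) -> int:
        if size > self.disk_capacity:
            raise ValueError("call larger than a whole disk")
        if self.used_on_current + size > self.disk_capacity:
            # Current disk is full: power it down and move to the next one.
            self.spinning.discard(self.current_disk)
            self.current_disk += 1
            self.used_on_current = 0
            self.spinning.add(self.current_disk)
        self.used_on_current += size
        self.index[call_id] = self.current_disk
        return self.current_disk

    def disks_for(self, call_ids) -> set:
        """Disks that would need spinning up to retrieve the given calls."""
        return {self.index[c] for c in call_ids}
```

Only the disk currently being written stays powered; retrieving one person’s calls means spinning up just the disks the index points at.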

Clive Robinson May 9, 2013 2:14 AM

@ Bruce Clement,

With regards HDs in my above posts I’ve indicated that I was assuming the data on them would be reduced with time.

In fact for most calls I would expect this to happen very very quickly unless made by subjects of interest.

In fact I suspect less than one in a million calls would ever be subject to anything other than front end processing to text and archiving. The only question is how long the actual audio of the calls would be kept.

Which is why I went on to describe the HDs as little more than a buffer between RAM used for processing and a large tape archive.

MSB May 9, 2013 2:58 AM

Even if they don’t have the capacity to monitor and store every call, they certainly do have enough capacity for people who have recently been the target of an investigation triggered by a request made by Russian intelligence agencies. Tamerlan wasn’t some random guy. And he was no US citizen.

Vles May 9, 2013 4:13 AM

Is the U.S. Government Recording and Saving All Domestic Telephone Calls?

Surprised no one’s brought out Occam’s razor..
Anyway, how many data centers do they operate? Surely these are not charged with producing new sudoku level 9 puzzles…

kashmarek May 9, 2013 6:01 AM

Let’s add a side track to this. I recently went in to get my hair cut, having let it grow too long after the winter. At the beginning of the “session”, an attendant at the salon placed a small tablet-like device on the counter just below the mirror. I suspect it may have been something that recorded or tracked the cut time, but what else? What if this device were recording audio, and possibly video? In my “old” (now out of business) barber shop, the chair would get swiveled several times during the cut, and mostly it was turned toward the ceiling mounted TV. In this new salon, the chair did not get moved and was always facing the mirror.

We know that many business establishments make a routine practice of video recording the premises and individual stations as well as parking lots etc. Where else have you observed some “unusual” type of such activity, where it had not been in place before? Fast food restaurants, laundromats, car wash, doctor’s waiting room…?

Think about it. It isn’t just the government, but ostensibly that data will be “available” to the government.

Dirk Praet May 9, 2013 6:49 AM

@ kashmarek

Recording people while having a haircut definitely makes sense. I know a lot of people that have been zealously trying to destroy all 80’s pictures of themselves for fear that they would undermine, if not totally destroy, their credibility or authority in the positions they are currently holding. Never underestimate the power of ridicule.

Demen Ted May 9, 2013 6:53 PM

Warrantless blanket eavesdropping on US civilians is an outrage, for sure.

What about using eavesdropping technologies to help US corporations win contracts in competition with foreign corporations, like Boeing vs Airbus?
Is this constitutional?
Is this usage mandated by the congress or the senate?
Is national security best served by this (mis)usage of intelligence resources?
Is this how we promote free trade and fair competition?
Who is accountable for these decisions?

Buck May 9, 2013 11:49 PM

@MSB: I’m very surprised I had to scroll down so far to see someone mention this! Surely a tip from Russian intelligence would warrant a targeted individual tap on Tamerlan, as well as his family and close acquaintances… I suspect that is what Mr. Clemente of the FBI is referring to. (The NSA is probably a different story, but that is really neither here nor there.)

I’m also very surprised by the number of commenters here who have suggested speech-to-text as a viable option, though I suspect this might be some sort of ‘buzz word’ in Silicon Valley right now… Even if I assume that law enforcement’s capabilities have far surpassed those of my bank’s automated voice menu system (which still works about as poorly for me as it did in 2006), it still sounds like a hopeless endeavor. For argument’s sake, let’s say we do have 100% accuracy for speech to text. How does one ascribe metadata for code speak? Inside jokes? Sarcasm? Good luck!

In reality (barring some spectacular developments in brain decoding), it sounds like we’re going to need millions of people to monitor trillions of phone calls. Actually, hold up here- that might not be a half bad idea 😉 Plenty of citizens are constantly searching for jobs that just don’t exist, and I’d be willing to bet there are more than quite a few seniors who are seeking to retain usefulness after retirement…

I wonder about how the evidence from today’s surveillance tools is used in our courtrooms… Yes, we’ve all seen the grainy photos of the Boston bombers from commercial security cameras, but what about the rest of the data? Will cell tower dumps be used as evidence to place the suspects at the scene of the crime? Who gets to analyze these massive datasets? I suspect the FBI would have all the time they need before prosecution to draw whatever conclusions they so desire… What about the defense? How much time during discovery will they have for analysis and how many data experts can they afford? Will they spot surreptitious individuals carrying concealed weapons closely behind our suspects? What about out of the ordinary contacts initiated to the suspects prior to the events? Could there be lurking evidence of blackmail, coercion, or a downright sinister plot?

Unfortunately, we’ll probably never know… Our constitution does not call for the right to an un-speedy and thorough trial with consideration for the enormous quantities of evidence in today’s technological society.

However, the wave of the future will allow citizens to police citizens.
After all, absolute power corrupts absolutely!
Decentralized power is a way of providing quickly to the needs of those who need it most, while still retaining safeguards against those who seek to take advantage of all of us!

Clive Robinson May 10, 2013 4:25 AM

@ Buck,

For argument’s sake let’s say we do have 100% accuracy for speech to text. How does one ascribe metadata for code speak? Inside jokes? Sarcasm? Good luck!

You don’t need to.

Firstly, have a look at how Google does language conversion: it does not use a dictionary as you or I would do. It uses the statistics of the data it has with regards to the vast tracts of text in various languages. In effect it translates sentences within the context of the document and other documents it has. The result is it does understand slang, jokes and various other nuances of day to day usage of a language.

The problem with people trying to convey other information is that it forms stylised speech at variance to the norm. This changes its statistics just like any other form of steganography, and there are known ways of detecting it as such.

As for speech to text not being 100% reliable, it really does not need to be better than around 50% to get useful results when you are analysing not words but sentences and paragraphs within a general context.

The reason being you need to turn the problem on its head: they are not looking to find a needle in the hay, but to identify hay so it can be thrown away. And this can be done by successive filters. Thus the filters can be very crude, rejecting only one or two percent at each stage, but still producing high quality results (to see why, go and look up how a uranium enrichment process works, where each centrifuge only provides a fractional percentage of enrichment).
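To see how crude filters compound, here is a small sketch. It assumes each stage independently discards a fixed fraction of whatever reaches it and never throws away anything of interest; both are simplifying assumptions, and the function names are made up for illustration.

```python
import math


def remaining_after(stages: int, reject_rate: float) -> float:
    """Fraction of the original traffic left after the given number of stages."""
    return (1.0 - reject_rate) ** stages


def stages_needed(target_fraction: float, reject_rate: float) -> int:
    """Stages required to cut the retained fraction to target_fraction or less."""
    return math.ceil(math.log(target_fraction) / math.log(1.0 - reject_rate))


# With a 2% rejection rate per stage, getting down to one call in a million:
print(stages_needed(1e-6, 0.02))  # 684
```

So even a filter that only throws away two percent of the hay each time gets down to one-in-a-million after several hundred passes, which is the cascade principle.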

Dan May 10, 2013 8:47 PM

Quoting from the final paragraph of Schneier’s blog “I’m very skeptical about Clemente’s comments. He left the FBI shortly after 9/11, and he didn’t have any special security clearances.”

Why do you believe that you would know if this fellow (Clemente) had any special security clearances?

Gulfie May 11, 2013 4:31 PM

Recap :

How much does an average person talk on the phone a day? Assume 1 hour, 3600 seconds.

8KB/sec for an hour is ~30 MB

300 million Americans * ~30 MB = 9000 million million bytes ~ 9 Petabytes / day uncompressed.

9 Petabyte/ day

Given only slightly custom hardware: $60k / PB.

9 PB/day * $60k / PB = $540,000 / day (purchase cost)

Half a million dollars seems like a lot of money, but it is less than a 4 hour flight in a B-2A Stealth bomber, and the DOD does that kind of thing every day.

As for choosing and picking what to listen to, as mentioned before there are already databases that record all the calls… they are called billing databases. Duplicating the feeds and upsizing the storage was done a while ago.

The tape version is even simpler; IBM sells it. Just get one of these IBM TS3500 Tape Libraries. Max system capacity at 3:1 compression is 2700 PB, or about 300 days’ worth of storage at 9 PB / day (I have not requested a quote).

9 PB / day is not that much anymore. 9 PB / day averages out to ~100 GB/sec (roughly 1 Tbit/sec). Even undersea cables are able to do 1 Tbit/sec.

Aggregation and capture of data? I believe that AT&T room 641A answers that question. Yes, and yes. Legal indemnification of Telcos keeps cropping up. (×2953093 ) If they didn’t need it, they wouldn’t be getting it.

Totally doable, almost off the shelf at this point. Trapping the traffic… totally possible, legally muddy. There is nothing to contradict the guy’s story.
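The recap’s arithmetic can be checked with a few lines. This sketch takes 8 KB/sec as 8,000 bytes per second and a petabyte as 10^15 bytes (both assumptions), and reuses the recap’s own inputs; `recap_estimate` is just an illustrative name.

```python
def recap_estimate(people=300_000_000, seconds_per_day=3600,
                   bytes_per_second=8_000, usd_per_pb=60_000,
                   tape_library_pb=2700):
    """Redo the recap's back-of-envelope numbers without rounding."""
    PB = 10**15
    bytes_per_day = people * seconds_per_day * bytes_per_second
    pb_per_day = bytes_per_day / PB
    return {
        "pb_per_day": pb_per_day,
        "usd_per_day": pb_per_day * usd_per_pb,
        # Average rate if the day's volume were spread evenly over 24 hours.
        "gb_per_second": bytes_per_day / 86_400 / 1e9,
        "tbit_per_second": bytes_per_day * 8 / 86_400 / 1e12,
        # How long a library of the quoted capacity would last.
        "tape_library_days": tape_library_pb / pb_per_day,
    }


for name, value in recap_estimate().items():
    print(f"{name}: {value:,.2f}")
```

Unrounded, the daily volume is a little under 9 PB and the purchase cost a little over half a million dollars a day; the average rate is 100 GB/sec, and a 2700 PB library holds roughly 300 days at that volume.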

TvF May 13, 2013 4:31 AM

They won’t be recording ALL calls. I assume they will be more economical about that:
* all burner phones
* all phones registered to foreign nationals
* all phones registered to political/religious activists
* all phones registered to companies and organisations related to politics, defense, explosives, weapons, chemicals etc. (incl. their employees and the employees of direct contractors)

In that case, we arrive at a volume of about 1/5th to 1/10th of the total volume. And that should be manageable.

emk May 14, 2013 12:18 AM

They may well record everything. After all, the political, legal, organizational, even tech effort required to collect a significant subset may not be that different from that required to collect most/all.

As to budget, this would most likely be done across multiple agencies, not just NSA. This would allow the “everybody does it” defence against future constitutionally minded Congresses or Presidents. This would also predispose towards collecting all data, since CIA, DOD, Secret Service, FBI, DEA, FinCEN etc. would all have differing requirements.

It’s axiomatic that the telcos would give maximum cooperation.

Others have done more technical storage requirement analysis but here is mine:

  • Assume 300 million people talking to each other on phones daily
  • Assume each call involves two people
  • Assume they each spend 100 min per day on the phone
  • That’s 150 million 100 min conversations. Add in an allowance of 50 million 100 min conversations with answering machines and automated banking systems etc
  • That’s 200 million 100 min conversations, or 20 billion minutes per day that the NSA would have to record and store.
  • Assume that each minute can be compressed down to 100K
  • That’s 2 billion Megabytes of data per day
  • Or 2 Petabytes per day of storage
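The list above works out as follows. This is just the same arithmetic written down, taking “100K” as 100,000 bytes and a petabyte as 10^15 bytes (assumptions); the function name and parameter names are made up for illustration.

```python
def daily_storage_petabytes(people=300_000_000, minutes_per_person=100,
                            extra_conversations=50_000_000,
                            kb_per_minute=100):
    """Daily storage implied by the assumptions in the list above."""
    # Two people per call, so half as many conversations as people.
    person_to_person = people // 2
    conversations = person_to_person + extra_conversations
    minutes = conversations * minutes_per_person       # 20 billion minutes
    total_kb = minutes * kb_per_minute
    return total_kb * 1_000 / 10**15                   # bytes -> petabytes


print(daily_storage_petabytes())  # 2.0
```

Changing any one assumption scales the result linearly, so halving the talk time or the compressed size halves the daily total.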

I’m not an expert but it seems this would be easily doable.
DOD/NSA probably have their own national backbone, and if they hooked it up to the civilian backbone at say 5 choke points per telco, you are looking at about 30-50 collection points max to get all the data. I don’t think it’s that hard.

BoringPerson May 15, 2013 11:40 AM

Hmm, all this talk of technology may be a bit overblown.

Having worked a bit with govt agencies I find they greatly prefer the easy route. For example, they quietly ask for backdoors into electronic kit whereby encryption keys can be simply retrieved … no banks of decryption computers needed.

The reward for this manufacturer support? One specific case I came across: Simplified and very fast award of export licences, obtained via a phone call rather than via reams of paperwork and endless civil service delays.

Also, don’t forget that the spooks are polite, friendly, often ex-military types who you can have a pleasant chat and cup of coffee with. They are on OUR side. They are ‘friends’. Of course you will help them if they ask for a small favour – why not?

Also, even if the baddies use very clever deception & avoidance techniques something trivial will often catch them out : that pizza order made on a ‘clean’ phone, that speed camera photo, that nosey neighbour who wonders why you have five mobile phones.

Unless you are a loner who detonates a bomb from parts bought on a random unpremeditated impulse that same day from the local hardware store, you stand a very real chance of detection before any crime.

JL May 15, 2013 6:23 PM

I can’t believe no one has mentioned the Wired article from March 2012 about the Utah data center.

Since reading that article, I’ve just sort of assumed that if this isn’t already happening, it will be soon. To build on an earlier comment about Occam’s Razor: do you think that if the NSA could do this, it would? Well, there’s your answer.

AJK June 5, 2013 4:45 PM

Given the fragmented nature of Telcom and the point-to-point encryption options out there it is impossible to gather EVERYTHING.

I think it’s more reasonable to collect Feeds (facebook) and run queries on server farms like yahoo mail and google.

Legality is a totally separate issue.

Alex is right on the fragmented nature of voice.

@Clive: SMS? Yes, it would not be impossible for the major carriers to feed a government data center every message. Having 100% of any particular carrier without dealing with Speech-to-Text seems like a better approach than local voice.

What the Government needs is a private company that knows where you are and what you are doing. This way the government can just look at the company’s data instead of housing it themselves. Oh wait, these companies already exist.


I’m sure

Clive Robinson June 6, 2013 5:55 AM

@ AJK,

You might want to read what the UK’s Guardian newspaper has released today. Basically they have obtained copies of secret US (FISA) court orders showing that a major US phone supplier (Verizon) had to hand over to the FBI & NSA, on a daily basis, all “metadata” for ALL phone calls, which apparently includes all cell tower and GPS location data…

Interestingly it appears that some modern research speech encoders can get down to a 45 baud average data rate with reasonable intelligibility, but with high latency and CPU load. If such systems are known to the likes of the NSA (and as they are supposed to have a 5-15 year lead on such developments we can guess they probably are), and more importantly have been made more “efficient”, then recording the speech content and voice print identification of the speaker are not just possible but fairly routine.

However all of this still remains “front end” issues; it’s the storage organisation, such that it is readily searchable electronically, which becomes the major problem for intel gathering.

We can certainly see that both the FBI and NSA want the metadata including location data to build “contact” and “probable contact” lists alongside those with “suspicious activity” such as websites visited or the use of TOR and encryption. The chances are they will also want to build “exception” lists where data from say payment cards (or even RFIDs in clothing etc) do not agree with other location data, thus tagging “suspicious behaviour” and potentially identifying who has a “burner phone” in their pocket…

Being “ID Clean” to maintain a “legend” is difficult for even the most experienced agents and would approach impossible with even a small increase in surveillance technology such as “door frame NFC” readers that some store chains are seriously considering to get targeted advertising etc. Thus the likes of Walmart could be the organisations you are thinking of.
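For a sense of what the 45 baud figure mentioned above would mean for storage, here is a rough sketch. It treats 45 baud as 45 bits per second (an assumption, since baud strictly counts symbols) and borrows the 20 billion minutes per day estimate from an earlier comment purely as an illustrative input.

```python
def daily_bytes(minutes_per_day: int, bits_per_second: int) -> int:
    """Bytes needed to store a day's calls at a constant encoder bit rate."""
    return minutes_per_day * 60 * bits_per_second // 8


MINUTES = 20_000_000_000          # illustrative figure from an earlier comment

low_rate = daily_bytes(MINUTES, 45)          # hypothetical 45 bit/s encoder
telephone = daily_bytes(MINUTES, 64_000)     # 8 KB/sec uncompressed audio

print(low_rate / 1e12, "TB per day at 45 bit/s")
print(telephone / 1e15, "PB per day at 64 kbit/s")
```

Under these assumptions the low-rate encoding needs terabytes per day rather than petabytes, which is why the storage question shifts from capacity to organisation and searchability.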

noname June 18, 2013 1:34 PM

Wrong question: “Is the U.S. Government Recording and Saving All Domestic Telephone Calls?” – delete “domestic”.


Sidebar photo of Bruce Schneier by Joe MacInnis.