What the NSA Can and Cannot Do
Good summary from the London Review of Books.
Page 37 of 56
Good summary from the London Review of Books.
Jack Goldsmith argues that we need the NSA to surveil the Internet not for terrorism reasons, but for cyberespionage and cybercrime reasons. Daniel Gallington argues—the headline has nothing to do with the content—that the balance between surveillance and privacy is about right.
Here’s a demonstration of the US government’s capabilities to monitor the public Internet. Former CIA and NSA Director Michael Hayden was on the Acela train between New York and Washington DC, taking press interviews on the phone. Someone nearby overheard the conversation, and started tweeting about it. Within 15 or so minutes, someone somewhere noticed the tweets, and informed someone who knew Hayden. That person called Hayden on his cell phone and, presumably, told him to shut up.
Nothing covert here; the tweets were public. But still, wow.
EDITED TO ADD: To clarify, I don’t think this was a result of the NSA monitoring the Internet. I think this was some public relations office—probably the one that is helping General Alexander respond to all the Snowden stories—who is searching the public Twitter feed for, among other things, Hayden’s name.
This is from a Snowden document released by Le Monde:
General Term Descriptions:
HIGHLANDS: Collection from Implants
VAGRANT: Collection of Computer Screens
MAGNETIC: Sensor Collection of Magnetic Emanations
MINERALIZE: Collection from LAN Implant
OCEAN: Optical Collection System for Raster-Based Computer Screens
LIFESAFER: Imaging of the Hard Drive
GENIE: Multi-stage operation: jumping the airgap etc.
BLACKHEART: Collection from an FBI Implant
[…]
DROPMIRE: Passive collection of emanations using antenna
CUSTOMS: Customs opportunities (not LIFESAVER)
DROPMIRE: Laser printer collection, purely proximal access (***NOT*** implanted)
DEWSWEEPER: USB (Universal Serial Bus) hardware host tap that provides COVERT link over US link into a target network. Operates w/RF relay subsystem to provide wireless Bridge into target network.
RADON: Bi-directional host tap that can inject Ethernet packets onto the same targets. Allows bi-directional exploitation of denied networks using standard on-net tools.
There’s a lot to think about in this list. RADON and DEWSWEEPER seem particularly interesting.
We already know the NSA wants to eavesdrop on the Internet. It has secret agreements with telcos to get direct access to bulk Internet traffic. It has massive systems like TUMULT, TURMOIL, and TURBULENCE to sift through it all. And it can identify ciphertext—encrypted information—and figure out which programs could have created it.
But what the NSA wants is to be able to read that encrypted information in as close to real-time as possible. It wants backdoors, just like the cybercriminals and less benevolent governments do.
And we have to figure out how to make it harder for them, or anyone else, to insert those backdoors.
How the NSA Gets Its Backdoors
The FBI tried to get backdoor access embedded in an AT&T secure telephone system in the mid-1990s. The Clipper Chip included something called a LEAF: a Law Enforcement Access Field. It was the key used to encrypt the phone conversation, itself encrypted in a special key known to the FBI, and it was transmitted along with the phone conversation. An FBI eavesdropper could intercept the LEAF and decrypt it, then use the data to eavesdrop on the phone call.
But the Clipper Chip faced severe backlash, and became defunct a few years after being announced.
Having lost that public battle, the NSA decided to get its backdoors through subterfuge: by asking nicely, pressuring, threatening, bribing, or mandating through secret order. The general name for this program is BULLRUN.
Defending against these attacks is difficult. We know from subliminal channel and kleptography research that it’s pretty much impossible to guarantee that a complex piece of software isn’t leaking secret information. We know from Ken Thompson’s famous talk on “trusting trust” (first delivered in the ACM Turing Award Lectures) that you can never be totally sure if there’s a security flaw in your software.
Since BULLRUN became public last month, the security community has been examining security flaws discovered over the past several years, looking for signs of deliberate tampering. The Debian random number flaw was probably not deliberate, but the 2003 Linux security vulnerability probably was. The DUAL_EC_DRBG random number generator may or may not have been a backdoor. The SSL 2.0 flaw was probably an honest mistake. The GSM A5/1 encryption algorithm was almost certainly deliberately weakened. All the common RSA moduli out there in the wild: we don’t know. Microsoft’s _NSAKEY looks like a smoking gun, but honestly, we don’t know.
How the NSA Designs Backdoors
While a separate program that sends our data to some IP address somewhere is certainly how any hacker—from the lowliest script kiddie up to the NSA—spies on our computers, it’s too labor-intensive to work in the general case.
For government eavesdroppers like the NSA, subtlety is critical. In particular, three characteristics are important:
These characteristics imply several things:
Design Strategies for Defending against Backdoors
With these principles in mind, we can list design strategies. None of them is foolproof, but they are all useful. I’m sure there’s more; this list isn’t meant to be exhaustive, nor the final word on the topic. It’s simply a starting place for discussion. But it won’t work unless customers start demanding software with this sort of transparency.
This is a hard problem. We don’t have any technical controls that protect users from the authors of their software.
And the current state of software makes the problem even harder: Modern apps chatter endlessly on the Internet, providing noise and cover for covert communications. Feature bloat provides a greater “attack surface” for anyone wanting to install a backdoor.
In general, what we need is assurance: methodologies for ensuring that a piece of software does what it’s supposed to do and nothing more. Unfortunately, we’re terrible at this. Even worse, there’s not a lot of practical research in this area—and it’s hurting us badly right now.
Yes, we need legal prohibitions against the NSA trying to subvert authors and deliberately weaken cryptography. But this isn’t just about the NSA, and legal controls won’t protect against those who don’t follow the law and ignore international agreements. We need to make their job harder by increasing their risk of discovery. Against a risk-averse adversary, it might be good enough.
This essay previously appeared on Wired.com.
EDITED TO ADD: I am looking for other examples of known or plausible instances of intentional vulnerabilities for a paper I am writing on this topic. If you can think of an example, please post a description and reference in the comments below. Please explain why you think the vulnerability could be intentional. Thank you.
Historically, surveillance was difficult and expensive.
Over the decades, as technology advanced, surveillance became easier and easier. Today, we find ourselves in a world of ubiquitous surveillance, where everything is collected, saved, searched, correlated and analyzed.
But while technology allowed for an increase in both corporate and government surveillance, the private and public sectors took very different paths to get there. The former always collected information about everyone, but over time, collected more and more of it, while the latter always collected maximal information, but over time, collected it on more and more people.
Corporate surveillance has been on a path from minimal to maximal information. Corporations always collected information on everyone they could, but in the past they didn’t collect very much of it and only held it as long as necessary. When surveillance information was expensive to collect and store, companies made do with as little as possible.
Telephone companies collected long-distance calling information because they needed it for billing purposes. Credit cards collected only the information about their customers’ transactions that they needed for billing. Stores hardly ever collected information about their customers, maybe some personal preferences, or name-and-address for advertising purposes. Even Google, back in the beginning, collected far less information about its users than it does today.
As technology improved, corporations were able to collect more. As the cost of data storage became cheaper, they were able to save more data and for a longer time. And as big data analysis tools became more powerful, it became profitable to save more. Today, almost everything is being saved by someone—probably forever.
Examples are everywhere. Internet companies like Google, Facebook, Amazon and Apple collect everything we do online at their sites. Third-party cookies allow those companies, and others, to collect data on us wherever we are on the Internet. Store affinity cards allow merchants to track our purchases. CCTV and aerial surveillance combined with automatic face recognition allow companies to track our movements; so does your cell phone. The Internet will facilitate even more surveillance, by more corporations for more purposes.
On the government side, surveillance has been on a path from individually targeted to broadly collected. When surveillance was manual and expensive, it could only be justified in extreme cases. The warrant process limited police surveillance, and resource restraints and the risk of discovery limited national intelligence surveillance. Specific individuals were targeted for surveillance, and maximal information was collected on them alone.
As technology improved, the government was able to implement ever-broadening surveillance. The National Security Agency could surveil groups—the Soviet government, the Chinese diplomatic corps, etc.—not just individuals. Eventually, they could spy on entire communications trunks.
Now, instead of watching one person, the NSA can monitor “three hops” away from that person—an ever widening network of people not directly connected to the person under surveillance. Using sophisticated tools, the NSA can surveil broad swaths of the Internet and phone network.
Governments have always used their authority to piggyback on corporate surveillance. Why should they go through the trouble of developing their own surveillance programs when they could just ask corporations for the data? For example we just learned that the NSA collects e-mail, IM and social networking contact lists for millions of Internet users worldwide.
But as corporations started collecting more information on populations, governments started demanding that data. Through National Security Letters, the FBI can surveil huge groups of people without obtaining a warrant. Through secret agreements, the NSA can monitor the entire Internet and telephone networks.
This is a huge part of the public-private surveillance partnership.
The result of all this is we’re now living in a world where both corporations and governments have us all under pretty much constant surveillance.
Data is a byproduct of the information society. Every interaction we have with a computer creates a transaction record, and we interact with computers hundreds of times a day. Even if we don’t use a computer—buying something in person with cash, say—the merchant uses a computer, and the data flows into the same system. Everything we do leaves a data shadow, and that shadow is constantly under surveillance.
Data is also a byproduct of information society socialization, whether it be e-mail, instant messages or conversations on Facebook. Conversations that used to be ephemeral are now recorded, and we are all leaving digital footprints wherever we go.
Moore’s law has made computing cheaper. All of us have made computing ubiquitous. And because computing produces data, and that data equals surveillance, we have created a world of ubiquitous surveillance.
Now we need to figure out what to do about it. This is more than reining in the NSA or fining a corporation for the occasional data abuse. We need to decide whether our data is a shared societal resource, a part of us that is inherently ours by right, or a private good to be bought and sold.
Writing in the Guardian, Chris Huhn said that “information is power, and the necessary corollary is that privacy is freedom.” How this interplay between power and freedom play out in the information age is still to be determined.
This essay previously appeared on CNN.com.
EDITED TO ADD (11/14): Richard Stallman’s comments on the subject.
A new Snowden document shows that the NSA is harvesting contact lists—e-mail address books, IM buddy lists, etc.—from Google, Yahoo, Microsoft, Facebook, and others.
Unlike PRISM, this unnamed program collects the data from the Internet . This is similar to how the NSA identifies Tor users. They get direct access to the Internet backbone, either through secret agreements with companies like AT&T, or surreptitiously, by doing things like tapping undersea cables. Once they have the data, they have powerful packet inspectors—code names include TUMULT, TURBULENCE, and TURMOIL—that run a bunch of different identification and copying systems. One of them, code name unknown, searches for these contact lists and copies them. Google, Yahoo, Microsoft, etc., have no idea that this is happening, nor have they consented to their data being harvested in this way.
These contact lists provide the NSA with the same sort of broad surveillance that the Verizon (and others) phone-record “metadata” collection programs provide: information about who are our friends, lovers, confidants, associates. This is incredibly intimate information, all collected without any warrant or due process. Metadata equals surveillance; always remember that.
The quantities are interesting:
During a single day last year, the NSA’s Special Source Operations branch collected 444,743 e-mail address books from Yahoo, 105,068 from Hotmail, 82,857 from Facebook, 33,697 from Gmail and 22,881 from unspecified other providers….
Note that Gmail, which uses SSL by default, provides the NSA with much less data than Yahoo, which doesn’t, despite the fact that Gmail has many more users than Yahoo does. (It’s actually kind of amazing how small that Gmail number is.) This implies that, despite BULLRUN, encryption works. Ubiquitous use of SSL can foil NSA eavesdropping. This is the same lesson we learned from the NSA’s attempts to break Tor: encryption works.
In response to this story, Yahoo has finally decided to enable SSL by default: by January 2014.
One more amusing bit: the NSA has a spam problem.
Spam has proven to be a significant problem for the NSA—clogging databases with information that holds no foreign intelligence value. The majority of all e-mails, one NSA document says, “are SPAM from ‘fake addresses and never ‘delivered’ to targets.”
The Washington Post published three NSA documents to support this article.
EDITED TO ADD: The New York Times makes this observation:
Spokesmen for the eavesdropping organizations reassured The Post that we shouldn’t bother our heads with all of this. They have “checks and balances built into our tools,” said one intelligence official.
Since the Snowden leaks began, the administration has adopted an interesting definition of that term. It used to be that “checks and balances” referred to one branch of the government checking and balancing the other branches—like the Supreme Court deciding whether laws are constitutional.
Now the N.S.A., the C.I.A. and the White House use the term to refer to a secret organization reviewing the actions it has taken and deciding in secret by itself whether they were legal and constitutional.
In one of the documents recently released by the NSA as a result of an EFF lawsuit, there’s discussion of a specific capability of a call records database to identify disposable “burner” phones.
Let’s consider, then, the very specific data this query tool was designed to return: The times and dates of the first and last call events, but apparently not the times and dates of calls between those endpoints. In other words, this tool is supporting analytic software that only cares when a phone went online, and when it stopped being used. It also gets the total number of calls, and the ratio of unique contacts to calls, but not the specific numbers contacted. Why, exactly, would this limited set of information be useful? And why, in particular, might you want to compare that information across a large number of phones there’s not yet any particular reason to suspect?
One possibility that jumps out at me—and perhaps anyone else who’s a fan of The Wire—is that this is the kind of information you would want if you were trying to identify disposable prepaid “burner” phones being used by a target who routinely cycles through cell phones as a countersurveillance tactic. The number of unique contacts and call/contact ratio would act as a kind of rough fingerprint—you’d assume a phone being used for dedicated clandestine purposes to be fairly consistent on that score—while the first/last call dates help build a timeline: You’re looking for a series of phones that are used for a standard amount of time, and then go dead just as the next phone goes online.
Consider this another illustration of the value of metadata.
As I recently reported in the Guardian, the NSA has secret servers on the Internet that hack into other computers, codename FOXACID. These servers provide an excellent demonstration of how the NSA approaches risk management, and exposes flaws in how the agency thinks about the secrecy of its own programs.
Here are the FOXACID basics: By the time the NSA tricks a target into visiting one of those servers, it already knows exactly who that target is, who wants him eavesdropped on, and the expected value of the data it hopes to receive. Based on that information, the server can automatically decide what exploit to serve the target, taking into account the risks associated with attacking the target, as well as the benefits of a successful attack. According to a top-secret operational procedures manual provided by Edward Snowden, an exploit named Validator might be the default, but the NSA has a variety of options. The documentation mentions United Rake, Peddle Cheap, Packet Wrench, and Beach Head—all delivered from a FOXACID subsystem called Ferret Cannon. Oh how I love some of these code names. (On the other hand, EGOTISTICALGIRAFFE has to be the dumbest code name ever.)
Snowden explained this to Guardian reporter Glenn Greenwald in Hong Kong. If the target is a high-value one, FOXACID might run a rare zero-day exploit that it developed or purchased. If the target is technically sophisticated, FOXACID might decide that there’s too much chance for discovery, and keeping the zero-day exploit a secret is more important. If the target is a low-value one, FOXACID might run an exploit that’s less valuable. If the target is low-value and technically sophisticated, FOXACID might even run an already-known vulnerability.
We know that the NSA receives advance warning from Microsoft of vulnerabilities that will soon be patched; there’s not much of a loss if an exploit based on that vulnerability is discovered. FOXACID has tiers of exploits it can run, and uses a complicated trade-off system to determine which one to run against any particular target.
This cost-benefit analysis doesn’t end at successful exploitation. According to Snowden, the TAO—that’s Tailored Access Operations—operators running the FOXACID system have a detailed flowchart, with tons of rules about when to stop. If something doesn’t work, stop. If they detect a PSP, a personal security product, stop. If anything goes weird, stop. This is how the NSA avoids detection, and also how it takes mid-level computer operators and turn them into what they call “cyberwarriors.” It’s not that they’re skilled hackers, it’s that the procedures do the work for them.
And they’re super cautious about what they do.
While the NSA excels at performing this cost-benefit analysis at the tactical level, it’s far less competent at doing the same thing at the policy level. The organization seems to be good enough at assessing the risk of discovery—for example, if the target of an intelligence-gathering effort discovers that effort—but to have completely ignored the risks of those efforts becoming front-page news.
It’s not just in the U.S., where newspapers are heavy with reports of the NSA spying on every Verizon customer, spying on domestic e-mail users, and secretly working to cripple commercial cryptography systems, but also around the world, most notably in Brazil, Belgium, and the European Union. All of these operations have caused significant blowback—for the NSA, for the U.S., and for the Internet as a whole.
The NSA spent decades operating in almost complete secrecy, but those days are over. As the corporate world learned years ago, secrets are hard to keep in the information age, and openness is a safer strategy. The tendency to classify everything means that the NSA won’t be able to sort what really needs to remain secret from everything else. The younger generation is more used to radical transparency than secrecy, and is less invested in the national security state. And whistleblowing is the civil disobedience of our time.
At this point, the NSA has to assume that all of its operations will become public, probably sooner than it would like. It has to start taking that into account when weighing the costs and benefits of those operations. And it now has to be just as cautious about new eavesdropping operations as it is about using FOXACID exploits attacks against users.
This essay previously appeared in the Atlantic.
This is a video of me talking about surveillance and privacy, both relating to the NSA and more generally.
EDITED TO ADD (10/13): YouTube link is more reliable.
Sidebar photo of Bruce Schneier by Joe MacInnis.