NSA Eavesdropping
This is the line that’s done best for me on the radio: “The NSA would like to remind everyone to call their mothers this Sunday. They need to calibrate their system.”
Page 55 of 58
This is the line that’s done best for me on the radio: “The NSA would like to remind everyone to call their mothers this Sunday. They need to calibrate their system.”
Interesting first-person account of someone on the U.S. Terrorist Watch List:
To sum up, if you run afoul of the nation’s “national security” apparatus, you’re completely on your own. There are no firm rules, no case law, no real appeals processes, no normal array of Constitutional rights, no lawyers to help, and generally none of the other things that we as American citizens expect to be able to fall back on when we’ve been (justly or unjustly) identified by the government as wrong-doers.
Technology Review has an interesting article discussing some of the technologies used by the NSA in its warrantless wiretapping program, some of them from the killed Total Information Awareness (TIA) program.
Washington’s lawmakers ostensibly killed the TIA project in Section 8131 of the Department of Defense Appropriations Act for fiscal 2004. But legislators wrote a classified annex to that document which preserved funding for TIA’s component technologies, if they were transferred to other government agencies, say sources who have seen the document, according to reports first published in The National Journal. Congress did stipulate that those technologies should only be used for military or foreign intelligence purposes against non-U.S. citizens. Still, while those component projects’ names were changed, their funding remained intact, sometimes under the same contracts.
Thus, two principal components of the overall TIA project have migrated to the Advanced Research and Development Activity (ARDA), which is housed somewhere among the 60-odd buildings of “Crypto City,” as NSA headquarters in Fort Meade, MD, is nicknamed. One of the TIA components that ARDA acquired, the Information Awareness Prototype System, was the core architecture that would have integrated all the information extraction, analysis, and dissemination tools developed under TIA. According to The National Journal, it was renamed “Basketball.” The other, Genoa II, used information technologies to help analysts and decision makers anticipate and pre-empt terrorist attacks. It was renamed “Topsail.”
Larry Beinhart makes an interesting case for the elimination of most government secrecy.
He has a good argument, although I think the issue is a bit more complicated.
Excellent blog post. Well worth reading.
EDITED TO ADD (4/25): Stuntz responds to Solove’s post. And Solove responds to Stuntz’s response to Solove’s response to, um, to Stuntz’s original essay.
Last year, the Department of Homeland Security finally got around to appointing its DHS Data Privacy and Integrity Advisory Committee. It was mostly made up of industry insiders instead of anyone with any real privacy experience. (Lance Hoffman from George Washington University was the most notable exception.)
And now, we have something from that committee. On March 7th they published their “Framework for Privacy Analysis of Programs, Technologies, and Applications.”
This document sets forth a recommended framework for analyzing programs, technologies, and applications in light of their effects on privacy and related interests. It is intended as guidance for the Data Privacy and Integrity Advisory Committee (the Committee) to the U.S. Department of Homeland Security (DHS). It may also be useful to the DHS Privacy Office, other DHS components, and other governmental entities that are seeking to reconcile personal data-intensive programs and activities with important social and human values.
It’s surprisingly good.
I like that it is a series of questions a program manager has to answer: about the legal basis for the program, its efficacy against the threat, and its effects on privacy. I am particularly pleased that their questions on pages 3-4 are very similar to the “five steps” I wrote about in Beyond Fear. I am thrilled that the document takes a “trade-off” approach; the last question asks: “Should the program proceed? Do the benefits of the program…justify the costs to privacy interests….?”
I think this is a good starting place for any technology or program with respect to security and privacy. And I hope the DHS actually follows the recommendations in this report.
Really interesting article by Robert X. Cringely on the lack of federal funding for security technologies.
After the 9-11 terrorist attacks, the United States threw its considerable fortune into the War on Terror, of which a large component was Homeland Security. We conducted a couple wars abroad, both of which still seem to be going on, and took a vast domestic security bureaucracy and turned it into a different and even more vast domestic security bureaucracy. We could argue all day about whether or not America is more secure as a result of these changes, but we’d all agree that a lot of money has been spent. In fact, from a pragmatic point of view, ALL the money has been spent, and that’s the point of this particular column. For a variety of reasons, there is no money left to spend on homeland security none, nada, zilch. We’re busted.
I think his assessment is spot on.
In the post 9/11 world, there’s much focus on connecting the dots. Many believe that data mining is the crystal ball that will enable us to uncover future terrorist plots. But even in the most wildly optimistic projections, data mining isn’t tenable for that purpose. We’re not trading privacy for security; we’re giving up privacy and getting no security in return.
Most people first learned about data mining in November 2002, when news broke about a massive government data mining program called Total Information Awareness. The basic idea was as audacious as it was repellent: suck up as much data as possible about everyone, sift through it with massive computers, and investigate patterns that might indicate terrorist plots. Americans across the political spectrum denounced the program, and in September 2003, Congress eliminated its funding and closed its offices.
But TIA didn’t die. According to The National Journal, it just changed its name and moved inside the Defense Department.
This shouldn’t be a surprise. In May 2004, the General Accounting Office published a report that listed 122 different federal government data mining programs that used people’s personal information. This list didn’t include classified programs, like the NSA’s eavesdropping effort, or state-run programs like MATRIX.
The promise of data mining is compelling, and convinces many. But it’s wrong. We’re not going to find terrorist plots through systems like this, and we’re going to waste valuable resources chasing down false alarms. To understand why, we have to look at the economics of the system.
Security is always a trade-off, and for a system to be worthwhile, the advantages have to be greater than the disadvantages. A national security data mining program is going to find some percentage of real attacks, and some percentage of false alarms. If the benefits of finding and stopping those attacks outweigh the cost—in money, liberties, etc.—then the system is a good one. If not, then you’d be better off spending that cost elsewhere.
Data mining works best when there’s a well-defined profile you’re searching for, a reasonable number of attacks per year, and a low cost of false alarms. Credit card fraud is one of data mining’s success stories: all credit card companies data mine their transaction databases, looking for spending patterns that indicate a stolen card. Many credit card thieves share a pattern—purchase expensive luxury goods, purchase things that can be easily fenced, etc.—and data mining systems can minimize the losses in many cases by shutting down the card. In addition, the cost of false alarms is only a phone call to the cardholder asking him to verify a couple of purchases. The cardholders don’t even resent these phone calls—as long as they’re infrequent—so the cost is just a few minutes of operator time.
Terrorist plots are different. There is no well-defined profile, and attacks are very rare. Taken together, these facts mean that data mining systems won’t uncover any terrorist plots until they are very accurate, and that even very accurate systems will be so flooded with false alarms that they will be useless.
All data mining systems fail in two different ways: false positives and false negatives. A false positive is when the system identifies a terrorist plot that really isn’t one. A false negative is when the system misses an actual terrorist plot. Depending on how you “tune” your detection algorithms, you can err on one side or the other: you can increase the number of false positives to ensure that you are less likely to miss an actual terrorist plot, or you can reduce the number of false positives at the expense of missing terrorist plots.
To reduce both those numbers, you need a well-defined profile. And that’s a problem when it comes to terrorism. In hindsight, it was really easy to connect the 9/11 dots and point to the warning signs, but it’s much harder before the fact. Certainly, there are common warning signs that many terrorist plots share, but each is unique, as well. The better you can define what you’re looking for, the better your results will be. Data mining for terrorist plots is going to be sloppy, and it’s going to be hard to find anything useful.
Data mining is like searching for a needle in a haystack. There are 900 million credit cards in circulation in the United States. According to the FTC September 2003 Identity Theft Survey Report, about 1% (10 million) cards are stolen and fraudulently used each year. Terrorism is different. There are trillions of connections between people and events—things that the data mining system will have to “look at”—and very few plots. This rarity makes even accurate identification systems useless.
Let’s look at some numbers. We’ll be optimistic. We’ll assume the system has a 1 in 100 false positive rate (99% accurate), and a 1 in 1,000 false negative rate (99.9% accurate).
Assume one trillion possible indicators to sift through: that’s about ten events—e-mails, phone calls, purchases, web surfings, whatever—per person in the U.S. per day. Also assume that 10 of them are actually terrorists plotting.
This unrealistically-accurate system will generate one billion false alarms for every real terrorist plot it uncovers. Every day of every year, the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Raise that false-positive accuracy to an absurd 99.9999% and you’re still chasing 2,750 false alarms per day—but that will inevitably raise your false negatives, and you’re going to miss some of those ten real plots.
This isn’t anything new. In statistics, it’s called the “base rate fallacy,” and it applies in other domains as well. For example, even highly accurate medical tests are useless as diagnostic tools if the incidence of the disease is rare in the general population. Terrorist attacks are also rare, any “test” is going to result in an endless stream of false alarms.
This is exactly the sort of thing we saw with the NSA’s eavesdropping program: the New York Times reported that the computers spat out thousands of tips per month. Every one of them turned out to be a false alarm.
And the cost was enormous: not just the cost of the FBI agents running around chasing dead-end leads instead of doing things that might actually make us safer, but also the cost in civil liberties. The fundamental freedoms that make our country the envy of the world are valuable, and not something that we should throw away lightly.
Data mining can work. It helps Visa keep the costs of fraud down, just as it helps Amazon.com show me books that I might want to buy, and Google show me advertising I’m more likely to be interested in. But these are all instances where the cost of false positives is low—a phone call from a Visa operator, or an uninteresting ad—and in systems that have value even if there is a high number of false negatives.
Finding terrorism plots is not a problem that lends itself to data mining. It’s a needle-in-a-haystack problem, and throwing more hay on the pile doesn’t make that problem any easier. We’d be far better off putting people in charge of investigating potential plots and letting them direct the computers, instead of putting the computers in charge and letting them decide who should be investigated.
This essay originally appeared on Wired.com.
I like this idea:
I had to sign a tedious business contract the other day. They wanted my corporation number—fair enough—plus my Social Security number—well, if you insist—and also my driver’s license number—hang on, what’s the deal with that?
Well, we e-mailed over a query and they e-mailed back that it was a requirement of the Patriot Act. So we asked where exactly in the Patriot Act could this particular requirement be found and, after a bit of a delay, we got an answer.
And on discovering that there was no mention of driver’s licenses in that particular subsection, I wrote back that we have a policy of reporting all erroneous invocations of the Patriot Act to the Department of Homeland Security on the grounds that such invocations weaken the rationale for the act, and thereby undermine public support for genuine anti-terrorism measures and thus constitute a threat to America’s national security.
And about 10 minutes after that the guy sent back an e-mail saying he didn’t need the driver’s license number after all.
This is a pretty good commentary on the issue.
(I’ve said in the past that the real security problem here is the transparency of the process.)
Sidebar photo of Bruce Schneier by Joe MacInnis.