Entries Tagged "false negatives"

Data Mining for Terrorists

In the post 9/11 world, there’s much focus on connecting the dots. Many believe that data mining is the crystal ball that will enable us to uncover future terrorist plots. But even in the most wildly optimistic projections, data mining isn’t tenable for that purpose. We’re not trading privacy for security; we’re giving up privacy and getting no security in return.

Most people first learned about data mining in November 2002, when news broke about a massive government data mining program called Total Information Awareness. The basic idea was as audacious as it was repellent: suck up as much data as possible about everyone, sift through it with massive computers, and investigate patterns that might indicate terrorist plots. Americans across the political spectrum denounced the program, and in September 2003, Congress eliminated its funding and closed its offices.

But TIA didn’t die. According to The National Journal, it just changed its name and moved inside the Defense Department.

This shouldn’t be a surprise. In May 2004, the General Accounting Office published a report that listed 122 different federal government data mining programs that used people’s personal information. This list didn’t include classified programs, like the NSA’s eavesdropping effort, or state-run programs like MATRIX.

The promise of data mining is compelling, and convinces many. But it’s wrong. We’re not going to find terrorist plots through systems like this, and we’re going to waste valuable resources chasing down false alarms. To understand why, we have to look at the economics of the system.

Security is always a trade-off, and for a system to be worthwhile, the advantages have to be greater than the disadvantages. A national security data mining program is going to find some percentage of real attacks, and some percentage of false alarms. If the benefits of finding and stopping those attacks outweigh the cost—in money, liberties, etc.—then the system is a good one. If not, then you’d be better off spending that cost elsewhere.

Data mining works best when there’s a well-defined profile you’re searching for, a reasonable number of attacks per year, and a low cost of false alarms. Credit card fraud is one of data mining’s success stories: all credit card companies data mine their transaction databases, looking for spending patterns that indicate a stolen card. Many credit card thieves share a pattern—purchase expensive luxury goods, purchase things that can be easily fenced, etc.—and data mining systems can minimize the losses in many cases by shutting down the card. In addition, the cost of false alarms is only a phone call to the cardholder asking him to verify a couple of purchases. The cardholders don’t even resent these phone calls—as long as they’re infrequent—so the cost is just a few minutes of operator time.
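
To make that concrete, here is a minimal sketch of the kind of rule such a transaction-scoring system might apply. It is not any real issuer's algorithm; the field names, thresholds, and "risky" categories are invented for the example.

```python
# Toy illustration of card-fraud scoring -- not any real issuer's algorithm.
# Field names, thresholds, and "risky" categories are invented for the example.
from statistics import mean, pstdev

RISKY_CATEGORIES = {"jewelry", "electronics", "gift cards"}

def flag_suspicious(history, new_txn, z_cutoff=3.0):
    """Flag a transaction far outside the cardholder's usual spending,
    or one in a category of goods that is easy to fence."""
    amounts = [t["amount"] for t in history]
    mu, sigma = mean(amounts), pstdev(amounts) or 1.0
    unusual_amount = (new_txn["amount"] - mu) / sigma > z_cutoff
    return unusual_amount or new_txn["category"] in RISKY_CATEGORIES

history = [{"amount": a, "category": "groceries"} for a in (42, 18, 65, 30, 55)]
print(flag_suspicious(history, {"amount": 2400, "category": "jewelry"}))   # True
print(flag_suspicious(history, {"amount": 37, "category": "groceries"}))   # False
```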

Terrorist plots are different. There is no well-defined profile, and attacks are very rare. Taken together, these facts mean that data mining systems won’t uncover any terrorist plots until they are very accurate, and that even very accurate systems will be so flooded with false alarms that they will be useless.

All data mining systems fail in two different ways: false positives and false negatives. A false positive is when the system identifies a terrorist plot that really isn’t one. A false negative is when the system misses an actual terrorist plot. Depending on how you “tune” your detection algorithms, you can err on one side or the other: you can increase the number of false positives to ensure that you are less likely to miss an actual terrorist plot, or you can reduce the number of false positives at the expense of missing terrorist plots.
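
A toy simulation makes the tuning trade-off visible. The score distributions below are invented, but the pattern holds for any detector that sets a threshold on some risk score: lowering the threshold buys fewer missed plots at the price of more false alarms.

```python
# Toy demonstration of tuning a detector: lowering the threshold trades
# false negatives for false positives. Scores are invented for the example.
import random

random.seed(0)
innocent = [random.gauss(0.3, 0.15) for _ in range(10_000)]  # innocent events
plots = [random.gauss(0.7, 0.15) for _ in range(10)]         # real plots

for threshold in (0.5, 0.7, 0.9):
    false_positives = sum(score >= threshold for score in innocent)
    false_negatives = sum(score < threshold for score in plots)
    print(f"threshold {threshold}: {false_positives} false alarms, "
          f"{false_negatives} missed plots")
```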

To reduce both those numbers, you need a well-defined profile. And that’s a problem when it comes to terrorism. In hindsight, it was really easy to connect the 9/11 dots and point to the warning signs, but it’s much harder before the fact. Certainly, there are common warning signs that many terrorist plots share, but each is unique, as well. The better you can define what you’re looking for, the better your results will be. Data mining for terrorist plots is going to be sloppy, and it’s going to be hard to find anything useful.

Data mining is like searching for a needle in a haystack. There are 900 million credit cards in circulation in the United States. According to the FTC September 2003 Identity Theft Survey Report, about 1% of them (10 million cards) are stolen and fraudulently used each year. Terrorism is different. There are trillions of connections between people and events—things that the data mining system will have to “look at”—and very few plots. This rarity makes even accurate identification systems useless.

Let’s look at some numbers. We’ll be optimistic. We’ll assume the system has a 1 in 100 false positive rate (99% accurate), and a 1 in 1,000 false negative rate (99.9% accurate).

Assume one trillion possible indicators to sift through: that’s about ten events—e-mails, phone calls, purchases, web surfings, whatever—per person in the U.S. per day. Also assume that 10 of them are actually terrorists plotting.

This unrealistically-accurate system will generate one billion false alarms for every real terrorist plot it uncovers. Every day of every year, the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Raise that false-positive accuracy to an absurd 99.9999% and you’re still chasing 2,750 false alarms per day—but that will inevitably raise your false negatives, and you’re going to miss some of those ten real plots.
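
For readers who want to check the arithmetic, here is the same back-of-the-envelope calculation, using only the numbers given above.

```python
# The essay's back-of-the-envelope numbers, reproduced as arithmetic.
events_per_year = 1_000_000_000_000      # one trillion indicators
real_plots = 10
false_positive_rate = 1 / 100            # "99% accurate"
false_negative_rate = 1 / 1_000          # "99.9% accurate"

false_alarms = (events_per_year - real_plots) * false_positive_rate
plots_found = real_plots * (1 - false_negative_rate)

print(f"{false_alarms:,.0f} false alarms per year")                      # ~10 billion
print(f"{false_alarms / 365:,.0f} alarms to investigate per day")        # ~27 million
print(f"{false_alarms / plots_found:,.0f} false alarms per plot found")  # ~1 billion

# Even with an absurd 99.9999% false-positive accuracy:
print(f"{events_per_year * (1 / 1_000_000) / 365:,.0f} false alarms per day")  # ~2,740
```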

This isn’t anything new. In statistics, it’s called the “base rate fallacy,” and it applies in other domains as well. For example, even highly accurate medical tests are useless as diagnostic tools if the incidence of the disease is rare in the general population. Because terrorist attacks are also rare, any “test” is going to result in an endless stream of false alarms.
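
A quick Bayes’ theorem calculation shows the effect. The disease prevalence and test accuracy below are illustrative numbers chosen for the sketch, not figures from the essay or any study.

```python
# Base-rate fallacy with a hypothetical medical test: 99% sensitive and
# 99% specific, for a disease affecting 1 in 10,000 people.
# These numbers are illustrative, not from the essay.
prevalence = 1 / 10_000
sensitivity = 0.99    # P(test positive | sick)
specificity = 0.99    # P(test negative | healthy)

p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
p_sick_given_positive = sensitivity * prevalence / p_positive

print(f"P(sick | positive test) = {p_sick_given_positive:.1%}")  # about 1%
```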

This is exactly the sort of thing we saw with the NSA’s eavesdropping program: the New York Times reported that the computers spat out thousands of tips per month. Every one of them turned out to be a false alarm.

And the cost was enormous: not just the cost of the FBI agents running around chasing dead-end leads instead of doing things that might actually make us safer, but also the cost in civil liberties. The fundamental freedoms that make our country the envy of the world are valuable, and not something that we should throw away lightly.

Data mining can work. It helps Visa keep the costs of fraud down, just as it helps Amazon.com show me books that I might want to buy, and Google show me advertising I’m more likely to be interested in. But these are all instances where the cost of false positives is low—a phone call from a Visa operator, or an uninteresting ad—and in systems that have value even if there is a high number of false negatives.

Finding terrorism plots is not a problem that lends itself to data mining. It’s a needle-in-a-haystack problem, and throwing more hay on the pile doesn’t make that problem any easier. We’d be far better off putting people in charge of investigating potential plots and letting them direct the computers, instead of putting the computers in charge and letting them decide who should be investigated.

This essay originally appeared on Wired.com.

Posted on March 9, 2006 at 7:44 AM

Automatic Lie Detector

Coming soon to airports:

Tested in Russia, the two-stage GK-1 voice analyser requires that passengers don headphones at a console and answer “yes” or “no” into a microphone to questions about whether they are planning something illicit.

The software will almost always pick up uncontrollable tremors in the voice that give away liars or those with something to hide, say its designers at Israeli firm Nemesysco.

Fascinating.

In general, I prefer security systems that are invasive yet anonymous to ones that are based on massive databases. And automatic systems that divide people into “probably fine” and “investigate a bit more” categories seem like a good use of technology. I have no idea, though, whether this system works (there is a lot of evidence that it does not), what the false positive and false negative rates are (the article cites a 12% false positive rate, a figure that is useless on its own), or how easy it would be to learn to fool the system. And in all of these trade-off discussions, the devil is in the details.

Posted on November 21, 2005 at 8:07 AM

The Emergence of a Global Infrastructure for Mass Registration and Surveillance

The International Campaign Against Mass Surveillance has issued a report (dated April 2005): “The Emergence of a Global Infrastructure for Mass Registration and Surveillance.” It’s a chilling assessment of the current international trends towards global surveillance. Most of it you will have seen before, although it’s good to have everything in one place. I am particularly pleased that the report explicitly states that these measures do not make us any safer, but only create the illusion of security.

The global surveillance initiatives that governments have embarked upon do not make us more secure. They create only the illusion of security.

Sifting through an ocean of information with a net of bias and faulty logic, they yield outrageous numbers of false positives and false negatives. The dragnet approach might make the public feel that something is being done, but the dragnet is easily circumvented by determined terrorists who are either not known to authorities, or who use identity theft to evade them.

For the statistically large number of people that will be wrongly identified or wrongly assessed as a risk under the system, the consequences can be dire.

At the same time, the democratic institutions and protections, which would be the safeguards of individuals’ personal security, are being weakened. And national sovereignty and the ability of national governments to protect citizens against the actions of other states (when they are willing) are being compromised as security functions become more and more deeply integrated.

The global surveillance dragnet diverts crucial resources and efforts away from the kind of investments that would make people safer. What is required is good information about specific threats, not crude racial profiling and useless information on the nearly 100 percent of the population that poses no threat whatsoever.

Posted on April 29, 2005 at 8:54 AM

Failures of Airport Screening

According to the AP:

Security at American airports is no better under federal control than it was before the Sept. 11 attacks, a congressman says two government reports will conclude.

The Government Accountability Office, the investigative arm of Congress, and the Homeland Security Department’s inspector general are expected to release their findings soon on the performance of Transportation Security Administration screeners.

This finding will not surprise anyone who has flown recently. How does anyone expect competent security from screeners who don’t know the difference between books and books of matches? Only two books of matches are now allowed on flights; you can take as many reading books as you can carry.

The solution isn’t to privatize the screeners, just as the solution in 2001 wasn’t to make them federal employees. It’s a much more complex problem.

I wrote about it in Beyond Fear (pages 153-4):

No matter how much training they get, airport screeners routinely miss guns and knives packed in carry-on luggage. In part, that’s the result of human beings having developed the evolutionary survival skill of pattern matching: the ability to pick out patterns from masses of random visual data. Is that a ripe fruit on that tree? Is that a lion stalking quietly through the grass? We are so good at this that we see patterns in anything, even if they’re not really there: faces in inkblots, images in clouds, and trends in graphs of random data. Generating false positives helped us stay alive; maybe that wasn’t a lion that your ancestor saw, but it was better to be safe than sorry.

Unfortunately, that survival skill also has a failure mode. As talented as we are at detecting patterns in random data, we are equally terrible at detecting exceptions in uniform data. The quality-control inspector at Spacely Sprockets, staring at a production line filled with identical sprockets looking for the one that is different, can’t do it. The brain quickly concludes that all the sprockets are the same, so there’s no point paying attention. Each new sprocket confirms the pattern. By the time an anomalous sprocket rolls off the assembly line, the brain simply doesn’t notice it. This psychological problem has been identified in inspectors of all kinds; people can’t remain alert to rare events, so they slip by.

The tendency for humans to view similar items as identical makes it clear why airport X-ray screening is so difficult. Weapons in baggage are rare, and the people studying the X-rays simply lose the ability to see the gun or knife. (And, at least before 9/11, there was enormous pressure to keep the lines moving rather than double-check bags.) Steps have been put in place to try to deal with this problem: requiring the X-ray screeners to take frequent breaks, artificially imposing the image of a weapon onto a normal bag in the screening system as a test, slipping a bag with a weapon into the system so that screeners learn it can happen and must expect it. Unfortunately, the results have not been very good.

This is an area where the eventual solution will be a combination of machine and human intelligence. Machines excel at detecting exceptions in uniform data, so it makes sense to have them do the boring repetitive tasks, eliminating many, many bags while having a human sort out the final details. Think about the sprocket quality-control inspector: If he sees 10,000 negatives, he’s going to stop seeing the positives. But if an automatic system shows him only 100 negatives for every positive, there’s a greater chance he’ll see them.
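
Here is a toy version of that arithmetic. The bag counts and the pre-filter’s pass rate are invented, and the sketch optimistically assumes the machine never clears a real threat; the point is only how the ratio of negatives to positives changes for the human.

```python
# Toy arithmetic for the two-stage idea: a machine pre-filter clears the
# obviously fine bags, so the human sees far fewer negatives per positive.
# All numbers are invented for the example.
bags_per_day = 10_000
real_threats = 1
machine_flag_rate = 0.01   # fraction of clear bags the machine still sends to a human

flagged_clear = (bags_per_day - real_threats) * machine_flag_rate
flagged_threats = real_threats   # optimistic: the machine never clears a real threat

print(f"Human reviews {flagged_clear + flagged_threats:.0f} bags instead of {bags_per_day:,}")
print(f"Negatives per positive the human sees: {flagged_clear / flagged_threats:.0f}")
```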

Paying the screeners more will attract a smarter class of worker, but it won’t solve the problem.

Posted on April 19, 2005 at 9:22 AM

Terrorism False Positives

Security systems fail in two different ways. The first is the obvious one: they fail to detect, stop, catch, or whatever, the bad guys. The second is more common, and often more important: they wrongly detect, stop, catch, or whatever, an innocent person. This story is from the New Zealand Herald:

A New Zealand resident who sent $5000 to his ill uncle in India had the money frozen for nearly a month because his name matched that of several men on a terrorist watch list.

Because there are far more innocent people than guilty ones, this second type of error is far more common than the first type. Security is always a trade-off, and when you’re trading off positives and negatives, you have to look at these sorts of things.

Posted on January 8, 2005 at 8:00 AM
