The Problems with Data Mining
Great op-ed in The New York Times on why the NSA’s data mining efforts won’t work, by Jonathan Farley, math professor at Harvard.
The simplest reason is that we’re all connected. Not in the Haight-Ashbury/Timothy Leary/late-period Beatles kind of way, but in the sense of the Kevin Bacon game. The sociologist Stanley Milgram made this clear in the 1960’s when he took pairs of people unknown to each other, separated by a continent, and asked one of the pair to send a package to the other—but only by passing the package to a person he knew, who could then send the package only to someone he knew, and so on. On average, it took only six mailings—the famous six degrees of separation—for the package to reach its intended destination.
Looked at this way, President Bush is only a few steps away from Osama bin Laden (in the 1970’s he ran a company partly financed by the American representative for one of the Qaeda leader’s brothers). And terrorist hermits like the Unabomber are connected to only a very few people. So much for finding the guilty by association.
A second problem with the spy agency’s apparent methodology lies in the way terrorist groups operate and what scientists call the “strength of weak ties.” As the military scientist Robert Spulak has described it to me, you might not see your college roommate for 10 years, but if he were to call you up and ask to stay in your apartment, you’d let him. This is the principle under which sleeper cells operate: there is no communication for years. Thus for the most dangerous threats, the links between nodes that the agency is looking for simply might not exist.
(This, by him, is also worth reading.)
Leave a comment