Schneier on Security
A blog covering security and security technology.
January 24, 2007
NSA Hiring Data Miners
Certainly looks that way:
The Algorithm Developer will work with massive amounts of inter-related data and develop and implement algorithms to search, sort and find patterns and hidden relationships in the data. The preferred candidate would be required to be able to work closely with Analysts to develop Rapid Operational Prototypes. The candidate would have the availability of existing algorithms as a model to begin.
Posted on January 24, 2007 at 2:57 PM
• 21 Comments
This kind of mining sounds dangerous. Looks like we'll need more canaries.
Unfortunately they'll be competing with Google for people in this field. (Or taking Google rejects.)
I'm pretty sure I can guess which one has the better working conditions, salary, and stock options....
The hard part might be finding a PhD in Mathematics near Ft. Meade that doesn't already work for the NSA. Or a current employee who wants to work on the other side of the fence.
Interesting use of capitalization in the ad, by the way.
In huge datasets you will probably find just about any relationship you care to look for. I wonder if a good understanding of the scientific method is a job requirement; or possibly just a working knowledge of the Dan Brown books :-)
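To make the point about huge datasets concrete, here is a minimal sketch that fills a table with pure random noise and counts how many column pairs still look strongly correlated. The function names, the column and row counts, and the 0.5 threshold are all illustrative assumptions, not anything taken from the job ad.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious_pairs(num_columns=100, num_rows=20, threshold=0.5, seed=0):
    """Count column pairs whose |correlation| exceeds threshold.

    Every column is independent Gaussian noise, so any "relationship"
    found here is spurious by construction. All parameters are
    hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    columns = [
        [rng.gauss(0.0, 1.0) for _ in range(num_rows)]
        for _ in range(num_columns)
    ]
    total = 0
    strong = 0
    for a, b in itertools.combinations(columns, 2):
        total += 1
        if abs(pearson(a, b)) > threshold:
            strong += 1
    return strong, total


if __name__ == "__main__":
    strong, total = count_spurious_pairs()
    print(f"{strong} of {total} noise-only pairs have |r| > 0.5")
```

With only 20 rows per column, a few percent of the pairs would be expected to clear the threshold by chance alone, and the absolute number of such "hidden relationships" grows with the number of columns you are willing to compare.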
More importantly, Bruce: why are you perusing the want ads? ;-)
If I were the NSA, I'd be advertising for all sorts of positions I don't actually need so I could throw people off of what I'm really up to.
"Help Wanted: Ph.D.s in quantum physics with published work in string theory."
"Help Wanted: Persons familiar with managing bars, clubs, and other commercial gathering places."
"Help Wanted: Psychics with at least three years' experience in crystal ball reading."
> "work closely with Analysts to develop Rapid Operational Prototypes"
This means anyone can tap you to extract data against some vague criteria. You are the search engine. Note that Operational and Prototype would never be used together in a sane environment.
> "availability of existing algorithms"
This means you have a disk full of SQL scripts that the previous hack wrote before finding something better to do.
The point is that datamining is the worst case scenario of government surveillance.
And if they're doing it without oversight, it's doubly bad and needs to be examined.
Considering that they haven't even explained the NSA domestic wiretap program to Jay Rockefeller's satisfaction, it's clear that things have gotten out of hand.
This is called "employee referral program" :)
Bruce Schneier doesn't have a PhD degree. Heh.
Perhaps they need a data miner just to find the right person for a different job.
@Reader X "why are you perusing the want ads?"
For the same reason people are counting the number of pizzas delivered at the Pentagon: intelligence gathering.
@asdf: You forgot Bruce Schneier Fact #682183: Bruce has every degree available in every field of study. And then some.
"Unfortunately they'll be competing with Google for people in this field. (Or taking Google rejects.)
I'm pretty sure I can guess which one has the better working conditions salary and stock options.... "
I just have the vision of the look on the guy's face in Mountain View the day he goes into the boss's office with two unknown suits sitting there...
"Joe, have you ever heard of the term 'Employee Leasing'?"
It was not until I'd had a graduate course in datamining that I truly appreciated Bruce's criticisms thereof. Datamining can only yield useful statistics, and can only benefit on a statistical level. It might, for example, help a retailer decide to cluster various items in a way that boosts candy bar revenue by 5%. It will not allow the retailer to prevent Alice from purchasing the cheap dog food and nothing more; it will not ensure Bob will grab a 6-pack of beer with his frozen pizza. It might increase early detection of various cancers via genetic screening, reducing the overall cancer fatality rate by 10%; it will not say, at birth, which of Carl, Dave, or Earl will develop cancer (although, based on a vast collection of data including true positives and true negatives, the cost of screening, the cost of false positives, and the cost of false negatives, datamining might say that after the age of 30, Carl should be screened every 3 years, Dave every 10 years, and Earl every 20 years).
The NSA does not have nearly the data needed to even make that kind of prediction. Plus, I'm guessing the NSA isn't looking to datamining for statistical reductions in terror-related fatalities; they want it to sift through the data of 300 million Americans and say, "These 6 guys are plotting to blow up the Sears Tower!" I'm also guessing they will pin all of their inevitable failures on, you guessed it, insufficient data due to cumbersome judicial oversight processes.
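The arithmetic behind that objection can be sketched in a few lines. The population of 300 million and the 6 plotters come from the comment above; the detection rate and false positive rate are purely hypothetical and deliberately generous, and the function name is my own.

```python
def screening_outcome(population, true_targets, sensitivity, false_positive_rate):
    """Expected results of flagging individuals in a large population.

    sensitivity: fraction of real targets the classifier flags.
    false_positive_rate: fraction of innocent people it flags anyway.
    Returns (expected true hits, expected false alarms, precision).
    """
    innocents = population - true_targets
    true_hits = true_targets * sensitivity
    false_alarms = innocents * false_positive_rate
    precision = true_hits / (true_hits + false_alarms)
    return true_hits, false_alarms, precision


if __name__ == "__main__":
    # Assumed rates: 99% detection, 0.1% false positives (both hypothetical).
    hits, alarms, precision = screening_outcome(
        population=300_000_000,
        true_targets=6,
        sensitivity=0.99,
        false_positive_rate=0.001,
    )
    print(f"expected true hits:    {hits:.2f}")
    print(f"expected false alarms: {alarms:,.0f}")
    print(f"chance a flagged person is a real target: {precision:.6%}")
```

Even under these optimistic assumptions, the roughly 6 real hits are buried under about 300,000 false alarms, so nearly every flagged person is innocent. That is the gap between a statistical benefit and picking out specific individuals.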
>The preferred candidate would be required to be able to work closely with Analysts to develop Rapid Operational Prototypes.
'cause hey, we don't care if you get it right, just do it fast ;)
> and stock options....
An issue I'd like to see addressed in public is the likelihood that our ever-expanding surveillance society, coupled with effective data-mining, will undermine the foundations of international business and investing.
If our more clandestine govt. entities have resorted to triangulating with a sworn enemy to fund revolutionary ideals (Iran-Contra), what is the likelihood that these same organizations, as well as organizations with government-sanctioned eavesdropping and data-mining abilities (AT&T, AOL, etc.), will not tap into this as a source of funds? Imagine the ease with which someone with this 'ultimate insider' access can generate funds!
The net result will be a lower return on capital for the rest of the investing community and an incredible temptation for the well-connected.
I smell a scandal brewing - but still, it may provide an incentive that surpasses even Google stock-options!
Smart. Find a needle in a filtered haystack,
before that needle sticks someone.
Real world intel = data.
Real world practices in % to data = heuristics.
Real time and historical intel = filtered haystack, with above.
Testing data and risk management priorities with auto communications = programs.
What else would you expect from the NSA?
Popcorn and butter machines?
Today, rapid deployment can stop a massive problem, if only
given the program.
I'd be disappointed if they didn't have this
down to a science many years ago.
Whatever happened to the idea, be 15 years ahead of industry?
Just my $.02
Schneier.com is a personal website. Opinions expressed are not necessarily those of BT.