Schneier on Security
A blog covering security and security technology.
April 14, 2011
Changing Incentives Creates Security Risks
One of the things I am writing about in my new book is how security equilibriums change. They often change because of technology, but they sometimes change because of incentives.
An interesting example of this is the recent scandal in the Washington, DC, public school system over teachers changing their students' test answers.
In the U.S., under the No Child Left Behind Act, students have to pass certain tests; otherwise, schools are penalized. In the District of Columbia, things went further. Michelle Rhee, chancellor of the public school system from 2007 to 2010, offered teachers $8,000 bonuses -- and threatened them with termination -- for improving test scores. Scores did increase significantly during the period, and the schools were held up as examples of how incentives affect teaching behavior.
It turns out that a lot of those score increases were faked. In addition to teaching students, teachers cheated on their students' tests by changing wrong answers to correct ones. That's how the cheating was discovered; researchers looked at the actual test papers and found more erasures than usual, and many more erasures from wrong answers to correct ones than could be explained by anything other than deliberate manipulation.
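The detection logic is worth spelling out: under honest conditions, wrong-to-right erasures are rare, so a classroom whose count is wildly out of line with the rest of the district stands out statistically. Here is a minimal sketch of that idea (invented data and a robust median-based outlier score, not the investigators' actual method):

```python
from statistics import median

def flag_anomalous_classrooms(wtr_counts, threshold=3.5):
    """Flag classrooms whose wrong-to-right (WTR) erasure counts are
    extreme outliers, using a median-based (robust) z-score so that a
    single cheating classroom cannot inflate the spread and hide itself."""
    values = list(wtr_counts.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return {}
    # 0.6745 rescales MAD to be comparable to a standard deviation
    return {room: round(0.6745 * (count - med) / mad, 2)
            for room, count in wtr_counts.items()
            if 0.6745 * (count - med) / mad > threshold}

# Synthetic data: most classrooms show a handful of WTR erasures;
# one shows far more than chance would predict.
counts = {"room_101": 4, "room_102": 6, "room_103": 5,
          "room_104": 3, "room_105": 41, "room_106": 5}
print(flag_anomalous_classrooms(counts))  # → {'room_105': 24.28}
```

The median-based score matters here: a plain z-score computed with the sample standard deviation would be dragged upward by the very outlier being hunted, which is exactly the kind of self-masking a cheater benefits from.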
Teachers were always able to manipulate their students' test answers, but before, there wasn't much incentive to do so. With Rhee's changes, there was a much greater incentive to cheat.
The point is that whatever security measures were in place to prevent teacher cheating before the financial incentives and threats of firing weren't sufficient to prevent teacher cheating afterwards. Because Rhee significantly increased the costs of cooperation (by threatening to fire teachers of poorly performing students) and increased the benefits of defection ($8,000), she created a security risk. And she should have increased security measures to restore balance to those incentives.
This is not isolated to DC. It has happened elsewhere as well.
Posted on April 14, 2011 at 6:36 AM
I believe this particular cheating problem is related to an interesting general one: you can use a score to measure something only if the score is not also used for some side purpose involving financial gain. My favorite example is chess ratings. These are obtained objectively by statistical analysis of win/loss results, and they can be highly accurate. Their accuracy has been compromised by tournament prizes. For example, if a tournament offers a desirable prize for the best result by a player rated under 2,000, it is well known that some players will get their rating down under 2,000 before the tournament in order to qualify for this prize.
tobias d. robison
Shouldn't that read " Rhee significantly INCREASED the costs of cooperation "
Rewards based systems work. I doubt all teachers were cheating, but there were probably a few more gaming the system than usual which would definitely raise suspicion.
What I gather from this is not that rewards based systems don't work, but that before implementing a rewards based system, perhaps take a step back, look at the big picture and strive to remove any incentive to cheat by implementing security measures that prevent cheating.
Bruce, you should read the book "Measuring and Managing Performance in Organizations" by Robert Austin (Dorset House, 1996, 240 pages), if you haven't already.
The book describes the pervasive malady of "measurement dysfunction": where the metrics used to measure performance go up, but the actual performance stays the same or worsens. The problem is inherent in the act of measurement. Greater "security" around the measurement process does not necessarily solve the problem; the security measures themselves may introduce yet another avenue for dysfunction.
For example, in the case of the D.C. tests, if a penalty for too many erasures on test papers had been added -- a security measure to prevent teacher cheating -- the teachers might have substituted clean, correctly-filled-out (or almost-correctly-filled-out) papers for the actual papers, rather than erasing the answers on the students' papers. This would lead to further security measures and further dysfunction in a never-ending arms race, just like the one in the computer security arena.
Read the book.
And I'm sure that many of those teachers justified their actions (to themselves, at least) by thinking that they were going to be punished for conditions that were not under their control.
(as in, no teacher can do much to fix up a student's dysfunctional home life or pervasive poverty, both of which have a huge negative impact on their learning)
I'm also sure that SOME will be hollering that it's all about greedy teachers ripping off some extra cash. As if public school teachers go into teaching to score some serious bux.
When employers impose unfair conditions on their employees, they shouldn't be surprised when those employees subvert the process.
This is where we're headed with our health care system, too. New metrics and incentives for measuring health care performance means hospitals and clinics of all sizes are scrambling to find ways to improve the numbers.
If improved education or health care is the path of least resistance to better numbers, we all benefit. Usually, it's not.
The efficiency gains of closely tracking and aligning reward to performance is counterbalanced by the expense of security, auditing, and other measures to prevent gaming the system.
The students should find a way to move just a little south. Since NCLB and similar initiatives, Virginia public schools now teach-to-the-test to the point that the world is now a series of multiple choice answers and canned essay responses.
Things are likely similar in:
a) All states south of the Mason-Dixon line.
b) All states east of the Mississippi Rover.
c) All states calling themselves a "Commonwealth."
d) a and c
e) None of the above.
Mississippi Rover is a dog currently living in Arkansas.
I've seen some amusing things in my time about incentives. One organization punished people who encountered a virus, so people hid infections. Conversely, another entity (thankfully not mine) rewarded people for reporting incidents, so people infected/compromised their own systems in order to report it and get the reward.
I argued with a federal reviewer who wanted password changes restricted except at the normal interval, so users would have to report disclosures. Nice in theory, but the fact is that if people had to risk embarrassment or reprimand, they'd clam up and hope for the best. The best bet was to let them change it, so that security was in their best interest.
I could go on and on about the boneheaded things places do that create perverse incentives. Thanks to Bruce for his "Psychology of Security" and "Prospect Theory" writings some years back that shifted my gears and made me a more reasonable auditor.
The book "Freakonomics" (possibly "Superfreakonomics" but I don't think so) had a chapter on teachers cheating in just that way. It was detected because certain sections of the test had identical answers for all students.
One thing that bothered the authors was that, in some cases, the teachers got answers wrong. It does make a certain amount of sense: if a teacher doesn't understand a subject well, the teacher is unlikely to teach it well enough for most students to pass the test, and so cheating will seem more necessary.
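That style of detection can be sketched directly: slide a window across the answer sheets and flag any block of answers that an implausibly large share of the class filled in identically. A toy illustration follows (invented data and thresholds; the actual algorithm described in Freakonomics is more elaborate, weighting hard questions and shared wrong answers more heavily):

```python
from collections import Counter

def suspicious_answer_blocks(answer_sheets, block_len=6, min_share=0.5):
    """Find answer blocks of length `block_len` that at least `min_share`
    of the class answered identically.  Honest students rarely agree on
    long runs of answers, especially runs containing wrong answers."""
    n = len(answer_sheets)
    flags = []
    for start in range(len(answer_sheets[0]) - block_len + 1):
        blocks = Counter(s[start:start + block_len] for s in answer_sheets)
        block, count = blocks.most_common(1)[0]
        if count / n >= min_share:
            flags.append((start, block, count))
    return flags

# Toy classroom of six students, eleven questions; four students share an
# identical six-answer run starting at question 5.
sheets = [
    "ABCDBDCABDA", "BACCBDCABDC", "ACBABDCABDB",
    "CABBBDCABDD", "ABDCACDBCAB", "BCADCBADCBA",
]
for start, block, count in suspicious_answer_blocks(sheets, 6, 0.6):
    print(f"questions {start + 1}-{start + 6}: {count} students answered {block}")
    # → questions 5-10: 4 students answered BDCABD
```

Like the erasure analysis, this works only as long as cheaters don't know the test is being run; a teacher aware of it could simply vary which answers she changed on each sheet.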
And this cheating event is merely a small subplot of the actual systemic failure, namely, that measuring the scores produced on a multiple-choice test does not actually measure how well the students are being taught.
Moving from "teach solely for the purpose of scoring on a mandatory examination" to "modify the answer sheets" is just moving further along on the spectrum of dishonesty.
Really, I think this just raises the need for better test proctoring. I don't see how someone who sees cheating like this can conclude "Well we should just not test then."
If you want to see an incentivized system in action, look no further than IT security.
First off, none of the metrics used actually mean anything in a useful way. "We detected 290,012 attempted break-ins last month" tells you little, because you can only compare it to previous months, and since the detection heuristics come out of AV software that is continually changing...
However, the security software suppliers have to sell software, so it is only in their interest to make available what their direct customers want, not what those customers' managers want.
Further, the managers don't actually want security; they want compliance, to avoid consequential losses.
Therefore few people in IT security actually want useful security metrics. What they really want is "smoke and mirrors," so that they can look like they are achieving some degree of security, when in all probability they are actually seeking highly malleable compliance metrics they can show to an auditor to get the required level of ticks on the auditor's checklist.
Which raises two questions:
1. Who's kidding whom?
2. Does either party care?
Testing measures how well a student performs on a test, not how well the student has actually learned the material. Through my post-secondary education, I know quite well that I learned far less of the coursework in classes that relied on testing as opposed to classes that relied on larger take-home projects. Bullshitting through a multiple-choice test is not usually a skill that translates well to one's post-education career.
The fact that we've turned the nation's primary and secondary education systems into relying solely on testing (without even really thinking about what it is that we're measuring!) is the systemic failure; this incident is just a symptom of it.
"Bullshitting through a multiple-choice test is not usually a skill that translates well to one's post-education career."
Unfortunately, I find that being able to confidently select an answer from a menu of options via nothing more than pure bullshit is in high demand in industry these days.
Is this a security-design problem or an incentive-design problem? If a revised system makes it attractive for many formerly within-the-rules actors to cheat, it seems to me that that's a sign the revision may well be ill-conceived.
"The book describes the pervasive malady of "measurement dysfunction": where the metrics used to measure performance go up, but the actual performance stays the same or worsens."
Or, as a project manager I used to work with said, know exactly what you are measuring because that is what you will get.
In the example above, they aren't measuring the knowledge of the students.
They are paying teachers for X correct checks on a page. If the teacher "thinks outside of the box" then the teacher will realize that student knowledge is not a factor. All that matters is having Y bodies signing the pages and making sure that X correct checks are recorded - without any outside party becoming aware of the change.
Even South Park did an episode on this.
This just sounds like old-fashioned bribery/threats to me.
Nothing security-related here, unless the person bribed is performing a security function.
The security measure against exam cheating, analyzing unusual erasure patterns, seems to have been more or less in place but not known by the potential offenders. The teachers who cheated didn't know their erasure patterns could be analyzed and catch them cheating.
Agreed that incentives change security equilibriums, but in this test cheating case it is also worth noting that an existing security measure could have prevented the cheating if it had simply been implemented in a public rather than in a hidden mode.
Similarly, surveillance cameras, traffic speed radars, sports doping tests, and other security measures will have very different effects depending on whether they are operated in an open or in a hidden mode.
Seems this may be true: Security measures usually are more effective when they are open rather than hidden.
@j: ``Testing measures how well a student performs on a test, not how well the student has actually learned the material.''
Many talk about how exam-paper modifications are gaming the testing system. Are you having the temerity to suggest that NCLB testing is just gaming the educational system? :-/
As a fascinating aside, those thinking about the actual teaching methodology that "teach to the test" promotes might enjoy revisiting Feynman's account of science education in Brazil, as recounted in "Surely You're Joking, Mr. Feynman!"
Grading students fairly is hard. It requires both high personal integrity and high competence in the subject matter. It is a judgement call, and there is no effective way around that. Even in written exams with predefined grading guidelines, a crafty examiner can at least have a +/-20% impact on individual grades and can increase the scores of some students while decreasing that of others.
The one exception is multiple choice, but these examinations are basically worthless, except for very basic, purely knowledge oriented pass/fail exams.
I don't really see a practical solution for this. Of course, in theory one could replace all teachers with people of said high integrity and skill, but this falls short in practice a) because there are not enough of those and b) because teaching is far too unattractive to draw in many of those who do exist. For me, it is something I do in addition to my normal work, it is at university level, I have generous freedom in what and how I teach, and it is very well compensated. With badly paid and badly educated teachers, who in addition often do not have significant insight into how the real world works, let alone the sciences, it is not a question of who gets left behind, but of who manages to acquire important insights and skills _despite_ the teachers.
I can also say that this is a long-standing problem and it is a global one. Nobody seems to be doing much better than the others, with a few exceptions in some small countries. It is doubtful their solutions scale or transfer at all. It may be more due to a different population mind-set than due to better teaching, along the lines of "we are small, so we have to be better than others".
Anyways, with the demand the modern world places upon the individual not only in knowledge, but especially in skills and insights, I have the impression that the everyday complexity of the world has far outstripped our capacity for teaching how to handle it to the next generation, at least until some rather drastic, risky, fundamental and not too clear changes are made. All the models for society tried on a larger scale so far clearly do not cut it.
"The teachers who cheated didn't know their erasure patterns could be analyzed and catch them cheating."
The cheating went on for nearly three years before being discovered. The schools with multiple cheating teachers received awards, and the principals and teachers got cash bonuses. Thus, there was a disincentive for the administrators to detect cheating.
There was a case a few years ago (I cannot remember the location) where teachers cheated by providing hints during standardized exams. The teachers promised rewards to the students if they got good scores. Everyone in that scenario had incentives to cheat: the students got rewards, the teachers got to keep their jobs, the principal was rewarded for "better" outcomes, and the school district received additional funding based on the improved scores.
An aside: I find it interesting that teach and cheat are anagrams.
@Clive Robinson: Unfortunately you are right on the mark. Some people will even resort to complete nonsense, such as counting scan packets, to get the largest numbers they can.
Another example of a general principle: Humans don't like to be controlled. But they do like to control other humans.
Pure primate behavior.
And the reason why you have ever increasing security measures (and "security theater") but always ending up with ever decreasing "security".
Which yet again proves: There is no security. Suck it up.
Both physics and IT speak of an "observer effect," in which the very act of observing an item can affect the item itself.
This is obviously a very broad issue that goes well beyond NCLB or education or security.
Imagine what Will Geddes of ICP Group could do with THAT.
One thing about the many state tests is that there is no incentive for the students to meet a standard. This is good because the kids have little incentive to cheat. I think it should also be used as a measurement of cheating by students: if a student does much better on tests that do affect promotion, there is a good chance this is more than "just trying harder". As far as I know, such things are not tracked.
What is interesting is that all tests affect a teacher's status. Even if teachers do not test their own students, which generally does not happen, the performance of the school figures significantly in bonuses. As such there is almost no incentive for a teacher to proctor a test beyond the minimums required to avoid professional misconduct charges. This system, which on balance provides neutral incentives for proctoring, has always seemed quite backwards to me. Of course hiring impartial proctors would be prohibitively expensive, so we are left with quite unreliable results. Like a GPA, the data would be best used to rank students within a school, or to measure average growth over a number of years, but as a statewide metric its usefulness is quite in doubt.
Bruce--looking forward to reading the new book.
@JEB "Both physics and IT speak of an "observer effect," in which the very act of observing an item can affect the item itself. "
In FBI funded domestic terror observations the item above being a targeted citizen, suffers the effects of long term harassment/stalking.
Incentives, whether they come as a reward or as a punishment, will cause some to work harder, and some to work smarter. In general, they are a fine indicator of where you are in the food chain. Foot soldiers more often will be threatened with loss of their livelihood whereas officers will be rewarded with bonuses and other benefits for making their targets.
The one side-effect that however applies to either category is that both will start working the system, gradually losing perspective of the bigger picture by focussing on personal targets, gains or losses only. The higher the incentive, the more likely folks will start exploring the boundaries of the system, taking absurd directives for granted, abusing gaps and loopholes, eventually subverting or perverting it. Sometimes, the result is that the original purpose of the system is totally defeated. In some cases, it can lead to a full collapse.
Just a few examples: some years ago, the town council of where I live created a service to keep cars out of the centre in order to make it a more lively neighbourhood. Nothing wrong with that. A lot of parking space disappeared, paid parking was introduced everywhere, and at very high rates. After a public outcry, locals could apply for a resident card allowing them to park for free in their zone. Control was outsourced to a commercial 3rd party employing primarily disenfranchised youngsters or otherwise unemployable folks to do the ticketing of offenders. As they have their quota, they're doing a very diligent job ticketing kinda everybody, including locals with valid parking cards, claiming "oversight" when somebody complains. The outcome of something that started as a great idea is that pubs, restaurants and other businesses have been disappearing fast because their former patrons are now taking their business to places where they are not being presented with either a 20 euro parking bill or a 30 euro fine after a shopping spree or a night out.
The same probably goes for TSA employees patting down senior citizens and even six-year-old girls ( http://travel.usatoday.com/flights/post/2011/04/... ). The only reason I can think of for such disgusting behaviour is that they will probably lose their jobs if they don't make certain "all age" targets. In any other place, such an action would most certainly get you lynched.
At the other side of the spectrum, I firmly believe that - together with lack of regulation - the outrageous incentive model in place at financial institutions was what in essence caused the 2008 financial crisis. And from what I'm seeing today, they still haven't learned their lesson.
Does anyone else here see additional, if a bit tangential, evidence [I don't think it is needed but some might] favoring hand marked paper ballots?
Your first commenter was talking about Goodhart's Law:
`When a measure becomes a target, it ceases to be a good measure.'
Jeffrey Pfeffer has written extensively about dysfunctional incentives.
Isn't this the same dynamic that happened in the housing bubble/mortgage crisis? At some point in the past, the organization selling you the mortgage was also holding the risk of default, and so had no incentive to help you lie on the application to get a loan. Later, the organization selling you the mortgage became completely independent of the organization holding the risk of default, so there was an incentive to help marginal customers lie on the application forms in various ways--the mortgage broker got the commission, and had little reason to care if the person eventually defaulted on the mortgage.
@John Kelsey: When I was on a contract for a mortgage company, we had entries in the database for "stated income" and "stated assets", and they were not uniformly "N" like they should have been. What the companies were doing then was putting low-quality loans into bundles and selling off tranches of the payments. Besides, with housing prices continuing to go up rapidly, foreclosure was inexpensive enough to take risks with.
If the big companies hadn't thought they'd profit by shuffling loans and finances around even with bad loans, they wouldn't have bought the loans from the mortgage companies, and the mortgage companies wouldn't have issued loans they couldn't sell.
It wasn't a matter of incentives for cheating on mortgage applications, since the results of such cheating were knowingly bought up by financial giants. The incentives were to issue as many loans as possible, regardless of risk.
Just another aspect of the spiral to the bottom.
The whole testing regime is part of the de-professionalization of teaching. As the satisfaction to be derived from teaching is systematically squeezed out of the system, what else do you expect?
Teachers are being told that they are no longer professionals, but a commodity, and a cheap one at that. They are threatened with firing, to be replaced by a cheap, untrained, nonunion nonprofessional. What more do you need to teach to the test?
So why shouldn't they cheat? Either way they are scheduled to be fired. At least by cheating they have a few more bucks before they are dumped.
The system once worked by using teachers' internal psychological drive to do the right thing (aka pride). It's the strength of that drive that makes civilization possible. Trying to replace it with threats and "security measures" is a one-way trip downward.
In the fourth grade (back in '98 or so), on a state standardized year-end essay exam, the topic I picked was "How a principal can improve standardized test scores." I advocated that they coach their teachers in proper cheating procedures, and gave some advice on how to avoid detection.
The score for the essay was a number 1-4, divided into 3 categories, plus an overall score. I got 4 on each category, and 3 on the overall. I also won the UIL persuasive writing competition the next year, so the 3 was almost certainly just moral outrage by the grader. My mother and I found that very amusing.
The point is that whatever security measures were in place to prevent teacher cheating before the financial incentives and threats of firing wasn't sufficient to prevent teacher cheating afterwards. Because Rhee significantly increased the costs of cooperation ...she created a security risk.
I wonder if in your new book you could do a bit on information assurance, and the mechanics of data mining/domestic observations with the telco count stuff that verint and others use? There is more of a security risk created by the amount of money NSA/FBI throws at watching false positives without audits that actually serves to corrupt the data, and creates a security risk (watching the wrong folks, having Arabic contractors who cheat you, having insiders who cheat you, opportunity cost of not watching real ones). I imagine that insiders who have to put count systems in context know how to cheat the system. In hard economic times, jobs are power...the contractors are big companies...software is fallible.
Why does this whole thing somehow remind me of the Vietnam body counts?
There is a myth of management that seems to require some form of numerical measurement even if what is being enumerated bears no relationship to what is the desired result.
The *BEST SOLUTION* is to require better training and provide encouragement for these teachers.
After all, how many of us would be happy and diligent grading tests week after week, managing 20-30 screaming kids for 6+ hours a day, making a moderate salary with very little growth prospects and zero income from company stocks and options??
Another good book on the subject of the distorting impact of incentives, and how they are gamed, is "Punished by Rewards" by Alfie Kohn.
Teachers fiddling with students' test forms to boost their bonuses is small potatoes compared to CEOs of the likes of AIG taking on risks that will eventually bankrupt their companies but produce outstanding returns and financial gains for the executive in question. Since investigating and jailing those crooks seems to be inconceivable for our venal politicians, the surprising thing is that it was not more widespread.
Dr. T: "The cheating went on for nearly three years before being discovered. The schools with multiple cheating teachers received awards, and the principals and teachers got cash bonuses. Thus, there was a disincentive for the administrators to detect cheating."
IIRC (too lazy to go back and read the article), the cheating was discovered quite quickly; the company had their scanners programmed to count erasures and their direction (wrong to right, right to wrong, wrong to wrong). And they were notifying Ms. Rhee's office.
She, for some odd reason, didn't pay any attention to it until it was publicized, and she could no longer deny it.
In short, she is not innocent here. And again IIRC, her prior record before DC seems to disappear on actual examination.
The authors of Freakonomics covered this same phenomenon in their book.
More recently, we see this type of thing happening in relation to cigarette taxes. Local governments started raising cigarette taxes out of a desire to dis-incentivize smoking. It wasn't long before they continued raising taxes out of a desire to fill holes in their un-balanced budgets.
But at some point, the cigarette taxes become so high that a new, black-market economy appears: people who smuggle and sell cigarettes to buyers wishing to avoid the taxes.
AIUI, NCLB requires a perpetual increase in school grades, on pain of yanking major funds. That's a recipe for forcing cheating, among other ills.
Schneier.com is a personal website. Opinions expressed are not necessarily those of BT.