Schneier on Security
A blog covering security and security technology.
June 15, 2011
The Non-Anonymity of Bubble Forms
It turns out that "fill-in-the-bubble" forms are not so anonymous.
Posted on June 15, 2011 at 6:22 AM
• 28 Comments
Interesting. I do think more research is needed though - the test conditions appear (on first pass) a bit simplistic and 92 papers is a very small sample size for this.
However, the results as shown are worth considering.
It would be good to see a test of several papers handed in by a few hundred people to see if the test can track between individuals and papers. At the moment it just seems like a good way of making sure only one person filled a form in.
I have always assumed that "anonymous" forms have a unique identifier printed in UV or other invisible ink.
In order to identify who filled out a certain ballot, you would have to have a known ballot on hand for comparison purposes. That is like having a known set of fingerprints on record. How many crimes are unsolved because the fingerprints can't be matched? How many ballots can't be identified because they don't have a known ballot for matching?
More security theater? Someone wants to establish a basis for eliminating paper ballots?
I wonder how big a deal this is. 51% is only slightly better than a coin flip. Yes, the number goes up when you say it is "one of these 5 or 10" people, but depending on what you are using this for, false positives will creep up.
Maybe this can be just another tool in identifying someone, used alongside but not instead of more reliable methods.
@anon: This isn't better than a coin flip. Coin flips are analogous when there are two choices. There were 92 choices in this sample set.
Better go read the article before commenting again.
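To put a number on the point about chance: with 92 candidates, a blind guess is right about 1 time in 92, not 1 in 2. A minimal Python sketch follows. The function name `chance_accuracy` is made up for illustration, and the 51% and 92 figures are simply the ones quoted in the comments above.

```python
from fractions import Fraction


def chance_accuracy(num_candidates: int, top_k: int = 1) -> Fraction:
    """Probability that a uniform random guess lands the true person in the top k."""
    if num_candidates <= 0 or not 1 <= top_k <= num_candidates:
        raise ValueError("need 1 <= top_k <= num_candidates")
    return Fraction(top_k, num_candidates)


# Figures taken from the comments above: 92 respondents, 51% top-1 accuracy.
baseline = chance_accuracy(92)   # random guessing among 92 people
reported = Fraction(51, 100)     # accuracy quoted in the thread
lift = reported / baseline       # how many times better than chance

print(f"chance: {float(baseline):.4f}, reported: {float(reported):.2f}, "
      f"lift: {float(lift):.1f}x")
```

The same helper covers the "one of these 5 or 10" case: `chance_accuracy(92, 10)` is the baseline for a top-10 guess, which is the fair comparison for any top-10 figure.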
@david: Read the article -- it's about "fingerprinting" the manner in which someone fills in the bubbles, not about any secret identifiers on the forms. Plus, you would need a mapping of secret-number-to-person to make use of that serial code anyway. And some ballots already have /visible/ serial numbers, so it's not of any use there to also have a secret serial number.
@kashmarek: Elections aren't the only place where this research is useful. Let's say you're a large company and the workers are thinking of unionizing. If upon employment you have them fill out a bubble form for some "process" reason, then later have them take an "anonymous" survey about unionizing, you know who to fire to get the vote to fail.
This should remind all of us to vary the way we "fill in the blank" when completing these types of forms.
In some "bubbles", fill it in with left-to-right (horizontal) strokes, while in others use an up-and-down (vertical) stroke. In others, use a spiral movement of your pencil.
Bubble forms == multiple choice
As a result it is hardly surprising if you have ever seen a few of these multiple choice tests used in schools, colleges or universities.
Usually at a glance you can tell left and right handers, which means that you have one set of approximately 20% of the population (those considered sinister, gauche, cack-handed, lefties) and the 80% that includes those who view them as such.
As a bio-metric I suspect it has a component dependent on the personality of the person.
Which brings us around to that wonderful pseudoscience of handwriting analysis as a determiner of personality and loyalty etc...
This could lead to a paper or two I suspect ;)
As a college prof I had to deal with a lot of bubble sheets and I can say it's absolutely true that you can identify the filler of the form from the way they fill it out.
Some students filled out the bubbles neatly and completely, others only the center of the bubble. Some went up, sideways or even diagonally with their strokes. Some rested their pencils on the paper, creating extraneous marks. And that doesn't even cover erasing, canceling out and such evidence of indecision.
The point is that over many tests and several years students were consistent in the way they filled out these forms.
I think you can tell a lot more from the form than just who filled it out. You may be able to identify personality traits.
Now there's a research project!
As the first commenter pointed out, the real question is how this performs on a much larger data set. They tested on a tiny sample of 92 users. Seems unlikely that there is enough variation in bubble filling that their simple technique would work on a data set of 100k users, which is what would have to be established to make claims about being able to identify how individuals voted.
>> David: I have always assumed that "anonymous" forms have a unique identifier printed in UV or other invisible ink.
If you take a look at mailed ballots and survey forms that have these bubbles, chances are better than not that they will have a little DataMatrix barcode on them, as well as tiny little numbers.
>> pencil pusher: This should remind all of us to vary the way we "fill in the blank" when completing these types of forms.
I think it would be more useful for all respondents to agree on a consistent method for filling in the bubbles, or at least make the analysis harder by filling them in perfectly.
The example on the left of "3" shows almost what the instructions tell you NOT to do: no scribbles, crosses or ticks. Use a #2 pencil to darken the whole circle.
As for Scamtron (sic), shown in the article graphic, there is really only one way that people commonly fill these (left and right, back and forth), so I propose there is a lot less variation in these.
@Greg Linden: The paper says election ballots aren't all mixed together, so you only need to be identified among the few hundred people that voted at the same elementary school.
The only practical application I see for this is identifying cheating teachers. It should be pretty easy to tell which exams have been altered by a second bubble filler correcting some answers.
"I think you can tell a lot more from the form than just who filled it out. You may be able to identify personality traits.
Now there's a research project!"
Read my post before, with:
"As a bio-metric I suspect it has a component dependent on the personality of the person.
Which brings us around to that wonderful pseudoscience of handwriting analysis as a determiner of personality and loyalty etc...
This could lead to a paper or two I suspect ;)"
It might only be three minutes before but that's what counts ;)
Funny stuff this "morphic resonance" 8)
This research seems obvious to me.
Filling out a bubble form is a very specific behavior. People are obviously going to do it differently based on their approach to pretty much everything. As EMK says above, everyone is going to do it pretty much differently and consistently, and therefore identifiably.
If you can identify a Morse coder by his "fist", I'm not surprised you can identify someone by his "bubbling".
As for the issue of needing a sample, well, in most cases people who fill out a bubble form, anonymous or not, are generally part of a small group doing this task. The members of that group are usually known, be they members of a test or a class or all those on a given day or whatever.
In fact, I wouldn't be surprised if more research allowed experts to use the characteristics of the bubble filling to predict other behavioral or character traits of the person, narrowing it down even more, similar to handwriting analysis.
All of this is pretty obvious. What isn't obvious to me is why anyone would really care in most situations where bubble forms are used - except in the case Captain Obvious cites.
@kashmarek, @No One
From my experience (as an election worker in the United States), every ballot I've seen has a tear-away serial number.
The practice I was trained in had the serial number on the ballot until the moment before insertion in the counting machine.
The total number of ballots issued is counted, as are the total number of torn-away serial numbers. Also, the total number of ballots remaining is counted. Part of the confirmation of accurate-voting is accounting for the location of every ballot issued to the precinct. (They are sorted into categories Counted/Spoiled/Provisional/Unused.)
However, this news may mean that patterns could be seen in the absence of serial-numbers.
I don't know how usable such pattern info is in the realm of checking election results. A single person's vote may not be recovered, but correlating probabilities with suspected preferences of the individual may give some results.
This method may also give more evidence into cases of suspected fraud. If the vote-counters discover several thousand absentee ballots at an unusual location, this research might give clues into how many distinct ballot-marking styles were used in generating the ballots.
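The bookkeeping described in this comment amounts to a reconciliation check, which might be sketched roughly as below. This is an illustration only, not any jurisdiction's actual procedure: the `PrecinctTally` type and its field names are hypothetical, and the rule that stubs should equal counted ballots is an assumption based on the stub being torn off just before the ballot enters the counting machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrecinctTally:
    """Hypothetical end-of-day counts for one precinct."""
    delivered: int    # ballots issued to the precinct
    counted: int      # ballots fed into the counting machine
    spoiled: int
    provisional: int
    unused: int
    stubs: int        # torn-away serial-number stubs collected


def reconcile(tally: PrecinctTally) -> list[str]:
    """Return a list of discrepancies; an empty list means the counts agree."""
    problems = []
    accounted = (tally.counted + tally.spoiled
                 + tally.provisional + tally.unused)
    if accounted != tally.delivered:
        problems.append(
            f"categories sum to {accounted}, but {tally.delivered} were delivered"
        )
    # Assumption: one stub is torn off per ballot inserted into the machine.
    if tally.stubs != tally.counted:
        problems.append(
            f"{tally.stubs} stubs collected, but {tally.counted} ballots counted"
        )
    return problems


good = PrecinctTally(delivered=500, counted=412, spoiled=6,
                     provisional=9, unused=73, stubs=412)
print(reconcile(good))  # no discrepancies for this made-up tally
```

Nothing here looks at how a ballot was marked; the check only confirms that every ballot issued to the precinct ends up in exactly one category.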
This is scary. Since the completed ballots are inserted into the ballot box in a chronological order, couldn't that be compared to the "sign in" sheet, and get an approximate time an individual voter arrived, and the position of his ballot within the stack of ballots in the ballot box?
Wouldn't be surprised though if there weren't some papers published on this in psychology or education.
Isn't this kinda obvious? No influence on elections over here. Most parts of the country vote electronically nowadays. Which of course poses issues of a different nature.
The research is interesting in an edutainment sort of way but I think the authors and some of the replies overlook a significant fact. It is /trivial/ to change the way one fills in a bubble. It's not as if filling in the bubble is a complicated process like handwriting. I suspect that most people fill in bubbles the same way because it's a mostly unconscious process. Make the process conscious again and where will that leave this research... in the dumpster.
Surprised no one brought up the alternative that was tried in, say, the 2000 POTUS election in the State of Florida: Punch-card ballots, using a punch tool provided in the booth.
Let's assume that the punch tools are all identical enough that tiny non-uniformities in the paper comprising the ballot would offset trying to ID the voter's booth by the magnified photo of the punch hole. People try to punch the middle of the target; some will be a little offset, but I suspect, randomly so. Unfortunately, some didn't punch completely through, leading to the famous "hanging chad" controversy (and numerous Viagra jokes).
Despite the ensuing election controversy, perhaps such punch-cards might be a better way to avoid the issues presented here.
Me, I always filled in the entire bubble, because we were warned that not doing so could invalidate the answer, leading to a lower test score, as would marking outside the lines. Let's hope the others read the same directions. In that case, you pretty much have to go left-and-right (for an oval target), perhaps a little up-and-down, and then some coloring around the edges to catch any missed spots.
The left sample shown in the article would presumably not be valid, and so those who heeded the instructions wouldn't have done that. The right sample breaks the "no marks outside" rule. I wonder what the conditions were in their survey, as far as how explicitly the subjects were told to mark the answers, and how carefully they were told they needed to do so. If they were less precise than most bubble-scored tests I've taken, then they've manufactured a straw man.
For more on the 2000 POTUS controversy in Florida, click my sig below. (No, it's not a trick, and it's safe. No POC or anything.)
In my jurisdiction, this would not be effective at identifying ballots, even though we use bubble forms. They use the old punch card machines, but now the holes are a bit bigger. There's a special pen designed to be used like the hole punch on the punch cards that marks the whole bubble with a single punch. I assume it was designed to ensure the ballots were marked neatly and completely with no room for ambiguity, but a side effect is that they're also marked uniformly. There's no room for individuality, including individual, identifiable marking styles.
If you look back at both posts the similarity is very high, and noticeably different to all the other posts to the same page, which actually is quite significant.
Based on the likely assumption that you did not have time to read my post, paraphrase it and post yours, it means that both were original thoughts, which in turn shows that accidental plagiarism does indeed happen.
"Wouldn't be surprised though if there weren't some papers published on this in psychology or education."
Well, as long as they put us both in the references I won't shout too loudly ;)
However this is actually something that has come up several times with this blog and my posts. I have extrapolated forward and made a couple of lines comment on the possibility of using the idea in novel ways, only to find down the road a PhD thesis has been written on just that idea.
The classic one was over the clock drift in computers being measurable across the net by use of TCP time stamps. I mentioned that it could be used to track a PC etc. A little while later one of the bods over at Cambridge labs published a paper on his PhD work based exactly on this, and his work went on to win an award.
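For readers unfamiliar with the clock-drift idea, a rough sketch: compare a remote machine's timestamps with your own clock over time and fit a line, and the slope is the skew. The code below assumes you already have (local time, remote tick) pairs and that the remote counter nominally ticks at `tick_hz`. The function name and the default tick rate are assumptions, and an ordinary least-squares fit stands in here for whatever estimator the published work used.

```python
def estimate_skew_ppm(samples, tick_hz=1000.0):
    """Estimate remote clock skew in parts per million.

    samples: sequence of (local_seconds, remote_ticks) pairs, where
    remote_ticks is the timestamp value seen from the remote host.
    tick_hz: assumed nominal tick rate of the remote timestamp counter.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    t0, r0 = samples[0]
    # x: elapsed local time; y: how far the remote clock has pulled ahead.
    xs = [t - t0 for t, _ in samples]
    ys = [(r - r0) / tick_hz - (t - t0) for t, r in samples]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("samples must span more than one instant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx * 1e6  # slope (seconds per second) expressed in ppm


# Synthetic data: a remote clock running 50 ppm fast, sampled every 10 minutes.
demo = [(0, 0), (600, 600030), (1200, 1200060), (1800, 1800090)]
print(f"{estimate_skew_ppm(demo):.1f} ppm")
```

Two machines with different skews give different slopes, which is what would make the measurement usable as a tracking signal. Real measurements are noisy, so a practical version would need many more samples than this toy data.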
Another, more recent and again at Camb Labs, was the use of RF to cause fault injection attacks on security hardware. I had done some work on it back in the 80's and 90's but nobody was interested at the time so I went on to do other things.
Fair dues to all of them: they did the work independently towards their papers and went through the slog of getting them published.
As the Internet gets broader in reach I expect to see a lot more of this in the future, which is going to make the job of finding original research harder and harder for people, and more importantly leave them open to claims they did not do their "current knowledge" search properly.
What then will become of the idea of plagiarism? Will "accidental" and "deliberate" actually have any meaning, and how will you judge it?
Coming up with an idea independently isn't plagiarism. Plagiarism is only when you copy without attribution (whether inadvertently or not). You can't copy something you haven't seen.
The vote counting machines I'm familiar with dump the ballots into a bin. The dumping process, and the extraction of ballots for long-term storage (in special sealed boxes) by the Clerk produces enough scrambling to usually keep that from being a problem.
Further thought: it's not too hard to get a one-day job helping the local City Clerk or Township Clerk run an election. You'll learn about the election process; you'll learn about how they count/sort/handle ballots; you'll learn about election law. You'll probably get paid a nominal amount for a day's worth of work.
It's a rewarding experience.
Caveat: if you are from outside the U.S., I don't know how easy this is.
@voter: I don't know about voting practices everywhere, but when I show up I sign my name in the book under my printed name, so voter signatures are in alphabetical, not chronological, order. It would be possible for an observer to note in what order people voted, although I'm not sure an observer would be allowed to check the book to verify identities, but any scrambling of the ballot order would probably be sufficient to foil that plot.
Schneier.com is a personal website. Opinions expressed are not necessarily those of Co3 Systems, Inc.