Identifying People from their Driving Patterns

People can be identified from their “driver fingerprint”:

…a group of researchers from the University of Washington and the University of California at San Diego found that they could “fingerprint” drivers based only on data they collected from internal computer network of the vehicle their test subjects were driving, what’s known as a car’s CAN bus. In fact, they found that the data collected from a car’s brake pedal alone could let them correctly distinguish the correct driver out of 15 individuals about nine times out of ten, after just 15 minutes of driving. With 90 minutes driving data or monitoring more car components, they could pick out the correct driver fully 100 percent of the time.

The paper: “Automobile Driver Fingerprinting,” by Miro Enev, Alex Takakuwa, Karl Koscher, and Tadayoshi Kohno.

Abstract: Today’s automobiles leverage powerful sensors and embedded computers to optimize efficiency, safety, and driver engagement. However the complexity of possible inferences using in-car sensor data is not well understood. While we do not know of attempts by automotive manufacturers or makers of after-market components (like insurance dongles) to violate privacy, a key question we ask is: could they (or their collection and later accidental leaks of data) violate a driver’s privacy? In the present study, we experimentally investigate the potential to identify individuals using sensor data snippets of their natural driving behavior. More specifically we record the in-vehicle sensor data on the controller area-network (CAN) of a typical modern vehicle (popular 2009 sedan) as each of 15 participants (a) performed a series of maneuvers in an isolated parking lot, and (b) drove the vehicle in traffic along a defined ~50 mile loop through the Seattle metropolitan area. We then split the data into training and testing sets, train an ensemble of classifiers, and evaluate identification accuracy of test data queries by looking at the highest voted candidate when considering all possible one-vs-one comparisons. Our results indicate that, at least among small sets, drivers are indeed distinguishable using only in car sensors. In particular, we find that it is possible to differentiate our 15 drivers with 100% accuracy when training with all of the available sensors using 90% of driving data from each person. Furthermore, it is possible to reach high identification rates using less than 8 minutes of training data. When more training data is available it is possible to reach very high identification using only a single sensor (e.g., the brake pedal). As an extension, we also demonstrate the feasibility of performing driver identification across multiple days of data collection.
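To make the “one-vs-one comparisons” and “highest voted candidate” language concrete, here is a minimal sketch of that voting scheme. It is not the authors' code: the pairwise classifier below is a simple nearest-centroid rule swapped in for illustration, the two-number feature vectors are synthetic stand-ins for real CAN sensor features, and every name (`train_one_vs_one`, `identify`, `style`) is hypothetical.

```python
from collections import Counter
from itertools import combinations
import random


def centroid(rows):
    """Mean feature vector of a list of equal-length feature rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_one_vs_one(train):
    """train maps driver name -> list of feature rows.

    Returns one pairwise 'classifier' per driver pair. Here each classifier
    is just the two drivers' centroids (an assumption made for brevity).
    """
    cents = {d: centroid(rows) for d, rows in train.items()}
    return [(a, b, cents[a], cents[b]) for a, b in combinations(sorted(cents), 2)]


def identify(pairwise, query):
    """Each pairwise classifier votes for the closer centroid; top vote wins."""
    votes = Counter()
    for a, b, ca, cb in pairwise:
        votes[a if sq_dist(query, ca) <= sq_dist(query, cb) else b] += 1
    return votes.most_common(1)[0][0]


# Synthetic data: 15 hypothetical drivers, each with a fixed "style" plus noise.
rng = random.Random(0)
style = {f"driver{i:02d}": [float(i % 5), float(i // 5)] for i in range(15)}
data = {d: [[v + rng.gauss(0, 0.05) for v in s] for _ in range(20)]
        for d, s in style.items()}

train = {d: rows[:18] for d, rows in data.items()}  # 90% of each driver's data
test = {d: rows[18:] for d, rows in data.items()}   # remaining 10% held out

model = train_one_vs_one(train)
correct = sum(identify(model, q) == d for d, qs in test.items() for q in qs)
accuracy = correct / sum(len(qs) for qs in test.values())
print(f"{len(model)} pairwise classifiers, accuracy {accuracy:.2f}")
```

With 15 drivers there are 105 pairwise comparisons per query. The synthetic drivers here are deliberately well separated, so the score says nothing about how hard the real problem is.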

Tags: academic papers, biometrics, cars, identification

Posted on May 30, 2016 at 10:10 AM • 32 Comments

Comments

r • May 30, 2016 10:19 AM

“WARNING! WARNING! Known hostile driver now merging on your right.”

THANK GLOD
Early warning system for road rage and drunk drivers/drug addicts.

An • May 30, 2016 11:11 AM

Not to belabor the bleeding obvious, but my driving is radically different when I have the kids in the car and we’re heading home to do homework and cook dinner vs when it’s just me in the car and I’m late to pick them up.

Drunk me drives really differently too. Sober me doesn’t expect trees to get out of the way when I honk.

John D. Muccigrosso • May 30, 2016 11:38 AM

You can tell who’s driving our car just from the position of the rear view mirror alone.

MikeA • May 30, 2016 11:42 AM

One more reason to use a trusted (but expendable) chauffeur.

if (driver() == "Vito Corleone") trigger_bomb();

would not know who the passenger is. Unless, of course, the driver drives differently (as An notes) when the Godfather is in the car.

More research needed.

paul • May 30, 2016 12:01 PM

Having access to the CAN bus makes it easy. I wonder how accurately you could identify drivers with just a cheap accelerometer.

We already have cars that can adjust mirrors, seat and so forth to customized positions based on key fobs. Maybe the next step will be using the CAN data to sense a driver’s mood and cognitive state, and to adjust acceleration, braking and handling profiles in response…

Clive Robinson • May 30, 2016 12:15 PM

This is just another version of “handwriting analysis” and probably about as reliable.

It’s possible because of our “monkey brain” response: once we have sufficiently mastered a physical task, we no longer have to use the higher-order reasoning and logic functions of our brain.

Some people can hear a recording of music and identify the person playing a particular instrument (like James Galway and his flute, or various classical trumpet players).

However, will it suffer from the “stone in the shoe” issue that affects the likes of gait analysis? That is, if the person adds a physically painful constraint, will it sufficiently affect the results in an unknown group size (that is, will the system think it’s a driver who is new to the vehicle rather than one who is a regular user in a small group)?

Roger Wolff • May 30, 2016 1:43 PM

The problem with this sort of research is always: how sure can you be that they didn’t have 100 knobs on the algorithm to adjust?
If that happens, they train their algorithm on the training data, verify the results on the test-data, and… only 2/10 correct. Back to the drawing board. Adjust algorithm (turn a few of those knobs) and try again. With 100 possible adjustment points in the algorithm, it becomes likely that you can adjust the algorithm to give any desired result on the test-data (with 10 subjects).

A long time ago… There was a smart guy here in Holland who analyzed the stock market and developed an algorithm that would tell him when to buy and sell. At his presentation (“come invest in my fund”) I asked if he was sure he didn’t train his algorithm on “his test-data”. He was sure he didn’t…. A year later it was evident that he did. Even though he was sure to train say on 1970 through 1985, and then test on 1986, he had enough “stuff to tune” to make it work exceptionally well on the 1986 data. But when it finally ran “in real life” in 1987… it did way worse than “average”. The fund quickly went broke (without any of /my/ money).

So if you, as a researcher in THIS research end up with lousy recognition on the first test-run, you say the algorithm is not perfect yet, and tune a few parameters. Chances are you end up working towards the test-set. Through adjustment of say the weights of all the parameters, you can without knowing it force say: “the one who made the emergency stop is Tom”.
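The effect described above can be reproduced with a toy simulation. This is only a sketch under loud assumptions: the features and labels are pure random noise, the “algorithm” is a random linear rule with 100 weights standing in for the 100 knobs, and all names are made up for illustration. It says nothing about what the paper's authors actually did.

```python
import random


def accuracy(weights, rows, labels):
    """Fraction of rows where 'weighted sum > 0' matches the 0/1 label."""
    hits = 0
    for x, y in zip(rows, labels):
        pred = 1 if sum(w * v for w, v in zip(weights, x)) > 0 else 0
        hits += pred == y
    return hits / len(rows)


def make_noise(rng, n, dims):
    """Random features with random labels: there is nothing real to learn."""
    rows = [[rng.gauss(0, 1) for _ in range(dims)] for _ in range(n)]
    labels = [rng.randint(0, 1) for _ in range(n)]
    return rows, labels


rng = random.Random(1)
DIMS = 100  # the "100 knobs"
test_rows, test_labels = make_noise(rng, 10, DIMS)       # test set that gets peeked at
fresh_rows, fresh_labels = make_noise(rng, 1000, DIMS)   # never seen during tuning

best_w, best_test = None, -1.0
for _ in range(500):  # "turn a few of those knobs and try again"
    w = [rng.gauss(0, 1) for _ in range(DIMS)]
    score = accuracy(w, test_rows, test_labels)
    if score > best_test:
        best_w, best_test = w, score

fresh_score = accuracy(best_w, fresh_rows, fresh_labels)
print(f"tuned against the test set: {best_test:.2f}; on fresh data: {fresh_score:.2f}")
```

The setting that looks best on the ten peeked-at examples scores far above chance there, yet stays near coin-flip accuracy on data it was never tuned against.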

Slime Mold with Mustard • May 30, 2016 1:53 PM

This is not good news for @Wael, if he rents cars 😉

I have long suspected that car rental companies share a database of what they pull off of their spy dongle when you turn the car in. Every single time I rent a car, I get the extremely hard sell for insurance (my regular policy already covers me). I suspect that they know that there exists near zero chance they will need to pay out anything. The other customers at the counter are not badgered like this. My spouse often berates me for only going five MPH over the posted limit.

I only drive “adventurously” when tailing someone, and it is necessary to speed down a parallel street so as to allow the subject to have a rear view without us in it. If the subject regularly speeds 15 MPH over the limit, it is better to stick a GPS on their car, and remain unseen. Too slow is a separate issue. I have seen cars with diplomatic plates driving 25 MPH (half the limit) in clear traffic on I-495 (the D.C. Beltway). Even the FBI is going to have trouble with aircraft there, because of Reagan National Airport.

albert • May 30, 2016 2:33 PM

“Cradle to grave” monitoring (literally ‘to grave’ if CAN records a 100G shock).

Who cares if driving profiles aren’t reliable? Since when has LE cared about reliability, when they’re out searching for suspects?

How about reduced insurance rates for folks with ‘good’ profiles? Let’s throw in alcohol and marijuana sensors. Heart rate, breathing rate, and GSR sensors on the steering wheel. Brain wave sensors in the roof panels. Eye motion sensors in the rear view mirror.

Have I missed anything?

Orwell references are so…..eighties.

. .. . .. — ….

Wael • May 30, 2016 3:28 PM

@Slime Mold with Mustard,

This is not good news for @[…], if he rents cars 😉

Finally someone gave my the time of day. Was wondering if I fell out of grace somehow. I’ve fallen and can’t get up. Thanks for the bite 😉 “He” used to rent cars every week for a few years. Not so much these days.

People can be ‘identified by their xxx’ isn’t surprising anymore; it’s, more or less, expected because every researching _PeepingTom, Dick_Head, and _DirtyHarry^™ (emphasis is on Tom and Dick, may fire and brimstone consume them) is coming up with new xxxPrint these days to make money by selling their methods to Harry (may fire and brimstone consume him, too!)

What’s next? Eating habits, purchasing habits, writing style habits, Breathing patterns, perspiration prints, remote DNA sensing, reading habits, whom the person interacts with, bathroom habits (the question stupid interviewers often choose to ask a guest: Do you fold toilet paper or crumble it! Lady, I’m an Astrophysicist! Why do you think your audience cares about what I do with toilet paper?) The digital fingerprints we leave behind are limitless. And the “Toms” and “Dicks” of this world are popping up like wild mushrooms. And Mr. Harry is only happy to foot ze bill…

R. J. Brown • May 30, 2016 3:39 PM

The identification of the driver by his driving style is documented in the time of the Biblical King David:

2Ki 9:20 KJV
¶ And the watchman told, saying, He came even unto them, and cometh not again: and the driving is like the driving of Jehu the son of Nimshi; for he driveth furiously.

Of course, the chariots did not have CAN buses back then… 😉

Clive Robinson • May 30, 2016 3:39 PM

@ Albert,

Have I missed anything?

Well there are sensors that could go in the seat, for lax sphincters or clenched gluteus… Or even a CAM sensor to measure emissions from the lower GI tract.

Heck when scraping the bottom any barrel will do whether you are over it or not 😉

Paeniteo • May 30, 2016 3:42 PM

@Roger Wolff:
I believe that’s why you -roughly speaking- generally take random samples from your entire data base to train and predict the rest.
After each tuning, you take a different random sample to re-train.

Clive Robinson • May 30, 2016 5:55 PM

@ Wael,

Finally someone gave my the time of day. Was wondering if I fell out of grace somehow.

Hmm if I had known you needed a “booster” I’d have popped into MotherCar to get you one, rather than taken a back seat, you should have come forward sooner 😉

CallMeLateForSupper • May 30, 2016 6:07 PM

TL;DR
BUT,,, I am curious to know number of subjects and number of types of test vehicle. I suspect that results from testing with ten subjects and one vehicle would not map well to the results from testing with 1,000 subjects and 100 vehicles. Just a hunch.

I gotta go lest I be called late-for-late-supper.

Safe remainder of Memorial Day, all. And do remember them.

Dirk Praet • May 30, 2016 6:14 PM

@ Wael

What’s next? Eating habits, purchasing habits, writing style habits, Breathing patterns, perspiration prints, remote DNA sensing, reading habits, whom the person interacts with, bathroom habits …

And then, of course, there are still people who keep on whining that they are going completely dark.

ianf • May 30, 2016 7:05 PM

@ Darren Chaker asks a truly Smart-Alec™ question: if all of mass surveillance deals with protecting lives, why not start with something which kills 1,300 people a day? (by that Darren apparently means deploying surveillance instead to avert death by cigarette smoke).

Who said surveillance is for protecting lives? Last I heard it was to protect Our Ways of Life – there’s a difference, and if you don’t get it, then obviously you’re not Our Kind of People With The Protection-Worthy Ways of Life.

die not in vain

Darren goes on about having written an article about how to implement basic security to stay on top of security and privacy – encrypt phone, use strong PWs, full disc encryption, VPN, etc.

Basic security on top of security – got it. By the looks of it regurgitated pedestrian advice with 0 (“zero”) novel solutions for scaling that enlightened plateau of security on top of security – got it.

Darren then obsesses over the purpose to all that: “it is to save lives. He would hope there is a pecking order to do so.” [OBSERVE: quote taken out of context by devilish design!]

For a brief moment there I was trying to determine how e.g. by not following Darren’s instructions on VPNs and strongPWs I would be endangering my life (esp. as there already might exist some pecking order)… and then I elected to devote myself to matching socks in the dry clothes hamper instead.

But all is not lost. Darren ends up by reminding us of the “remaining fact” that you (“you” not specified, but could well be YOU who reads this) “cannot truly control those you govern unless you know all of its secrets and lies.”

NOTED.

@ Wael, temporarily feeling tutto abbandonato: “finally someone gave my the time of day.”

Better you heed this truly life-saving advice from an atheist: be careful what you wish for, ’cause you just might get it.

Dave • May 30, 2016 7:19 PM

Or the police could just get a picture of the car’s license plates from any red-light camera. That usually narrows down the search for the driver.

Honestly, once the cameras are there, and the technology to automatically read plates is already developed, does anybody seriously believe there is no branch of the government that is secretly generating a database of all the plates scanned at each camera?

And what about the database of which SIM cards are connected to which cell towers and when?

Mass face-recognition technology is the next one. Then you can’t even walk around in public and not have your whereabouts tracked by the government.

Lizard and NWO training in: Cold Slither / GI JOE • May 30, 2016 8:03 PM

Lizard and NWO training in: Cold Slither / GI JOE

“We’re Cold Slither; you’ll be joining us soon.
A band of vipers playin’ our tune.
With an iron fist/
And a reptile hiss/
We shall rule!

CHORUS:
We’re tired of words! We’ve heard it before.
We’re not gonna play the game no more!
Don’t tell us what’s right, don’t tell us what’s wrong!
Too late to resist, ’cause Cobra is strong.

We’re Cold Slither, heavy metal machine
Through the eyes of a lizard king you will dream.
When the venom stings/
A new order brings/
Our control!

(CHORUS)”

Wael • May 30, 2016 8:47 PM

@Clive Robinson,

you should have come forward sooner 😉

Yes. That hit a spot 🙂

@Dirk Praet,

there are still people who keep on whining that they are going completely dark.

These would be the “Harrys” of this world. Cybersecurity Theater Presents “Going dark:Encryption”.

@ianf,

temporarily feeling tutto abbandonato: “finally someone gave my the time of day.”

Nobody’s perfect. By the way, I meant “me”, not “my”.

be careful what you wish for, ’cause you just might get it.

Your caution will become true if I reply in detail to this link 😉 I noticed the .se in the link. Don’t get sly on me 😉 So you’re ianf from Sweden, today?

Clive Robinson • May 30, 2016 10:25 PM

@ ianf,

From the article you link to,

We would rather test than trust God.

Too damn true we do in the UK… After all we don’t even believe in the Loch Ness Monster, even though there are photos. Oh and don’t get me started on UFO’s over Glastonbury Tor…

It must irk those “In God we Trust” CIA types, after all if a few more people would just extend the “distrust” to include the politicos and the likes of the IC who feed the politicos what they want to hear… Then maybe we would not be in the mess we are.

Solar Dirty Harry • May 30, 2016 11:27 PM

All is not lost friends.

Just wait for the sun’s massive coronal mass ejection/solar super-storm – let’s party like it’s 1859 all over again! – to cause magnetic fluctuations in the Earth’s magnetosphere, inducing electricity in large, powerful conductors and (hopefully) overloading most electrical systems and causing massive damage.

Overnight our solar Dirty Harry will wipe out satellites, power grids, large numbers of computers/cellphones/all other devices, communication systems, data centers like Utah, cameras and …. drum roll …. shit-stain tracking devices in rental cars.

With society reduced to 19th century functioning in a heart-beat, the mindless zombies everywhere will shuffle aimlessly in a daze, bereft of their technological surveillance dignity. Spooks will weep tears over their fried motherboards and irretrievable data. Indeed, it could spark a world-wide revolution amid the panic and breakdown of ‘civil’ society and law and order.

‘Change we can believe in’ TM.

Donny • May 31, 2016 11:32 AM

Bitdefender Anti-Ransomware is looking good!!

Direct Download from official site:
http://download.bitdefender.com/am/cw/BDAntiRansomwareSetup.exe

It may not stop ALL ransomware, but it receives updates and protects against some of them. The link above will probably remain the same throughout new versions/updates. It will launch and appear in your tray once you install and reboot your computer. I like it, it’s simple and free(ware). I wish it was open source though.

ianf • June 1, 2016 3:09 PM

@ Dirk Praet: there are still people who keep on whining that they are going completely dark.

They are not mere people, they are people-who-need-people, complete with an anthem, which is a wholly different ^CLASS of people. On the other hand there are people who do go completely dark, which then leaves still other people perplexed beyond call of duty; although, in this particular case, had the police broken free of its genus centric bias, and conducted the initial search for the missing Canis Familiaris using own Canis Familiaris, rather than for the Homo Sapiens using other Homo Sapiens, they might’ve cracked it right away. Too late now, as Boris’ The Adorable Dachshund scent has gone stale, wafted away in the wind.

@ Wael: So you’re ianf from Sweden, today?

I must be, Gurgle never lies (also collect chainsaws acc. to me top result there). Remember what Mitch Henessey told Larry King in “The Long Kiss Goodnight:” “Are you frank and earnest with me?” – In Chicago I’m Frank, and in New York I’m Ernest—and he was speaking the scripted truth!

BTW. that your clip is some skewed AngloTeutonic image of a Swedish mädchen, quite untrue by my cunt. This one is closer to the core.

Wael • June 1, 2016 3:26 PM

@ianf,

BTW. that your clip is some skewed AngloTeutonic image of a Swedish mädchen, quite untrue by my [redacted]

Oh my! Can’t stop laughing 🙂 Such a degrading typo!

Jeremy • June 2, 2016 2:25 PM

Identifying one subject out of a pool of 15 (with up-to-date measurements for each) does not sound particularly impressive, even with 100% accuracy. You could often do that with something like shoeprint or hair length that we KNOW is common between many people AND changes for a given person over time. That doesn’t necessarily imply that it has any practical value as a general identification tool.

Jim Lux • June 4, 2016 9:22 AM

This is probably like many other biometric modalities such as keyboard rhythm, mouse usage, walking gait, cell phone grip, etc.

They’re all about 80-85% accurate when comparing against a set of known templates (e.g. 10% chance of false accept, 10% chance of false reject).

Fine for automatically setting the car seat position or customizing car radio presets, probably not so fine as an anti-theft mechanism for your car.

A Nonny Bunny • June 4, 2016 2:49 PM

@Roger Wolff

The problem with this sort of research is always: how sure can you be that they didn’t have 100 knobs on the algorithm to adjust?
If that happens, they train their algorithm on the training data, verify the results on the test-data, and… only 2/10 correct. Back to the drawing board. Adjust algorithm (turn a few of those knobs) and try again. With 100 possible adjustment points in the algorithm, it becomes likely that you can adjust the algorithm to give any desired result on the test-data (with 10 subjects).

That’s why we were taught to keep back a third data-set, a validation set, which is used only for measuring the performance you report in your paper.
The final test should always be on a subset of the data you haven’t used in any way during the development of your classifier.

I’m more worried that the research is based on only 15 subjects. It’s easy to find something that distinguishes 15 people (it’s just 3.9 bits). But it may very well not scale at all.
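The 3.9-bit figure is just log2(15). A quick sketch of how that number grows with the size of the pool, assuming every candidate is equally likely (the function name is made up for illustration):

```python
import math


def bits_to_distinguish(n_people):
    """Bits needed to single out one person among n equally likely candidates."""
    return math.log2(n_people)


for n in (15, 1_000, 1_000_000):
    print(f"{n:>9} people: {bits_to_distinguish(n):.1f} bits")
```

Picking one driver out of a million takes about 20 bits, roughly five times what the 15-person experiment had to supply.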

Clive • June 6, 2016 1:26 AM

Interesting that this story gathers comments which are mostly discussing the topic of privacy (which is entirely reasonable, by the way), yet none seem to be exploring this research as a mechanism of authentication.

Specifically, could the techniques applied in this research be adapted to work in other scenarios (say for example monitoring the use of a keyboard and mouse) so as to allow a program that monitored these peripherals to be able to recognize the user based on their typing style and mouse gestures?

We might not want to rely solely on something like this to protect valuable information resources, but combined with even a trivial password, this starts to look a bit like a viable 2-factor solution that would require nothing more than some software and ingenuity…

Just a thought…

Clive Robinson • June 6, 2016 3:46 AM

@ Clive,

…yet none seem to be exploring this research as a mechanism of authentication.

We know from other types of biometric analysis that such biometrics tend to be not particularly useful outside of very small communities due to quite high error rates. Which also means their effective entropy equivalent is very small (two to four bits at best).

Thus you need four or five separate independent biometrics of this form to get a reasonable level of assurance even in small groups.

Thus these types of biometrics tend not to scale very well, and quickly become more intrusive and annoying than frequent enforced changes of pass word/phrase.

P.S. As there are at least two “clives” on this list it helps others distinguish our comments apart if you also use another initial or your surname as I do.

ianf • June 6, 2016 7:17 AM

ADMINISTRIVIA :: Writes Clive Robinson: “As there are at least two “Clives” on this list, it helps others distinguish our comments apart if you also use another initial or your surname as I do.”

Since the other Clive, apparently believing itself the only Clive-That-Matters, hasn’t seen this coming, talking reason to it sort of misses the point. Maybe it’s a foundling that needs to reassert itself any which way it can? Simply address it as Clive Not-Robinson, or, to underline the hierarchy of posters, The Second-Banana Clive—that’ll teach it a lesson it’ll never forget.

Not the Stig • June 10, 2016 9:22 AM

Sure – that is easy.
Time between going off throttle and then braking.
How much and long braking to a stop.
Percentage of throttle when accelerating.
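As a rough sketch, two of the features listed above could be pulled out of a pedal log like this. The log format is an assumption: a time-ordered list of (seconds, throttle percent, brake percent) samples, which is not how raw CAN frames actually look, and both function names are made up for illustration.

```python
def throttle_to_brake_gaps(samples):
    """Seconds between each throttle release and the next brake press.

    samples: list of (time_s, throttle_pct, brake_pct), ordered by time.
    """
    gaps = []
    released_at = None
    prev_throttle, prev_brake = 0.0, 0.0
    for t, throttle, brake in samples:
        if prev_throttle > 0 and throttle == 0:
            released_at = t               # foot just came off the throttle
        if throttle > 0:
            released_at = None            # back on the throttle, no braking followed
        if prev_brake == 0 and brake > 0 and released_at is not None:
            gaps.append(t - released_at)  # brake pressed after a release
            released_at = None
        prev_throttle, prev_brake = throttle, brake
    return gaps


def mean_throttle_when_accelerating(samples):
    """Average throttle percentage over the samples where the throttle is pressed."""
    vals = [throttle for _, throttle, _ in samples if throttle > 0]
    return sum(vals) / len(vals) if vals else 0.0


# Hypothetical log: accelerate, lift off at t=1.0, brake from t=2.0, pull away again.
log = [(0.0, 30, 0), (0.5, 40, 0), (1.0, 0, 0), (1.5, 0, 0),
       (2.0, 0, 20), (2.5, 0, 50), (3.0, 0, 0), (3.5, 20, 0)]
print(throttle_to_brake_gaps(log), mean_throttle_when_accelerating(log))
```

A per-driver histogram of those gaps and throttle levels is the kind of feature vector a classifier could then be trained on.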

My braking is highly unique. I nearly always doublepump.

I can tell just sitting in the car everyone drives differently.

Oh and the height difference in my family is NIGHT and DAY mode in the mirrors. So so real adjustment there.

Schneier on Security
