On the Anonymity of Home/Work Location Pairs
Philippe Golle and Kurt Partridge of PARC have a cute paper on the anonymity of geo-location data. They analyze data from the U.S. Census and show that for the average person, knowing their approximate home and work locations—to a block level—identifies them uniquely.
Even if we look at the much coarser granularity of a census tract—tracts correspond roughly to ZIP codes; there are on average 1,500 people per census tract—for the average person, there are only around 20 other people who share the same home and work location. There’s more: 5% of people are uniquely identified by their home and work locations even if it is known only at the census tract level. One reason for this is that people who live and work in very different areas (say, different counties) are much more easily identifiable, as one might expect.
“On the Anonymity of Home/Work Location Pairs,” by Philippe Golle and Kurt Partridge:
Abstract:
Many applications benefit from user location data, but location data raises privacy concerns. Anonymization can protect privacy, but identities can sometimes be inferred from supposedly anonymous data. This paper studies a new attack on the anonymity of location data. We show that if the approximate locations of an individual’s home and workplace can both be deduced from a location trace, then the median size of the individual’s anonymity set in the U.S. working population is 1, 21 and 34,980, for locations known at the granularity of a census block, census track and county respectively. The location data of people who live and work in different regions can be re-identified even more easily. Our results show that the threat of re-identification for location data is much greater when the individual’s home and work locations can both be deduced from the data. To preserve anonymity, we offer guidance for obfuscating location traces before they are disclosed.
This is all very troubling, given the number of location-based services springing up and the number of databases that are collecting location data.
Go2Null • May 21, 2009 7:50 AM
Wolfram Alpha will most likely get the census data. Extrapolate …