Dumb Risk of the Day

Geotagged images of children:

Joanne Kuzma of the University of Worcester, England, has analyzed photos that clearly show children’s faces on the photo sharing site Flickr. She found that a significant proportion of those analyzed were geotagged and a large number of those were associated with 50 of the more expensive residential zip codes in the USA.

“The location information could possibly be used to locate a child’s home or other location based on information publicly available on Flickr,” explains Kuzma. “Publishing geolocation data raises concerns about privacy and security of children when such personalized information is available to internet users who may have dubious reasons for accessing this data.”

It’s children, though, so it’s going to be hard to have a rational risk discussion about this topic.

Posted on February 15, 2012 at 1:11 PM

Identifying People using Anonymous Social Networking Data


Computer scientists Arvind Narayanan and Dr Vitaly Shmatikov, from the University of Texas at Austin, developed the algorithm which turned the anonymous data back into names and addresses.

The data sets are usually stripped of personally identifiable information, such as names, before they are sold to marketing companies or researchers keen to plumb them for useful information.

Before now, it was thought sufficient to remove this data to make sure that the true identities of subjects could not be reconstructed.

The algorithm developed by the pair looks at relationships between all the members of a social network — not just the immediate friends that members of these sites connect to.

Social graphs from Twitter, Flickr and Live Journal were used in the research.

The pair found that one third of those who are on both Flickr and Twitter can be identified from the completely anonymous Twitter graph. This is despite the fact that the overlap of members between the two services is thought to be about 15%.

The researchers suggest that as social network sites become more heavily used, people will find it increasingly difficult to maintain a veil of anonymity.

More details:

In “De-anonymizing social networks,” Narayanan and Shmatikov take an anonymous graph of the social relationships established through Twitter and find that they can actually identify many Twitter accounts based on an entirely different data source—in this case, Flickr.

One-third of users with accounts on both services could be identified on Twitter based on their Flickr connections, even when the Twitter social graph being used was completely anonymous. The point, say the authors, is that “anonymity is not sufficient for privacy when dealing with social networks,” since their scheme relies only on a social network’s topology to make the identification.
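To make the "topology only" point concrete, here is a toy sketch of the general idea: start from a few known account pairs and extend the match using nothing but who is connected to whom. This is not the authors' algorithm, which is considerably more involved and has to cope with noisy, partially overlapping graphs. The function names (`build_graph`, `propagate_matches`), the tiny graphs, and the seed pairs are all invented for illustration.

```python
from collections import defaultdict


def build_graph(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def propagate_matches(aux, anon, seeds):
    """Extend a seed mapping (aux node -> anon node) using graph structure only.

    Hypothetical, simplified procedure: an unmapped aux node is matched to the
    unused anon node that is adjacent to the most already-matched neighbors,
    and only when that best candidate is strictly ahead of the runner-up.
    """
    mapping = dict(seeds)
    changed = True
    while changed:
        changed = False
        used = set(mapping.values())
        for u in sorted(aux):
            if u in mapping:
                continue
            scores = {}
            for v in anon:
                if v in used:
                    continue
                # Count u's mapped neighbors whose images are neighbors of v.
                score = sum(
                    1 for n in aux[u] if n in mapping and mapping[n] in anon[v]
                )
                if score:
                    scores[v] = score
            if not scores:
                continue
            ranked = sorted(scores.items(), key=lambda kv: -kv[1])
            best, best_score = ranked[0]
            # Skip ambiguous cases where two candidates tie for first place.
            if len(ranked) == 1 or ranked[1][1] < best_score:
                mapping[u] = best
                used.add(best)
                changed = True
    return mapping


# Auxiliary graph with known identities (made-up data).
aux_graph = build_graph([
    ("alice", "bob"), ("alice", "carol"), ("bob", "carol"),
    ("carol", "dave"), ("dave", "erin"),
])

# "Anonymous" graph: same relationships, names replaced by numbers.
anon_graph = build_graph([(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)])

# Assume two accounts have already been linked by some other means.
seeds = {"alice": 1, "bob": 2}

recovered = propagate_matches(aux_graph, anon_graph, seeds)
print(recovered)
```

In this contrived case the two seeds are enough to label every remaining node, even though the second graph contains no names at all. Real graphs are far larger and messier, but the sketch shows why stripping names does not remove the identifying information carried by the connections themselves.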

The issue is of more than academic interest, as social networks now routinely release such anonymous social graphs to advertisers and third-party apps, and government and academic researchers ask for such data to conduct research. But the data isn’t nearly as “anonymous” as those releasing it appear to think it is, and it can easily be cross-referenced to other data sets to expose user identities.

It’s not just about Twitter, either. Twitter was a proof of concept, but the idea extends to any sort of social network: phone call records, healthcare records, academic sociological datasets, etc.

Here’s the paper.

Posted on April 6, 2009 at 6:51 AM

