Data Privacy: The Facts of Life
By Bruce Schneier
As social networking sites become ubiquitous, it is long past time to look at the types of data we put on those sites. We're using social networking websites for more private and more intimate interactions, often without thinking through the privacy implications of what we're doing.
The issues are hard and the solutions to them harder still, but I'm seeing a lot of confusion in even forming the questions.
Social networking sites deal with several different types of user data, and it's essential to separate them.
To start that conversation, here is my taxonomy of social networking data.
1. Service data is the data you give to a social networking site in order to use it. Such data might include your legal name, your age and your credit card number.
2. Disclosed data is what you post on your own pages: blog entries, photographs, messages, comments and so on.
3. Entrusted data is what you post on other people's pages: comments on other people's entries, photographs, and so on. It's basically the same stuff as disclosed data, but the difference is that you don't have control over the data once you post it -- another user does.
4. Incidental data is what other people post about you: a paragraph about you that someone else writes, a picture of you that someone else takes. Again, it's basically the same stuff as disclosed data, but the difference is that not only do you not have control over it, you didn't create it in the first place. And you might not even know it exists.
5. Behavioural data is data the site collects about your habits by recording what you do and who you do it with. It might include games you play -- and how much time you are spending playing them -- topics you write about, news articles you access, links you click on, ads you respond to, and so on.
6. Derived data is data about you that is derived from all the other data. This is what social networking sites use to predict who else on the site you know. It can also be used to predict things about you that you don't publicise.
For example, if 80 per cent of your friends live in Cork, you probably live there too. Or, more intimately, if 80 per cent of your friends self-identify as gay, you're probably gay yourself.
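As a rough illustration of how derived data of this kind could be produced, here is a minimal Python sketch. The function name `infer_attribute`, the 0.8 threshold and the sample data are all assumptions made for this example; it is not a description of how any real site works.

```python
from collections import Counter
from typing import Optional, Sequence


def infer_attribute(friend_values: Sequence[Optional[str]],
                    threshold: float = 0.8) -> Optional[str]:
    """Guess a user's undisclosed attribute from what their friends disclose.

    Returns the most common value if at least `threshold` of the friends
    who disclosed a value share it, otherwise None.
    """
    # Ignore friends who have not disclosed the attribute at all.
    known = [v for v in friend_values if v is not None]
    if not known:
        return None
    value, count = Counter(known).most_common(1)[0]
    return value if count / len(known) >= threshold else None


# Hypothetical example: eight of ten friends list Cork as their home town.
friends_towns = ["Cork"] * 8 + ["Dublin", "Galway"]
print(infer_attribute(friends_towns))  # Cork
```

Note that the user never supplies the inferred value; it is computed entirely from what other people have disclosed.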
There are other ways to look at user data.
Some of it you give to the social networking site in confidence, expecting the site to safeguard the data.
Some of it you publish openly and others use it to find you.
And some of it you share only within an enumerated circle of other users.
At the receiving end, social networking sites can make money from all of it, generally by selling targeted advertising or by reselling the data to information brokers.
You can also look at the data in terms of control. Some of this data you expect to have complete control over: who sees it, when it gets deleted, etc. Some of it you don't expect to have control over.
Of course, you don't actually control a lot of the data you expect to control. Your Facebook data probably exists on servers in the US; did you ever stop to think about what US laws, or lack thereof, regulate the use and reuse of your data?
Different social networking sites give users different rights for each data type. Some are always private, some can be made private, and some are always public.
Some can be edited or deleted -- I know one site that allows entrusted data to be edited or deleted within a 24-hour period -- and some cannot. Some can be viewed and some cannot.
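One way to picture such a per-type policy is as a small lookup table keyed by the six data types. The Python sketch below is purely illustrative: the `Rights` fields, the policy values and the `may_edit` helper are assumptions invented for this example, apart from the 24-hour window for entrusted data, which mirrors the one site mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class DataType(Enum):
    SERVICE = "service"
    DISCLOSED = "disclosed"
    ENTRUSTED = "entrusted"
    INCIDENTAL = "incidental"
    BEHAVIOURAL = "behavioural"
    DERIVED = "derived"


@dataclass(frozen=True)
class Rights:
    can_view: bool
    can_make_private: bool
    can_edit: bool
    # Only meaningful when can_edit is True; None means no time limit.
    edit_window: Optional[timedelta] = None


# Hypothetical policy for one imaginary site; real sites differ.
POLICY = {
    DataType.SERVICE: Rights(True, True, True),
    DataType.DISCLOSED: Rights(True, True, True),
    DataType.ENTRUSTED: Rights(True, False, True, timedelta(hours=24)),
    DataType.INCIDENTAL: Rights(True, False, False),
    DataType.BEHAVIOURAL: Rights(False, False, False),
    DataType.DERIVED: Rights(False, False, False),
}


def may_edit(data_type: DataType, posted_at: datetime, now: datetime) -> bool:
    """Return True if this imaginary site lets the user edit or delete the item."""
    rights = POLICY[data_type]
    if not rights.can_edit:
        return False
    if rights.edit_window is None:
        return True
    return now - posted_at <= rights.edit_window


posted = datetime(2010, 1, 1, 12, 0)
print(may_edit(DataType.ENTRUSTED, posted, posted + timedelta(hours=2)))   # True
print(may_edit(DataType.ENTRUSTED, posted, posted + timedelta(hours=30)))  # False
```

Writing the policy out this way makes the gaps visible: in this made-up table the user has no rights at all over behavioural or derived data.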
You probably agreed to all of this when you agreed to the terms of service for your social networking site. Not that you actually read the terms; no one does.
It's also clear that users should have different rights with respect to each data type. We should be allowed to export, change, and delete disclosed data even if the social networking sites don't want us to.
It's less clear what rights we have for entrusted data -- and far less clear for incidental data. If you post pictures from a party with me in them, can I demand you remove those pictures -- or at least blur out my face? (Go look up the conviction of three Google executives in an Italian court over a YouTube video.)
And what about behavioural data? It's frequently a critical part of a social networking site's business model.
We often don't mind if a site uses it to target advertisements, but are less sanguine when it sells data to third parties.
As we continue our conversations about what sorts of fundamental rights people have with respect to their data, and more countries contemplate regulation on social networking sites and user data, it will be important to keep this taxonomy in mind. The sorts of things that would be suitable for one type of data might be completely unworkable and inappropriate for another.