Anonymization and the Law

Interesting paper: “Anonymization and Risk,” by Ira S. Rubinstein and Woodrow Hartzog:

Abstract: Perfect anonymization of data sets has failed. But the process of protecting data subjects in shared information remains integral to privacy practice and policy. While the deidentification debate has been vigorous and productive, there is no clear direction for policy. As a result, the law has been slow to adapt a holistic approach to protecting data subjects when data sets are released to others. Currently, the law is focused on whether an individual can be identified within a given set. We argue that the better locus of data release policy is on the process of minimizing the risk of reidentification and sensitive attribute disclosure. Process-based data release policy, which resembles the law of data security, will help us move past the limitations of focusing on whether data sets have been “anonymized.” It draws upon different tactics to protect the privacy of data subjects, including accurate deidentification rhetoric, contracts prohibiting reidentification and sensitive attribute disclosure, data enclaves, and query-based strategies to match required protections with the level of risk. By focusing on process, data release policy can better balance privacy and utility where nearly all data exchanges carry some risk.
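The abstract's point that identifiability is a property of the whole released data set, not of any single field, is often made concrete with the standard k-anonymity metric (a common deidentification measure, not something the paper itself prescribes). The sketch below, using hypothetical records, computes the size of the smallest group of rows sharing the same quasi-identifier values; a record that is unique on its quasi-identifiers (k = 1) is trivially reidentifiable:

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return the k-anonymity level of a data set: the size of the
    smallest group of records sharing identical quasi-identifier values."""
    groups = Counter(
        tuple(r[q] for q in quasi_identifiers) for r in records
    )
    return min(groups.values())

# Hypothetical released records; ZIP code and birth year are the
# quasi-identifiers an attacker might link to outside data.
released = [
    {"zip": "02139", "birth_year": 1980, "diagnosis": "flu"},
    {"zip": "02139", "birth_year": 1980, "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1975, "diagnosis": "flu"},
]

print(k_anonymity(released, ["zip", "birth_year"]))  # the 1975 record is unique: k = 1
```

Even with names removed, the third record above is the only one matching (02139, 1975), so anyone who knows those two facts about a person learns their diagnosis, which is exactly the sensitive-attribute disclosure risk the authors argue process-based policy should manage.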

Posted on July 11, 2016 at 6:31 AM • 2 Comments


Arclight July 11, 2016 10:30 AM

One big problem with accumulating, storing, and trading large data sets containing sensitive personal information is that they are likely to experience “scope creep.”

The most obvious threat is legal attack. If someone can tie the data to an identifiable person, it is subject to discovery and subpoena by someone, somewhere. Look at the recent “23andMe” DNA match requests by law enforcement for a good example. And even if the data is stored in some difficult jurisdiction, it is still stored in a jurisdiction where criminal and civil process of some sort exists.

The second has to do with the fact that successor parties will likely do whatever they want with your data if the company that collected it goes out of business, is acquired, or resells it. There is no meaningful concept of “informed consent” when the user agreement contains third-party disclosure rights and “subject to change at any time” clauses. And all of these contracts are inherently coercive: consider the contracts healthcare providers require us to sign in order to get needed care, or the collection and encumbrance-free distribution of the tattoo database from prisons.

With these failures in mind, the only real solution is not to have the data on hand in the first place. Don't keep subpoenable information around in a form your company can access, and standardize consent forms to include only narrow authorizations. This is how real estate contracts work in the U.S.

JG4 July 13, 2016 5:01 AM

This seems to be on point:

Anonymity networks protect people living under repressive regimes from surveillance of their Internet use. But the recent discovery of vulnerabilities in the most popular of these networks — Tor — has prompted computer scientists to try to come up with more secure anonymity schemes.

At the Privacy Enhancing Technologies Symposium in July, researchers at MIT’s Computer Science and Artificial Intelligence Laboratory and the École Polytechnique Fédérale de Lausanne will present a new anonymity scheme that provides strong security guarantees but uses bandwidth much more efficiently than its predecessors. In experiments, the researchers’ system required only one-tenth as much time as similarly secure experimental systems to transfer a large file between anonymous users.



