Schneier on Security
A blog covering security and security technology.
April 22, 2010
NIST on Protecting Personally Identifiable Information
Just published: Special Publication (SP) 800-122, "Guide to Protecting the Confidentiality of Personally Identifiable Information (PII)."
It's 60 pages long; I haven't read it.
Posted on April 22, 2010 at 6:19 AM
• 28 Comments
It's not bad. My problem has always been with the definition of PII.
I can develop controls on your system based on the information in it but this is not that helpful...
"any information about an individual maintained by an agency, including (1) any information that can be used to distinguish or trace an individual's identity, such as name, social security number, date and place of birth, mother's maiden name, or biometric records; and (2) any other information that is linked or linkable to an individual, such as medical, educational, financial, and employment information."
Is this a discrete list of elements or an aggregate? How discrete are the data elements to be protected?
This matters because systems today share terabytes of data, and OMB policy is to classify any system with PII in it as a minimum of Moderate Impact.
Okay. My name. My name is a matter of public record (mis-spent youth, mistakes were made, let's move on)
It's in the phone book, it's on my birth certificate, my tax delinquencies, my deeds, all available at the courthouse (now online). How do you reconcile public records with a moderate level of confidentiality?
So say I have a public webserver to present agency information to the citizenry. NIST FIPS says you can class that as Confidentiality - N/A. (the only case where there is an N/A). But say I put a comment field in with the ability for the citizen to request a follow up. Instantly my impact level is in doubt because many people will put their name in the system and their contact information. Say they only sign their name to their comment. Same thing. I now have to protect the PII. The cost between a Low and a Moderate system in terms of number of controls is significant.
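A minimal sketch of the arithmetic behind this complaint, assuming the FIPS 199 convention that a system's overall level is the highest impact found among its information types and security objectives (the "high-water mark"). The information-type names are made up, and the Moderate floor for PII is taken from the scenario described in this comment, not quoted from any NIST or OMB text.

```python
from enum import IntEnum


class Impact(IntEnum):
    NA = 0        # "not applicable", only used here for confidentiality
    LOW = 1
    MODERATE = 2
    HIGH = 3


def system_impact(info_types):
    """Return the highest impact level across all information types and objectives."""
    return max(
        level
        for objectives in info_types.values()
        for level in objectives.values()
    )


# Hypothetical public web server holding only public agency information.
public_site = {
    "public_agency_info": {
        "confidentiality": Impact.NA,
        "integrity": Impact.LOW,
        "availability": Impact.LOW,
    },
}

# Same server after adding a comment form where citizens sign their names.
# Assumption: anything containing PII is rated at least MODERATE.
with_comments = dict(public_site)
with_comments["citizen_comments"] = {
    "confidentiality": Impact.MODERATE,
    "integrity": Impact.LOW,
    "availability": Impact.LOW,
}

print(system_impact(public_site).name)    # LOW
print(system_impact(with_comments).name)  # MODERATE
```

One added information type is enough to move the whole system up a level, which is where the extra controls come from.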
Moderate, among other things, requires encryption, unique user IDs, and auditing (a privacy conflict on webservers, some argue). And while we're on the subject of webservers, what about those W3SVC weblogs and audit records in general? If you've ever done forensics, it's pretty easy for that information to be "linked or linkable to an individual". Some laws require those records be maintained anywhere from a year to forever.
The definition includes specific types of data, and many agencies will limit themselves to those specific words. We had that with the Privacy Act. My first service phone book included the types of information (discrete and aggregate) that could be disclosed (like the service member's name) and the type that could not be (home phone and/or address). Note that under the Privacy Act names were not protected information. But Privacy Offices are being much more discriminating in their PIAs--except for EINSTEIN. But that's DHS's Privacy Office, which is more like the DoJ OLC under the Bush II administration.
"It's 60 pages long; I haven't read it."
Shouldn't that have a "yet" on the end ;)
"Shouldn't that have a 'yet' on the end ;)"
It kind of depends on what you all say.
I guess it would take you less time to read the 60 pages than to read all the comments you'll eventually get here ;-)
How do you reconcile public records with a moderate level of confidentiality?
We've been worrying that question for years on this blog.
Well I plan to print it off and read it over the next few days. (I might modify the plan based on responses here!)
@ BF Skinner at April 22, 2010 6:55 AM
Interesting question about what is PII and how it needs to be protected.
My take on it is that there is a lot of PII about me in the wild, ranging from entries in Regimental Journals to my in-laws putting my wedding pictures in the paper. It would be possible for a dedicated investigator to build a very detailed picture of my life.
However, this does not mean it should be made any easier.
When I give some form of personally identifiable information to the Government, I would expect them to protect it. There are numerous situations where you won't want your interaction with a Government department to be made public knowledge, so they *should* have measures in place to protect this data.
Without having read the SP, I feel that your example of a comment on a blog is different. Although it was ingenious of my parents to name me GreenSquirrel, I could just as easily be using a false name.
If I have misread and you are talking about user submitted requests which need a response (and therefore need to provide correct contact details) then these can either not be posted to the web or have the PII stripped off first?
Have I missed something?
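A rough sketch of the "strip the PII off first" idea, assuming the submission is free text and that only e-mail addresses and US-style phone numbers are targeted. The two patterns and the function name are illustrative, not taken from SP 800-122, and they will miss many real-world formats.

```python
import re

# Illustrative patterns only; real contact details come in many more shapes.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\b\d{3}[\s.-]?\d{3}[\s.-]?\d{4}\b")


def strip_contact_details(text):
    """Return (public_text, removed): the text safe to post and the details taken out."""
    removed = EMAIL.findall(text) + PHONE.findall(text)
    public_text = EMAIL.sub("[email removed]", text)
    public_text = PHONE.sub("[phone removed]", public_text)
    return public_text, removed


comment = "Please follow up. Call me at 555-867-5309 or jane@example.org."
public_text, removed = strip_contact_details(comment)
print(public_text)
print(removed)
```

The removed details would go to whoever handles the reply, not to the public page. Note what this does not catch: a name signed at the bottom of the comment sails straight through, since no pattern can reliably tell a name from any other words.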
Skip the appendices and the introductory waffle and you're down to under 30 pages.
It sure would be nice if people would choose a term for this information that makes some kind of sense. What does "personally identifiable" mean? Do these people speak English?
A term that might actually work is "identifying personal information", or just "personal information" for short.
Look in the Privacy Act. You won't find "PII".
I went straight to section 4.2.1 which included the collection of PII. I was disappointed...it discussed the "minimum necessary" principle, in other words not collecting more than needed to limit the consequences of a data breach.
While I technically agree, an entity shouldn't have unnecessary private information on someone or incur the risk, I was hoping for some controls on making the information a bit harder to use. I didn't see it. Perhaps I will after I read the whole thing, but no other section seemed to indicate this.
It's basically working on strengthening the stronger link, as most do. They need to strengthen the weaker link-- PII really needs to be tougher to use or it will continue to be a problem.
I skimmed it. It's nothing new.
Most individual bits of personal data don't matter if they're separate. You can have my name, address, phone number, birthday, individual medical visit information, etc. but you don't really have anything useful until they're mashed together into one database, one record, or one sheet of paper that links them. The phone book, for example, works because it aggregates your name, address, and phone number.
There are certain bits of information that should always be kept private. Credit card numbers, bank account numbers, and social security numbers can all be abused without any other info.
@Derf: "Credit card numbers, bank account numbers, and social security numbers can all be abused without any other info."
Yes, and they all have two things in common:
1) There are always other people who have them and who they will be disclosed to, making them virtually impossible to protect.
2) They are far too easy to use.
When someone abused my wife's SSN to set up credit cards and services, it was obtained through the perpetrator's legitimate employment. NIST wouldn't have saved her.
On the contrary, IMHO Social Security numbers should all be made public. They're terrible authenticators, violating pretty much every principle governing a good key: they're practically impossible to change, you have to give them to people you don't trust (phone companies, employers), and they don't have much entropy in the first place.
The only way to stop the continued abuse of SSNs is to make them all public so no institution that is using them as authenticators can pretend that they are anything but complete idiots, which they are--and which we are for allowing this nonsense to continue for so long.
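To put a number on "not much entropy", here is a back-of-the-envelope sketch. It assumes, generously, that all nine digits of an SSN are uniformly random and independent, so the result is an upper bound; the "last four digits already known" case is a hypothetical illustration.

```python
import math


def max_entropy_bits(unknown_digits):
    """Upper bound on entropy, in bits, of a string of uniformly random decimal digits."""
    return unknown_digits * math.log2(10)


full_ssn = max_entropy_bits(9)         # attacker knows nothing
last_four_known = max_entropy_bits(5)  # attacker already has the last four digits

print(f"{full_ssn:.1f} bits")         # 29.9 bits
print(f"{last_four_known:.1f} bits")  # 16.6 bits
```

Under 30 bits at best, against the 128 bits typical of a modern cryptographic key, and unlike a key it can never be rotated.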
@antibozo at April 22, 2010 9:03 AM
I think you make a point. Using an unchanging identifier as an authenticator is inherently problematic.
It would be like a business using a never changing user ID alone to authenticate people, and passing policies to protect the ID from disclosure. That would be doomed to fail.
@Hjohn "hoping for some controls on making the information a bit harder to use. I didn't see it"
When dealing with NIST we're dealing with a big wibbly wobbly conglomeration of laws, mandates, policy, procedures and protocols. If you ask NIST about controls they'll point you to OMB requirements, your own organization's internal policy structure and 800-53. We have some law, but for privacy it's mostly controlling what the Government can do vis-à-vis its citizens. There is nothing to prevent data aggregators like Experian from collecting and maintaining records. The laws that do control them are consumer protections for accuracy. Their right to collect and sell is not questioned by most.
@wiredog "worrying that question for years "
Yes, I know. I have been in the room.
But I see two issues. First, having control over information about me; then, insisting on protection of my private information in the custody of the government. The first issue assumes I own the information just because it's about me. That's a point yet to be settled. It may not be true.
The second issue is what PII is about, and when we develop a system it instantly becomes 'What Information' 'What Controls', and applications developers are squirrelly. They want to know in excruciating detail what is and is not a PII-typified piece of information, alone or in aggregate. After having dealt with a couple of breaches involving aggregate classified material, shared file servers and enterprise email systems, I have a lot of sympathy with their view. Without defining, without typifying, information we can't mark it; without marking we can't prevent it from sailing past any sort of ACL or filter.
@GreenSquirrel "comment on a blog is different...using a false name...user submitted requests which need a response (and therefore need to provide correct contact details) then these can either not be posted to the web or have the PII stripped off first?"
Actually, to your latter point: there are two cases. One is a straight-up blog like the one Bruce hosts here. The second is queries requiring a reply. But yes, that does describe the double bind.
There are all kinds of weird specific rules. I think it may only be resolvable by use cases for each type of information (ugh). The President's webpage (with comments), as an example, is actually subject to the Presidential Records Act and must be preserved (including the spam) in the National Archive (forever). Although prudent people can use noms de plume on a blog comment, there will be people who don't, and it's their privacy that suddenly must be addressed as soon as they give up their name. Informed consent is usually enough. Reasonable people understand they are volunteering their names as part of a permanent record. But this is the US and we have many unreasonable people too.
urrrrg. This is why I don't like the blog comment format. There are too many issues that intertwine to resolve in a paragraph or two. Privacy compounds the problem because it's not, I think, a general principle but very specific and unique to each individual. Thousands of ink trees must die before we can even sketch the outlines. To my mind the only one here who writes at length and stays interesting is Clive, and that's mostly because I'm sure he's gonna blow something up or tell us how to.
@Andre LePlume: For anyone familiar with NIST 800 series there are no new details.
@All: As someone who has the pleasure of being a service provider to the US Federal Government and submits to annual NIST-800-based security reviews from the agency we service, this document is helpful in understanding how NIST defines and controls for PII security. To me, 800-122 is an overview of/intro to 800-53 and puts much of the FIPS and NIST into context. It's a decent intro (with examples!) of how the 800 series works.
For anyone who is new to applying NIST standards to their organization and stores PII, this doc is "highly recommended."
This document focuses on the privacy perspective for data protection: conducting PIAs, de-identification, anonymizing (if that's truly possible), minimizing collection, etc.
Since there are many privacy laws in the US and elsewhere, I think they did a decent job of writing a document that has broad applicability.
This document is a nice change for NIST. I think many of their other SPs get so caught up in the details of the controls they lose sight of the big picture.
The comments have some interesting points and look at the broader picture of PII, but as mentioned I think this is just an overview, and should be treated as such.
PII was defined as a term by OMB. This makes it an administrative rule binding on Federal agencies.
@Craig "just an overview, and should be treated as such"
If by overview you mean a guideline that all Federal Agencies are expected to adhere to in a normalized fashion that can be consistently audited by OMB and agency Inspector General offices. Okay.
But that's not how I define overview. Especially in an environment where they tend to be faithful to SPs down to the comma.
A closer reading has me now thinking "great, a separate information type" whose impact level must be assessed differently from all other information in the system.
The controls are Information Security program drawn small FOR PII.
Identify your asset
Identify its impact at breach
Reduce your holdings so breaches won't hurt too much.
De-PII by removing data elements and dropping the PII radioactivity to an acceptable level.
Implement Security controls AC-3, 5, 6, 17, 19, 21; AU-2, 6; IA-2; MP-2, 3, 4, 5, 6; SC-9, 28; SI-4
Prepare for PII disclosure incidents with response and training plans
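The "reduce your holdings" and "De-PII" steps above can be sketched roughly as follows. The field names and the set of direct identifiers are hypothetical; deciding which fields actually count as PII is exactly the judgment call the document leaves to the organization.

```python
# Hypothetical field names; which fields count as PII is the organization's call.
DIRECT_IDENTIFIERS = {"name", "ssn", "email", "phone"}


def minimize(record, needed_fields):
    """Keep only the fields the stated purpose needs (the 'reduce your holdings' step)."""
    return {k: v for k, v in record.items() if k in needed_fields}


def de_identify(record):
    """Drop direct identifiers (the 'De-PII' step); what remains may still be linkable."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


record = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "00000",
    "visit_reason": "benefits question",
}

stored = minimize(record, needed_fields={"name", "zip", "visit_reason"})
shared = de_identify(stored)
print(stored)
print(shared)
```

Dropping the name and SSN lowers the "radioactivity" but does not zero it: a zip code plus a visit reason can still be linked back to a person if enough other data is available.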
It does acknowledge that not all PII is the same. An SSN doesn't carry the same weight as a Zip Code. It doesn't go much farther in helping to determine the degree of difference.
It does give some criteria to apply to making PII impact decisions.
It ALSO puts the whole load of decision making on the organization using it, which must decide what is/is not PII using hotter/colder ways of "evaluating" impact.
This is a long argument with the IG second guessing that'll never end.
And I don't see anything about how it relates to the impact level of the system as a whole. I can foresee some agencies implementing a pair of impact levels per system.
Just because OMB had its head firmly wedged in the aftermath of the VA debacle doesn't mean that the rest of us have to follow suit, nor does it mean that the right people couldn't unwedge them, a key player being someone with the sense and visibility of Bruce.
I think part of the problem with protecting PII is the nature of politics too.
There are usually more votes to be gained by advocating even more measures against business, and holding business more accountable. Unfortunately, the effectiveness of this only goes so far.
On the flip side, as Schneier has stated and I agree with, the bigger problem now is that PII is too easy to use. Of course, there are fewer votes to be won by making PII tougher to use. Any difficulty people get with using their PII may be their best protection, but that may not be the reaction.
I find the entire PII definition quite vague: "...any information that can be used to distinguish or trace an individual's identity..."
With the information that I a) am an EE grad student b) frequent Voxx Coffee, and c) like the movie Metropolis, it is quite possible to uniquely distinguish my identity.
The reality is that due to the heavy-tailed distribution of personal information (ANY information about a person), the set of distinguishing elements is quite large, and hence, PII could refer to any such information (under their definition).
So where do we draw the line?
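A toy illustration of the point about combinations, assuming a tiny made-up population. It measures what fraction of records are the only one with their particular combination of attribute values; none of this data is real.

```python
from collections import Counter


def unique_fraction(records, attributes):
    """Fraction of records that are the only one with their combination of values."""
    def key(record):
        return tuple(record[a] for a in attributes)

    combos = Counter(key(r) for r in records)
    singles = sum(1 for r in records if combos[key(r)] == 1)
    return singles / len(records)


# Made-up population of four people.
people = [
    {"field": "EE", "cafe": "Voxx", "film": "Metropolis"},
    {"field": "EE", "cafe": "Voxx", "film": "Alien"},
    {"field": "EE", "cafe": "Other", "film": "Alien"},
    {"field": "CS", "cafe": "Voxx", "film": "Alien"},
]

print(unique_fraction(people, ["field"]))                  # 0.25
print(unique_fraction(people, ["field", "cafe"]))          # 0.5
print(unique_fraction(people, ["field", "cafe", "film"]))  # 1.0
```

No single attribute here identifies most people, yet three harmless-looking ones together single out everyone, which is why a definition based on "can be used to distinguish" has no obvious stopping point.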
The document seems to fail to mention in the definition that some of this information is published, readily available, or must be shared under other legal requirements--though this is addressed later in the document. Really, the document seems to want a formal classification for each information asset (before, rather than after, the fact). That's going to be a pain for some organizations, but not surprising. However, the definition of PII is likely to "leak" into other people's definitions, causing confusion over what protections really are necessary and what really needs to be protected.
Their list of PII is as follows:
-Name, such as full name, maiden name, mother's maiden name, or alias
-Personal identification number, such as social security number (SSN), passport number, driver's license number, taxpayer identification number, patient identification number, and financial account or credit card number (Partial identifiers, such as the first few digits or the last few digits of SSNs, are also often considered PII because they are still nearly unique identifiers and are linked or linkable to a specific individual.)
-Address information, such as street address or email address
-Asset information, such as Internet Protocol (IP) or Media Access Control (MAC) address or other host-specific persistent static identifier that consistently links to a particular person or small, well-defined group of people
-Telephone numbers, including mobile, business, and personal numbers
-Personal characteristics, including photographic image (especially of face or other distinguishing characteristic), x-rays, fingerprints, or other biometric image or template data (e.g., retina scan, voice signature, facial geometry)
-Information identifying personally owned property, such as vehicle registration number or title number and related information
-Information about an individual that is linked or linkable to one of the above (e.g., date of birth, place of birth, race, religion, weight, activities, geographical indicators, employment information, medical information, education information, financial information).
@antibozo "mean that the rest of us have to follow suit,"
If by "us" you mean the part of the world that is neither the Federal Government nor its contractors, it does not. Let's see...That's only the Departments of Justice, Commerce, Defense, Homeland Security, Interior, Treasury, Agriculture, Labor, Health and Human Services, Housing and Urban Development, Transportation, Education, and (as you note) Veterans Affairs. Approximately 2 million people and a $4T overall budget; then there are the Independent Agencies and Government Corporations like AMTRAK, General Services Administration, CIA etc. Pretty much anyone in the Fed that isn't Congress or the Courts.
THEN there are the people who have chosen to adopt and be ruled by the standards and guidelines voluntarily.
So yes, if that's what you mean by "us", yeah, no one ever said you did. Or did you take Paulger's advice to skip the "introductory waffle"? Shoot...you don't even have to have a password on your bank account if you don't want to...no, wait.
Given the way the gravity of Federal gov't security programs deforms the space around it, you'll need to swim fast to get out of the gravity well.
@Hjohn "more votes to be gained by advocating even more measures against business"
Yes except it doesn't happen.
Scanning back through this thread, it appears to be people's belief that businesses are expected to meet some sort of legal obligation. Beyond torts 'taint so.
Is there a misconception here that the NIST standards and guidelines are binding on business or private citizens? Beyond leakage due to contracts cited in my post to antibozo it don't.
They can collect, process, disseminate, sell information to anyone pretty much. Including Sheriffs and Feebies. Why else do aggregator sites exist? You can't be a peeping Tom and you can't intercept some types of communication legally.
But beyond that what you rent at a video store is information belonging to them. Make a sex tape and lose it in the break up with your SO and you're SOL.
The Privacy Act? Applies only to the government--federal at that. Did you think it protects you against Experian? Go re-read it.
Why do you think Europe has such heartburn about putting their citizens into the no-fly list? Because the US has flimsy privacy rules.
@ BF Skinner
To support your point: Lexis/Nexis; Total Information Awareness; Google's whole business model; Paris Hilton's sex tape. Case for determining privacy: closed. ;)
I noticed you didn't mention the names of anyone involved in writing the publication. Well played.
@JBB "...didn't mention the names of anyone involved in writing the publication."
Exactly why is this important?