Schneier on Security
A blog covering security and security technology.
June 7, 2005
Accuracy of Commercial Data Brokers
PrivacyActivism has released a study of ChoicePoint and Acxiom, two of the U.S.'s largest data brokers. The study looks at accuracy of information and responsiveness to requests for reports.
It doesn't look good.
From the press release:
100% of the eleven participants in the study discovered errors in background check reports provided by ChoicePoint. The majority of participants found errors in even the most basic biographical information: name, social security number, address and phone number (in 67% of Acxiom reports, 73% of ChoicePoint reports). Moreover, over 40% of participants did not receive their reports from Acxiom -- and the ones who did had to wait an average of three months from the time they requested their information until they
I spoke with Deborah Pierce, the Executive Director of PrivacyActivism. She made a couple of interesting points.
First, it was very difficult for them to find a legal way to do this study. There are no mechanisms for any kind of oversight of the industry. They had to find companies who were doing background checks on employees anyway, and who felt that participating in this study with PrivacyActivism was important. Then those companies asked their employees if they wanted to anonymously participate in the study.
Second, they were surprised at just how bad the data is. The most shocking error was that two people out of eleven were listed as corporate directors of companies that they had never heard of. This can't possibly be statistically meaningful, but it is certainly scary.
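To put a rough number on why two out of eleven is not statistically meaningful, here is a minimal sketch that computes a Wilson score interval for that proportion. The function name and the choice of a 95% interval (z = 1.96) are assumptions made for illustration; they are not part of the PrivacyActivism study.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to roughly 95% confidence (an assumed choice).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Two of the eleven participants were listed as directors of unknown companies.
low, high = wilson_interval(2, 11)
print(f"observed {2 / 11:.1%}, plausible range {low:.1%} to {high:.1%}")
```

With a sample this small, the interval runs from about 5% to almost 48%, which is consistent with the point above: the result is alarming, but it says little about the true error rate.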
Posted on June 7, 2005 at 7:45 AM
Identity theft was yesterday. Identity manipulation is tomorrow. Hacking into databases and slightly changing anything you can put your hands on, without immediate notice, transforming all these data collections into worthless random crap.
Ten million U.S. presidents? No problem. Your face on a "Wanted - Dead or Alive" poster? Yeah! Always wished for a 1 billion $$ credit line? Get the money, make your day.
"The most shocking error was that two people out of eleven were listed as corporate directors of companies that they had never heard of" -- This is easily explainable. There are millions of web sites where owners request personal data from you, for no reason. Almost always there is also a drop-down "profession" or something where the first choice is "CEO". All the databases that import these responses certainly contain a lot of George Bushes who are CEOs of all possible companies. Even if somebody cleans out the George Bushes, a lot of other names remain.
"they were surprised at just how bad the data is."
As AC mentions above, this is something I certainly do when filling out any web-based marketing form.
Unfortunately for data aggregation systems they do not check this data since that would be too expensive and just end up compiling it as a matter of "fact". This gets pulled into the big machine and then also gets treated as a matter of "fact". There isn't a cryptographic signing process on who is responsible for these "facts" but that would be nice. Of course then things may become too expensive for these data monoliths as they would be nailed to the wall with lawsuits from citizens tolerating their inaccuracies no more.
Above Secure mentions that ID manipulation is tomorrow... except ID manipulation has been around long before, only now to expose itself by technology.
"There are millions of web sites where owners request personal data from you, for no reason. Almost always there is also a drop-down 'profession' or something where the first choice is 'CEO.'"
Interesting point. Thank you.
Why should anyone really be surprised at how bad the data is? It's in the nature of a background check that there's minimal cross-checking with the person best able to verify/proofread, and the information also typically isn't going to be used for address/phone database purposes. And no liability. Hence, no real checking.
And, of course, as with credit reports, there's an incentive to report as much information as possible, whether it's factual or not.
The only good thing about this is that such blatant errors may make some of the reports more difficult to use for identity theft.
We all occasionally use variations of our name, even inadvertently. These get into databases, and acquire a life of their own. There is actually a motivation for keeping customer lists 'dirty'. A friend was once contracted to do a merge/purge on a large corporate database run by a utility company, so that if it contained duplicate names [example John Andrew Doe and John A. Doe] for the same address these were reduced to one. He stripped the database down by about 750,000 entries - but was then chastised by Investor Relations because his activities made the company appear to have suffered a dramatic shrinkage in its customer base that month. The merge/purge was reversed, Investor Relations were happy, and the company's customers continued to get multiple mailings addressed to minor variations of the same name/address.
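For readers unfamiliar with the term, a merge/purge of this kind can be sketched roughly as follows. This is only an illustration under simple assumptions: records are hypothetical (name, address) pairs, and two names count as the same person when the first name, middle initial, and last name match at the same normalized address. Real customer databases need far more careful matching.

```python
def name_key(full_name: str) -> tuple[str, str, str]:
    """Reduce a name to (first, middle initial, last), ignoring case and dots."""
    parts = [p.strip(".,").lower() for p in full_name.split()]
    parts = [p for p in parts if p]
    if not parts:
        return ("", "", "")
    first, last = parts[0], parts[-1]
    # Only treat the second token as a middle name if there are 3+ tokens.
    middle = parts[1][0] if len(parts) > 2 else ""
    return (first, middle, last)


def address_key(address: str) -> str:
    """Normalize case and whitespace in an address (an assumed, crude rule)."""
    return " ".join(address.lower().split())


def merge_purge(records: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep the first record seen for each (name key, address key) pair."""
    seen = set()
    kept = []
    for name, address in records:
        key = (name_key(name), address_key(address))
        if key not in seen:
            seen.add(key)
            kept.append((name, address))
    return kept


customers = [
    ("John Andrew Doe", "12 Elm St"),
    ("John A. Doe", "12 elm st"),
    ("Jane Doe", "12 Elm St"),
]
print(merge_purge(customers))
```

Note that this sketch would still treat "John Doe" and "John A. Doe" as different people, which hints at how easily name variations survive in such lists.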
We should also ask whether it's better that our personal information stored by third parties be accurate or unreliable. Totally random data is equivalent to no data, after all. By that token, if you believe that "no data about people held by databases" is preferable to "data about people held by databases" then it stands to reason that more inaccurate data is better than more accurate data.
Why have none of these companies ever been sued for libel?
I remember hearing of a man in Canada whose background check said that he was a member of a biker gang... before he was born. He spent years trying to have it fixed. I never understood why he didn't just sue for damages.
My own credit report doesn't even have my birthdate correct. That is odd, considering I'm unlikely ever to get my own birthdate wrong, my birthdate hasn't changed (I hope!), and I have a great credit history.
With such bad information, it never ceases to amaze me that they can sell this over and over. I guess people don't care as long as everyone else does it that way, which is how we got into the Windows security nightmares, etc. (Nobody gets fired for buying MSFT, IBM or Experian.)
The DnB database is no better. For one thing, if you buy 1,000 times with cash, you get no "credit." But miss a payment once, and you get labeled. If you have enough money to buy without borrowing, you are considered a poor credit risk until you prove you are too poor to buy with cash and instead need to borrow it and then repay it.
As far as I understand the US system actually encourages gathering and selling information about people. Maybe their reasoning is that the more information is traded the more frequently databases (such as ChoicePoint's) get updated and, in theory, the more accurate the information in them should be.
I don't think the 'CEO' drop-down box explains the data - that wouldn't associate a company name with the person.
Here in New Zealand, a law was recently passed which forces the ChoicePoints of this country to supply this information to you free of charge on demand. I did this - they had my name, date of birth, last 3 addresses, and all they had to say about my credit history is "we have nothing negative to report". There was also a list of half a dozen times my report had been accessed - nothing unexpected - mortgage, credit card applications etc.
This was all done by post. As I recall, I had to send a photocopy of my drivers license to 'prove' who I was.
I used to work for a subsidiary of Seisint, a smaller player in this arena, when they were Naviant. I have seen the raw data and participated in "clean up projects" for the master databases. You would be surprised at how poor this information quality is and also how disinterested management industry-wide is in acknowledging or dealing with this reality.
Filias Cupio: "I don't think the 'CEO' drop-down box explains the data - that wouldn't associate a company name with the person"
The next field on the same forms is "company name", of course. The forms I am talking about are on all those internet sites that require you to register in order to read some article. As I know that I certainly won't visit that site again soon, I put in somebody's name, CEO (minimal work to select), some company name, and select some land where I have never been. Oh, yes, where they ask for yearly income, I always select extremes.
I realize this is contributing to the problem, but I refuse to give out updated or corrected information when these companies call me. I think it is a good personal policy not to give out sensitive information over the phone, especially when the call is unsolicited.
When Mr. Schneier spoke at the Information Security Decisions conference in Chicago last month, he described the economic concept of negative externalities (http://www.google.com/search?q=define%3Anegative+externalities) as it applies to banking security. I think the same concept applies to the correctness of aggregated identity and financial data. The aggregator is not harmed if the collected information is inaccurate, so it has no incentive to improve its data's accuracy.
You don't seem to understand the externality concept.
For those in the transaction, bad data are a quality issue. If ChoicePoint customers are paying for high quality data and getting low quality data that is simply buyer beware.
Accurate data in this case have strong negative externalities. IF your SSN is correct there is ID theft. IF your phone number is correct there are now, and will be, more solicitation calls. You want Choice Point to have a correct SSN?
There is a negative cost to the subject of the data (PROB OF HARM*AMT OF HARM) that is not paid by Choice Point. By providing bogus information the subject of the data reduces the negative externality to the extent that the record does not correspond to the person.
Unlike traditional credit records, there is little that the subjects of Choice Point want from the data purchasers. Choice Point is for bulk low-margin identity-intensive business. The _existence_ of the business model of Choice Point may be to extract value by creating a negative externality. That might be an interesting argument. But data quality? Choice Point cannot deny you credit - CRA do that. In a job application I may be required to sign permission to allow a CRA check.
There is a detailed explanation of computer vulnerabilities as negative externalities here:
"... I put in somebody's name, CEO (minimal work to select), some company name, and select some land where I have never been. Oh, yes, where they ask for yearly income, I always select extremes."
Of course, if you feel (as I do) that degrading these databases is a valuable public service, you should make a little more effort. Apparently Acxiom and ChoicePoint have totally cavalier attitudes to cleaning their data, but likely the raw data exists forever so there is always the possibility someone else will take more trouble in future. Extreme incomes are the sort of thing that a semi-intelligent search algorithm might use as a clue to possibly unreliable records.
So I do one of three things:
a) Put down a fairly common name, with random but plausible data; or
b) Occasionally, I put the name of a real person and their real company, followed by either i) slightly flattering false data, if I have no opinion of the person, or ii) derogatory false data, if I got the name while complaining about bad service... ; or
c) Very occasionally, I do put in a ridiculous one, just to leave everyone in no doubt that the database is junk. Something like a 16 year old female company CEO from Saudi Arabia, with a name of "Donald Fauntleroy Duck".
It is certainly scary that nobody seems to be getting it right.
But as a privacy- and anonymity-aware citizen, I kind of like what I'm hearing here.
I would be much less comfortable with a service that could provide anyone with 100% accurate information on me than with a service that might provide information that discredits me (true or not).
At least in the latter case, I can point to these scary statistics as an excuse and/or explanation of the (mis)information.
Interesting point. Thank you.
Here's how DnB works: because their automated systems aren't able to establish any information on you (good or bad), they lower your credit worthiness score from time to time and then in their details (in small print) record that they have no negative information at all to support their lowering of your credit. In fact they have no new information at all to support the change.
So in reality, a non-information situation causes them to lower your credit score from a previous time period. That might be reasonable were it not for the directly related next action they take:
Then they hire some people to call you (probably on commission, operating under no rules as to what they should be saying) to fabricate scary-sounding situations to try to get you to call them, because (obviously) they don't have any new information to point to for your lowered score. But whatever ratio of actual new information to actual facts they have must be so low that, in an attempt to churn up some evidence that they are actually doing something for the service they supposedly provide, they have to fabricate changes and then use scare tactics to generate activity that covers up the lack of new information. The minute you call them back you are (of course) generating new information (the presence of your call back) for them to record.
They use scare tactics on the phone like:
DnB caller: "because of all the inquiries taking place on your report ..." and .... "because there were some low scores" on your report. The first item is likely completely untrue or at a minimum grossly exaggerated. The second item is 'true' in that it's true that they changed it anyway (lowered it) based upon no material change in any information they have.
So in reality what is going on here, given the reality of their ethics, or lack thereof, is:
DnB Caller: "please call us, so that we can cover up our unethical practices" ... "we've jacked with your scores because it's a good way to improve our ratios of informational business metrics manhood" ... "and now we want to scare you into calling us because" ... "together, you can help us cover our unethical tracks so no one questions our business practice" ... "and by the way, we won't be changing your score now that you've called us, but we thank you for helping us out".
What a scam DnB is. They should be ashamed.
Word is apparently getting out now on Dun & Bradstreet... I was never given any reason to think that they were anything but reputable until we got a call from them recently, trying to get us to sign up for some sort of credit monitoring service and the lady mentioned something about us paying some bills a day or two late.
At that point I realized they were a fraud because we ALWAYS pay our bills on time. We like to be paid on time, and believe everyone else deserves the same.
We were actually considering legal action until a few web searches turned up a wealth of similar stories on D&B scams. Since DnB is quickly making itself irrelevant with such tactics it's no longer a point of concern, much like the new messages spammers are peppering the landscape with, claiming that your credit score has recently changed.
Schneier.com is a personal website. Opinions expressed are not necessarily those of BT.