Database of 12 Million Apple UDIDs Leaked

In this story, we learn that hackers got their hands on a database of 12 million Apple Unique Device Identifiers (UDIDs) by hacking an FBI laptop.

My question isn’t about the hack, but about the data. Why does an FBI agent have user identification information about 12 million iPhone users on his laptop? And how did the FBI get their hands on this data in the first place?

For its part, the FBI denies everything:

In a statement released Tuesday afternoon, the FBI said, “The FBI is aware of published reports alleging that an FBI laptop was compromised and private data regarding Apple UDIDs was exposed. At this time there is no evidence indicating that an FBI laptop was compromised or that the FBI either sought or obtained this data.”

Apple also denies giving the database to the FBI.

Okay, so where did the database come from? And are there really 12 million, or only one million?

EDITED TO ADD (9/12): A company called BlueToad is the source of the leak.

If you’ve been hacked, you’re not going to be informed:

DeHart said his firm would not be contacting individual consumers to notify them that their information had been compromised, instead leaving it up to individual publishers to contact readers as they see fit.

Posted on September 6, 2012 at 6:48 AM • 41 Comments

Comments

bkd69 September 6, 2012 7:23 AM

My understanding is this:

A) the file is 1.5 million UDIDs and some ancillary device information, scrubbed of any personally identifying data, such as names, emails, phone numbers and addresses, out of the more complete 12 million entry file.

B) The ‘NCFTA’ that leads into the filename has led most to believe that the file went through a third party, namely, these guys:
http://www.ncfta.net/
Which makes Apple’s denials of being hacked eminently plausible, and nobody’s going to believe the FBI’s denials anyway.

C) Popular speculation lies mostly on the side of a leaky app (though in the modern age, apps are supposedly forbidden from accessing such info, at least if they’re being distributed through the app store), as opposed to any actual hackerage, but a more intriguing possibility I’ve seen mentioned is that the hacking story is a cover to protect a potential whistleblower that leaked the file.

DV Henkel-Wallace September 6, 2012 7:50 AM

According to Marco Arment, like bkd69, it most likely is NOT Apple’s data. Also there’s no reason to believe the FBI did or did not have it, so we didn’t learn that.

But the other interesting thing is that Apple is getting rid of these UDIDs for privacy reasons (they say, and I can’t see any other reason to do so). This announcement was greeted with much wailing and gnashing of teeth by many developers when announced, but it seems Apple is vindicated here as being the good guy.

Zombie John September 6, 2012 7:53 AM

That the FBI would deny having the data or being hacked – sure I believe that. I don’t believe the denial itself.

For its entire history, the FBI has created dossiers on “subversives” like Elvis and John Lennon. Why would we begin to question that they have all kinds of data on all kinds of people?

vwm September 6, 2012 8:19 AM

Anybody found his phone on the list, already? Somehow I am missing people complaining about that in the news-coverage.

After all, it is not exactly difficult to generate (or to pad) such a list with bogus information…

P. T. bailey September 6, 2012 8:34 AM

@vwm, there’s an app for that (well at least one web app) so that you can give your valid UDID to some stranger to check the list for you….

vasiliy pupkin September 6, 2012 9:03 AM

Q#1: Do we have in the USA any Federal Law/regulation that regulates ‘blacklisting’ of citizens/legal residents by any Federal Law Enforcement Agency?
Asking because there is no way to get yourself removed if you were wrongly placed on such a list, or to challenge it in court through due process.
Q#2: I understand the reasons for data sharing regarding criminal activity within the Law enforcement alphabet soup, but why is personal (not public) information provided to a Federal Agency shared with private companies, e.g. information provided to USPS when you change address, or information provided to the Department of State when you apply for a US passport?
What is the legal basis for that? Who is responsible/liable for leakage of such personal information?

Alex September 6, 2012 9:15 AM

Check here – http://pastehtml.com/udid

I checked some iDevice UDIDs there. One was on the list with correct device name recorded. Some 400 apps have been installed on that particular device at one time or another (used for evaluating apps) so finding a leaky culprit from that will be next to impossible.

0.25% chance of your device being on that released list – 1 million out of 400 million iDevices that appeared on the list.

John L September 6, 2012 9:33 AM

If I were anti-sec, had gotten this dump from a hacked app or insecure server, and wanted to hassle the FBI, I’d make up a fake story about a stolen laptop, too.

Lacking any plausible reason for the FBI to have this stuff in the first place, which is more likely, a secret FBI operation by an agent who loses an unencrypted laptop, or anti-sec messing with the press?

Azzaezel September 6, 2012 9:44 AM

It’s worth noting that antisec doesn’t have a history of lying. In fact have they ever willfully falsified a hack? Sometimes people have said stuff on their behalf who weren’t them, but they aren’t known to lie.

NobodySpecial September 6, 2012 9:49 AM

which is more likely, a secret FBI operation by an agent who loses an unencrypted laptop

(hand==up) please miss, I know the answer to this one ….

guest September 6, 2012 9:56 AM

Bruce, that’s not really surprising, and we know why… governments have much more than 12 million records. All enterprise size companies such as Google, Facebook, Apple, etc. share everything they can get from visitors through a very powerful interface with government, especially Facebook.

Steve September 6, 2012 10:25 AM

…all enterprise size companies…very powerful interface with government…

Absolutely. I’m sure that’s right. Oh, and please pass the tinfoil hats, mine has a small hole in it.

Glyndwr Michael September 6, 2012 10:26 AM

I hope this is a wake up call for the people who think that only people with something to hide are being tracked and watched by the government. We all are, all the time.

I want to see what they have in their Google/Android folder.

Clive Robinson September 6, 2012 10:36 AM

Perhaps people should ask “of what value is the UDID” and secondly why Apple say they are going to get rid of it for “privacy reasons”…

Now one obvious reason for the first is “tracking”, but the less obvious one is potentially “embedding”. Hands up those who remember MS using the Ethernet Card MAC number to embed in Office Documents? Also we know that many devices use a “serial number” or similar “unique ID” as an “entropy” item for generating certificates etc. And others such as RSA using them in a database to index secret “seed” information. So conceivably it’s possible it could act as an “unlock code” for encryption keys…

So there might be a lot more behind this than at first meets the eye…

Unless someone gets really down and dirty with the machine code on Apple iDevices OS’s then we are left in a slight state of paranoia the intensity of which is dependent on just what you have used your iDevice for…

bartender September 6, 2012 10:44 AM

According to some sources the list also contained users’ names and address details, so not just the UDIDs.

But in any case it could be that someone at FBI moonlights as a member of Anonymous/Antisec.

Scryptkeeper September 6, 2012 11:15 AM

A little disappointed here. Maybe the FBI had it, maybe they didn’t. There is no real evidence one way or the other. But just because they SAY that’s where they got it, you take it as gospel. Thought you were better than that.

Jay Bomber September 6, 2012 11:23 AM

What better way to steal from the public than to watch all data? Don’t talk about any business ideas over anything; the next thing you know, some other business is making it. It’s all about the money, not crime. They want to know what’s hot, what people are buying, and what’s going to be the next big thing. Data mining has been going on for years.

Jaybomber September 6, 2012 11:30 AM

They will be back at my house anytime now….left the net for 8 months because of their last visit. We do not forgive, we do not forget. First post back.

bartender September 6, 2012 11:33 AM

BTW apropos “enterprise size companies” sharing stuff with government…

Not sure about other governments (except Sweden, about which you can read here: http://www.redicecreations.com/article.php?id=4076) but in the USA the Fusion Centers use software made by IBM and others to access social networks and other online communication. The data retrieved from the social networks is used to create links between people, such as when you have a suspect and need to know whom he/she associates with.

This sort of system does not target those who already are “social deviants” somehow but the basic concept is that anyone who is a “known good” person today may not be that tomorrow.

The current system also includes software that is able to serialize voice into a text searchable format quite well, and this is used to pick up keywords from telephone conversations. This is all fairly user friendly so it is not like they are extremely cumbersome to use; after all, the intention is to create a system where mass death can be avoided.

It is just the result of an ongoing improvement of what has been done for years already.

Ping-Che Chen September 6, 2012 3:56 PM

Since Apple deprecated the API call for getting UDID, almost everyone started using MAC address (which is very easy to get in iOS) instead. For the general purpose of identifying a device (such as ‘identifying’ an in-App purchase, or similar usage for DRM) it’s quite good enough.

The problem is that many people used UDID for other purpose. There are reports that some services allow one to login with UDID alone (or at least being able to do some operations, such as sending a message). This makes a leak of UDID potentially dangerous.

Apple promised to provide a better way for device identification (I’m guessing something like App dependent ID, i.e. a unique ID which is different for each App, probably allowing the user to give separate permission for each App to get such information), but that will have to wait till iOS 6.
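
To make the "App dependent ID" guess above concrete, here is a minimal sketch of one way such an identifier could be derived. It is purely illustrative and says nothing about what Apple actually plans: the function name `per_app_id`, the device secret, and the app names are all hypothetical. The idea is that keying a hash with a per-device secret and the app's name gives each app a stable ID that cannot be matched against another app's ID.

```python
import hashlib
import hmac


def per_app_id(device_secret: bytes, app_id: str) -> str:
    """Derive an identifier that is stable per app but differs across apps.

    Hypothetical scheme: HMAC-SHA256 keyed with a secret that stays on the device.
    """
    digest = hmac.new(device_secret, app_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical values for illustration only.
secret = b"device-secret-that-never-leaves-the-device"
id_news = per_app_id(secret, "com.example.news")
id_game = per_app_id(secret, "com.example.game")

# Two apps on the same device get unrelated IDs; one app always gets the same ID.
print(id_news != id_game)
print(id_news == per_app_id(secret, "com.example.news"))
```

With a scheme like this, a leaked list from one app would not let anyone log in to, or correlate users with, a different app, which is the danger described above for a shared UDID.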

Gweihir September 6, 2012 6:12 PM

@Michael Hampton: Nice. I was not aware this has gotten a real definition by now. Of course the best lies are by misdirection with completely truthful statements.

Many, many people (and most journalists, I think) do not have the mental equipment to realize that checking details sometimes does not cut it.

not square September 6, 2012 10:26 PM

leaks happen so the gubmint gets their goods without any blackops, sneak n peek, pretexting walking through the building(s), and more. thankfully i dont use any shit device like are being pushed to the masses. ID my Linux box, bitch!

Duke September 6, 2012 10:46 PM

Perhaps the ncfta (ngo?) got the udids through some commercial agreement with apple and then sold or gave them to the FBI.

Structuring acquisition in that manner may have significant 4th amendment benefits for the FBI. It may lend support to the perceived constitutionality of subsequent investigations/prosecutions based in part on udids or evidence gathered by LEO exploits of udids.

Bob September 7, 2012 8:37 AM

I’ve downloaded the list and found 2 of my sister’s idevices to have correct UDIDs listed.

To everyone saying it was a ‘stolen’ laptop, I think that it was not stolen; antisec claimed to have pulled a Java AtomicReferenceArray attack on the laptop, presumably remotely, because why would one need to do this if they’d stolen the laptop?

This raises an interesting question in my mind; How the hell does an FBI laptop get compromised with something so weak and amateur as an AtomicReferenceArray vulnerability? Even ‘Avast!’, the free antivirus, protects against this attack! WTF, FBI? I don’t actually believe that it could have been FBI for this very reason. At least, the laptop couldn’t have belonged to the FBI… Maybe it was an FBI employee’s private laptop?

bob September 7, 2012 8:44 AM

Why does this kind of stuff need to be on a laptop in the first place [not just FBI but ANYBODY – govt agencies, banks, Radio Shack]?

What, he was on extended sick leave and took some work home to catch up? Gonna edit 12,000,000 records that night and bring em in finished the next day?

How about check out 500 on a TrueCrypt drive, take them home and work, then return next day and reintegrate into the master; keep the master database in a locked room at the office on a big bulky hollywood-style tape drive that won’t fit in a full-size van.

They should treat this stuff like they do firearms [except for the “leave it in a public restroom ” part.]

Mute September 7, 2012 8:55 AM

Apple UDID’s are fairly harmless (in the scheme of data available and privacy protection) and have been available before via apps, one purpose being for commercial reasons.

What this more likely is, is a Fed. honeypot.

What better means of pushing for regulation or change than creating a reason, a quantifiable reason easily understood by non-technical parties?

What better avenue to create a means than an “anonymous” group where no one is held accountable? We’re using an idea, a popular idea, to create a situation that we can exploit for change. This isn’t some evil master scheme where the NSA is tracking every little packet that’s about you.

(I’d say it’s still undetermined whether or not Antisec/Anonymous is actually the hand of the Govt (in this or other case(s)) or an actual cell that stumbled upon a honeypot)

Believe it or not, the Feds aren’t dumb especially in terms of cyber policy. How was the laptop infected? Through the FBI’s internal network? I doubt it. Anonymous isn’t exactly on par with a nation-state. And a DoJ/DHS/NSA etc, network (GIG or otherwise) isn’t something to be trifled with.

Why would the laptop with sensitive data be located outside of a secure network? This isn’t the private sector. These people know what to do and what not to do and the consequences of a flagrant break in policy.


The largest challenge to cyber law is explaining, to the dinosaurs voting on policy, the importance and need for new law and regulation.

Events like this are perfect.

Apple + Government Negligence + Privacy Concerns = A great story that’s publicized everywhere with little actual damaging footprint.

Remember, regulation has to be justified by, first, being quantified.

Quantification has been the largest obstacle in Cyber.

There needs to be a variable for the impact of a cyber event in terms of cost and impact on the nation. How do you put a $ to the loss of Intellectual Property? It would be astounding.

Also, a government agency that is doing its job without a problem does not get more funding.

ChoppedBroccoli September 7, 2012 12:12 PM

If there was some way to get the installed app set for every affected UDID and the intersection of all these sets was taken, the list of possible rogue apps could be pared down significantly. This only works if the rogue app is a 3rd party app and not installed by default of course.
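
The intersection idea can be sketched in a few lines. This is only an illustration: the function name `candidate_apps`, the device labels, and the app names are made up, and it assumes someone could actually collect an installed-app list for each affected device.

```python
def candidate_apps(installed_by_device):
    """Return the apps that are present on every affected device."""
    app_sets = [set(apps) for apps in installed_by_device.values()]
    if not app_sets:
        return set()
    return set.intersection(*app_sets)


# Hypothetical data: affected device -> apps installed on it.
installed = {
    "device-a": {"NewsReader", "PhotoFun", "ChessPro"},
    "device-b": {"NewsReader", "ChessPro", "WeatherNow"},
    "device-c": {"NewsReader", "MapTool"},
}

suspects = candidate_apps(installed)
print(sorted(suspects))
```

If more than one app fed the same database, the strict intersection could come out empty; ranking apps by how many affected devices they appear on would be a looser alternative.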

Gordon September 7, 2012 2:29 PM

bob,

There is every indication that somebody did treat this data with the same level of care and respect as firearms generally get in the US.

Roger September 8, 2012 10:45 AM

@Alasdair:
Interesting analysis. I made a few observations myself, which you might find interesting:

  1. If this data is supposed to be anonymised, it has been badly botched. There are hundreds of complete names in there, some partial addresses, and thousands of complete email addresses.
  2. This recognisable data is mainly from the USA, but from all parts. There are also some possibly Spanish and Italian addresses.
  3. As noted, there are 1,000,001 rows, and no header row. We have to guess the meaning of the fields.
  4. Most rows are formed of 4 fields, in which the first consists of 40 hexadecimal characters surrounded by single quotes; the second consists of 64 hexadecimal characters surrounded by single quotes; the third is a free-form field that appears to be what users type in as a welcome text (ranging from default values to addresses where to return it if found); and the 4th is a device type (iPhone, iPad or iPod touch.)
  5. The file is not well-formed as a CSV file. The most serious issue is that 1139 lines have only 3 fields when most have 4. The malformed lines are all ones that had a comma embedded in the free-form string. For these rows, what would have otherwise been the 4th field is missing. So, Antisec is not so good at parsing CSV. (Don’t worry, Antisec: it’s harder than it looks.)
  6. 30 lines are formed quite differently to all the rest. In one case, all the fields just contain the string ‘disabled’. In the other 29, the fields are rather differently structured. Instead of the first field being the 40 hex chars, it starts with the 64 hex char field. The device type is 3rd, and the 4th field becomes what looks like a version number. Most interesting, though, is the 2nd field.
  7. In those 29 odd records, the 2nd field is a short string always starting with the numeral 2. In most cases they are exactly 5 characters long; one is 6 and one is 11. They look a lot like decrypted passwords. There is significant repetition. (This would be consistent with hits from a rainbow table.)
  8. There are many repeated values. For the first field (the 160 bit one) there are over 13,000 values that appear more than once, hundreds that appear 3 times, and even one that appears 53 times. A similar (but not identical) pattern occurs for the second (256 bit) field. The third field is not surprisingly more diverse, but still has immense repetition, including a surprisingly large number of values that appear hundreds of times. For example, PdaTX.NET occurs 1,141 times.
  9. Searching for the string “student” is interesting. It seems that quite a few of these devices are shared machines on campuses — including religious colleges.
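
For anyone wanting to repeat the field-count and repetition checks in points 5 and 8, here is a rough sketch. It is not the procedure used above, and the row layout is an assumption taken from the description in point 4: comma-separated fields wrapped in single quotes, with a 40-hex-character first field. The function name `profile_rows` and the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter


def profile_rows(text):
    """Tally how many fields each row has and which first-field values repeat."""
    # Assumed layout: comma-separated fields wrapped in single quotes.
    reader = csv.reader(io.StringIO(text), quotechar="'")
    field_counts = Counter()
    first_field = Counter()
    for row in reader:
        if not row:
            continue  # skip blank lines
        field_counts[len(row)] += 1
        first_field[row[0]] += 1
    repeated = {value: n for value, n in first_field.items() if n > 1}
    return field_counts, repeated


# Made-up sample rows in the assumed layout.
UDID_A = "a" * 40
UDID_B = "d" * 40
TOKEN = "b" * 64
sample = "\n".join([
    f"'{UDID_A}','{TOKEN}','Alice iPhone','iPhone'",
    f"'{UDID_A}','{TOKEN}','Lab, Room 2','iPad'",
    f"'{UDID_B}','{TOKEN}','iPod touch'",
])

field_counts, repeated = profile_rows(sample)
print(dict(field_counts))
print(repeated)
```

Because the parser is told about the quote character, a comma inside a quoted free-form field stays inside that field instead of splitting the row. A device name containing an apostrophe would likely still confuse this quoting rule, so real data would need more care.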

Roger September 9, 2012 2:28 AM

I just ran the infamous fully-RFC 2822-compliant regex over it, and counted 582 valid email addresses. Of these, 434 are .com domains, and 48 are other TLDs (including 2 from .gov — one from DOJ!) Of course the rest are country codes. The top 10 .com subdomains are:

  1. gmail 118
  2. hotmail 94
  3. yahoo 75
  4. aol 21
  5. me 19
  6. msn 9
  7. live 7
  8. mac 6
  9. 163 6
  10. naver 4
  11. and 61 others.

It seems that 163.com is a popular Chinese portal, and naver.com is the most popular portal in Korea. Which brings us to the top 12 country codes:

  1. ru 15
  2. it 8
  3. ca 7
  4. cn 7
  5. jp 5
  6. nl 4
  7. au 4
  8. de 4
  9. hk 4
  10. hu 3
  11. us 3
  12. br 3
  13. and 26 others.

Pretty global, then. By the way, two thirds of the Russian ones come from mail.ru, so I guess that must be a pretty sweet property to own (even though the Russian word for e-mail is “мейл” which is only similar.)

Most of the mailbox parts of these addresses are personal names; a few are business names.
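
A tally along these lines can be sketched as follows. Note the assumptions: the pattern below is a deliberately simplified email matcher, not the fully RFC 2822-compliant regex mentioned above, so its counts would not necessarily match; it also lumps every non-.com ending together rather than separating country codes from other TLDs. The name `tally_domains` and the sample addresses are invented.

```python
import re
from collections import Counter

# Simplified pattern (an assumption, not the RFC 2822 regex): local part, "@",
# then a domain containing at least one dot. Group 1 captures the domain.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)")


def tally_domains(text):
    """Count .com addresses by the name before .com, and all others by final label."""
    com_names = Counter()
    other_endings = Counter()
    for match in EMAIL_RE.finditer(text):
        labels = match.group(1).lower().split(".")
        if labels[-1] == "com":
            com_names[labels[-2]] += 1
        else:
            other_endings[labels[-1]] += 1
    return com_names, other_endings


# Made-up sample text.
sample_text = "alice@gmail.com, bob@mail.ru; carol@GMAIL.com dave@example.co.jp"
com_names, other_endings = tally_domains(sample_text)
print(com_names.most_common())
print(other_endings.most_common())
```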

BlueToad_failure September 12, 2012 4:44 AM

@Ping-Che Chen: BlueToad was forced to publish their flaw by David Schuetz, who was otherwise going to publish his findings.

“The type of information I [David Schuetz] was able to access would have been very valuable to scammers and identity thieves, for instance. With mischievous entities like Antisec and Anonymous about, you can even envision a massive public dump of users’ private information, just for the hell of it. We just don’t know what the full impact might be.”
