Entries Tagged "identification"

Page 1 of 24

Identifying People by Their Browsing Histories

Interesting paper: “Replication: Why We Still Can’t Browse in Peace: On the Uniqueness and Reidentifiability of Web Browsing Histories”:

We examine the threat to individuals’ privacy based on the feasibility of reidentifying users through distinctive profiles of their browsing history visible to websites and third parties. This work replicates and extends the 2012 paper Why Johnny Can’t Browse in Peace: On the Uniqueness of Web Browsing History Patterns[48]. The original work demonstrated that browsing profiles are highly distinctive and stable. We reproduce those results and extend the original work to detail the privacy risk posed by the aggregation of browsing histories. Our dataset consists of two weeks of browsing data from ~52,000 Firefox users. Our work replicates the original paper’s core findings by identifying 48,919 distinct browsing profiles, of which 99% are unique. High uniqueness hold seven when histories are truncated to just 100 top sites. We then find that for users who visited 50 or more distinct domains in the two-week data collection period, ~50% can be reidentified using the top 10k sites. Reidentifiability rose to over 80% for users that browsed 150 or more distinct domains. Finally, we observe numerous third parties pervasive enough to gather web histories sufficient to leverage browsing history as an identifier.

One of the authors of the original study comments on the replication.

Posted on August 25, 2020 at 6:28 AMView Comments

Yet Another Biometric: Bioacoustic Signatures

Sound waves through the body are unique enough to be a biometric:

“Modeling allowed us to infer what structures or material features of the human body actually differentiated people,” explains Joo Yong Sim, one of the ETRI researchers who conducted the study. “For example, we could see how the structure, size, and weight of the bones, as well as the stiffness of the joints, affect the bioacoustics spectrum.”

[…]

Notably, the researchers were concerned that the accuracy of this approach could diminish with time, since the human body constantly changes its cells, matrices, and fluid content. To account for this, they acquired the acoustic data of participants at three separate intervals, each 30 days apart.

“We were very surprised that people’s bioacoustics spectral pattern maintained well over time, despite the concern that the pattern would change greatly,” says Sim. “These results suggest that the bioacoustics signature reflects more anatomical features than changes in water, body temperature, or biomolecule concentration in blood that change from day to day.”

It’s not great. A 97% accuracy is worse than fingerprints and iris scans, and while they were able to reproduce the biometric in a month it almost certainly changes as we age, gain and lose weight, and so on. Still, interesting.

Posted on August 21, 2020 at 6:03 AMView Comments

Identifying a Person Based on a Photo, LinkedIn and Etsy Profiles, and Other Internet Bread Crumbs

Interesting story of how the police can identify someone by following the evidence chain from website to website.

According to filings in Blumenthal’s case, FBI agents had little more to go on when they started their investigation than the news helicopter footage of the woman setting the police car ablaze as it was broadcast live May 30.

It showed the woman, in flame-retardant gloves, grabbing a burning piece of a police barricade that had already been used to set one squad car on fire and tossing it into the police SUV parked nearby. Within seconds, that car was also engulfed in flames.

Investigators discovered other images depicting the same scene on Instagram and the video sharing website Vimeo. Those allowed agents to zoom in and identify a stylized tattoo of a peace sign on the woman’s right forearm.

Scouring other images ­– including a cache of roughly 500 photos of the Philly protest shared by an amateur photographer ­– agents found shots of a woman with the same tattoo that gave a clear depiction of the slogan on her T-shirt.

[…]

That shirt, agents said, was found to have been sold only in one location: a shop on Etsy, the online marketplace for crafters, purveyors of custom-made clothing and jewelry, and other collectibles….

The top review on her page, dated just six days before the protest, was from a user identifying herself as “Xx Mv,” who listed her location as Philadelphia and her username as “alleycatlore.”

A Google search of that handle led agents to an account on Poshmark, the mobile fashion marketplace, with a user handle “lore-elisabeth.” And subsequent searches for that name turned up Blumenthal’s LinkedIn profile, where she identifies herself as a graduate of William Penn Charter School and several yoga and massage therapy training centers.

From there, they located Blumenthal’s Jenkintown massage studio and its website, which featured videos demonstrating her at work. On her forearm, agents discovered, was the same distinctive tattoo that investigators first identified on the arsonist in the original TV video.

The obvious moral isn’t a new one: don’t have a distinctive tattoo. But more interesting is how different pieces of evidence can be strung together in order to identify someone. This particular chain was put together manually, but expect machine learning techniques to be able to do this sort of thing automatically — and for organizations like the NSA to implement them on a broad scale.

Another article did a more detailed analysis, and concludes that the Etsy review was the linchpin.

Note to commenters: political commentary on the protesters or protests will be deleted. There are many other forums on the Internet to discuss that.

Posted on June 22, 2020 at 7:35 AMView Comments

Used Tesla Components Contain Personal Information

Used Tesla components, sold on eBay, still contain personal information, even after a factory reset.

This is a decades-old problem. It’s a problem with used hard drives. It’s a problem with used photocopiers and printers. It will be a problem with IoT devices. It’ll be a problem with everything, until we decide that data deletion is a priority.

EDITED TO ADD (6/20): These computes were not factory reset. Apparently, he data was intentionally left on the computer so that the technicians could transfer it when upgrading the computer. It’s still bad, but a factory reset does work.

Posted on May 8, 2020 at 9:46 AMView Comments

Me on COVID-19 Contact Tracing Apps

I was quoted in BuzzFeed:

“My problem with contact tracing apps is that they have absolutely no value,” Bruce Schneier, a privacy expert and fellow at the Berkman Klein Center for Internet & Society at Harvard University, told BuzzFeed News. “I’m not even talking about the privacy concerns, I mean the efficacy. Does anybody think this will do something useful? … This is just something governments want to do for the hell of it. To me, it’s just techies doing techie things because they don’t know what else to do.”

I haven’t blogged about this because I thought it was obvious. But from the tweets and emails I have received, it seems not.

This is a classic identification problem, and efficacy depends on two things: false positives and false negatives.

  • False positives: Any app will have a precise definition of a contact: let’s say it’s less than six feet for more than ten minutes. The false positive rate is the percentage of contacts that don’t result in transmissions. This will be because of several reasons. One, the app’s location and proximity systems — based on GPS and Bluetooth — just aren’t accurate enough to capture every contact. Two, the app won’t be aware of any extenuating circumstances, like walls or partitions. And three, not every contact results in transmission; the disease has some transmission rate that’s less than 100% (and I don’t know what that is).
  • False negatives: This is the rate the app fails to register a contact when an infection occurs. This also will be because of several reasons. One, errors in the app’s location and proximity systems. Two, transmissions that occur from people who don’t have the app (even Singapore didn’t get above a 20% adoption rate for the app). And three, not every transmission is a result of that precisely defined contact — the virus sometimes travels further.

Assume you take the app out grocery shopping with you and it subsequently alerts you of a contact. What should you do? It’s not accurate enough for you to quarantine yourself for two weeks. And without ubiquitous, cheap, fast, and accurate testing, you can’t confirm the app’s diagnosis. So the alert is useless.

Similarly, assume you take the app out grocery shopping and it doesn’t alert you of any contact. Are you in the clear? No, you’re not. You actually have no idea if you’ve been infected.

The end result is an app that doesn’t work. People will post their bad experiences on social media, and people will read those posts and realize that the app is not to be trusted. That loss of trust is even worse than having no app at all.

It has nothing to do with privacy concerns. The idea that contact tracing can be done with an app, and not human health professionals, is just plain dumb.

EDITED TO ADD: This Brookings essay makes much the same point.

EDITED TO ADD: This post has been translated into Spanish.

Posted on May 1, 2020 at 6:22 AMView Comments

Modern Mass Surveillance: Identify, Correlate, Discriminate

Communities across the United States are starting to ban facial recognition technologies. In May of last year, San Francisco banned facial recognition; the neighboring city of Oakland soon followed, as did Somerville and Brookline in Massachusetts (a statewide ban may follow). In December, San Diego suspended a facial recognition program in advance of a new statewide law, which declared it illegal, coming into effect. Forty major music festivals pledged not to use the technology, and activists are calling for a nationwide ban. Many Democratic presidential candidates support at least a partial ban on the technology.

These efforts are well-intentioned, but facial recognition bans are the wrong way to fight against modern surveillance. Focusing on one particular identification method misconstrues the nature of the surveillance society we’re in the process of building. Ubiquitous mass surveillance is increasingly the norm. In countries like China, a surveillance infrastructure is being built by the government for social control. In countries like the United States, it’s being built by corporations in order to influence our buying behavior, and is incidentally used by the government.

In all cases, modern mass surveillance has three broad components: identification, correlation and discrimination. Let’s take them in turn.

Facial recognition is a technology that can be used to identify people without their knowledge or consent. It relies on the prevalence of cameras, which are becoming both more powerful and smaller, and machine learning technologies that can match the output of these cameras with images from a database of existing photos.

But that’s just one identification technology among many. People can be identified at a distance by their heartbeat or by their gait, using a laser-based system. Cameras are so good that they can read fingerprints and iris patterns from meters away. And even without any of these technologies, we can always be identified because our smartphones broadcast unique numbers called MAC addresses. Other things identify us as well: our phone numbers, our credit card numbers, the license plates on our cars. China, for example, uses multiple identification technologies to support its surveillance state.

Once we are identified, the data about who we are and what we are doing can be correlated with other data collected at other times. This might be movement data, which can be used to “follow” us as we move throughout our day. It can be purchasing data, Internet browsing data, or data about who we talk to via email or text. It might be data about our income, ethnicity, lifestyle, profession and interests. There is an entire industry of data brokers who make a living analyzing and augmenting data about who we are ­– using surveillance data collected by all sorts of companies and then sold without our knowledge or consent.

There is a huge ­– and almost entirely unregulated ­– data broker industry in the United States that trades on our information. This is how large Internet companies like Google and Facebook make their money. It’s not just that they know who we are, it’s that they correlate what they know about us to create profiles about who we are and what our interests are. This is why many companies buy license plate data from states. It’s also why companies like Google are buying health records, and part of the reason Google bought the company Fitbit, along with all of its data.

The whole purpose of this process is for companies –­ and governments ­– to treat individuals differently. We are shown different ads on the Internet and receive different offers for credit cards. Smart billboards display different advertisements based on who we are. In the future, we might be treated differently when we walk into a store, just as we currently are when we visit websites.

The point is that it doesn’t matter which technology is used to identify people. That there currently is no comprehensive database of heartbeats or gaits doesn’t make the technologies that gather them any less effective. And most of the time, it doesn’t matter if identification isn’t tied to a real name. What’s important is that we can be consistently identified over time. We might be completely anonymous in a system that uses unique cookies to track us as we browse the Internet, but the same process of correlation and discrimination still occurs. It’s the same with faces; we can be tracked as we move around a store or shopping mall, even if that tracking isn’t tied to a specific name. And that anonymity is fragile: If we ever order something online with a credit card, or purchase something with a credit card in a store, then suddenly our real names are attached to what was anonymous tracking information.

Regulating this system means addressing all three steps of the process. A ban on facial recognition won’t make any difference if, in response, surveillance systems switch to identifying people by smartphone MAC addresses. The problem is that we are being identified without our knowledge or consent, and society needs rules about when that is permissible.

Similarly, we need rules about how our data can be combined with other data, and then bought and sold without our knowledge or consent. The data broker industry is almost entirely unregulated; there’s only one law ­– passed in Vermont in 2018 ­– that requires data brokers to register and explain in broad terms what kind of data they collect. The large Internet surveillance companies like Facebook and Google collect dossiers on us are more detailed than those of any police state of the previous century. Reasonable laws would prevent the worst of their abuses.

Finally, we need better rules about when and how it is permissible for companies to discriminate. Discrimination based on protected characteristics like race and gender is already illegal, but those rules are ineffectual against the current technologies of surveillance and control. When people can be identified and their data correlated at a speed and scale previously unseen, we need new rules.

Today, facial recognition technologies are receiving the brunt of the tech backlash, but focusing on them misses the point. We need to have a serious conversation about all the technologies of identification, correlation and discrimination, and decide how much we as a society want to be spied on by governments and corporations — and what sorts of influence we want them to have over our lives.

This essay previously appeared in the New York Times.

EDITED TO ADD: Rereading this post-publication, I see that it comes off as overly critical of those who are doing activism in this space. Writing the piece, I wasn’t thinking about political tactics. I was thinking about the technologies that support surveillance capitalism, and law enforcement’s usage of that corporate platform. Of course it makes sense to focus on face recognition in the short term. It’s something that’s easy to explain, viscerally creepy, and obviously actionable. It also makes sense to focus specifically on law enforcement’s use of the technology; there are clear civil and constitutional rights issues. The fact that law enforcement is so deeply involved in the technology’s marketing feels wrong. And the technology is currently being deployed in Hong Kong against political protesters. It’s why the issue has momentum, and why we’ve gotten the small wins we’ve had. (The EU is considering a five-year ban on face recognition technologies.) Those wins build momentum, which lead to more wins. I should have been kinder to those in the trenches.

If you want to help, sign the petition from Public Voice calling on a moratorium on facial recognition technology for mass surveillance. Or write to your US congressperson and demand similar action. There’s more information from EFF and EPIC.

EDITED TO ADD (3/16): This essay has been translated into Spanish.

Posted on January 27, 2020 at 12:21 PMView Comments

Bypassing Apple FaceID's Liveness Detection Feature

Apple’s FaceID has a liveness detection feature, which prevents someone from unlocking a victim’s phone by putting it in front of his face while he’s sleeping. That feature has been hacked:

Researchers on Wednesday during Black Hat USA 2019 demonstrated an attack that allowed them to bypass a victim’s FaceID and log into their phone simply by putting a pair of modified glasses on their face. By merely placing tape carefully over the lenses of a pair glasses and placing them on the victim’s face the researchers demonstrated how they could bypass Apple’s FaceID in a specific scenario. The attack itself is difficult, given the bad actor would need to figure out how to put the glasses on an unconscious victim without waking them up.

Posted on August 15, 2019 at 6:19 AMView Comments

Cardiac Biometric

MIT Technology Review is reporting about an infrared laser device that can identify people by their unique cardiac signature at a distance:

A new device, developed for the Pentagon after US Special Forces requested it, can identify people without seeing their face: instead it detects their unique cardiac signature with an infrared laser. While it works at 200 meters (219 yards), longer distances could be possible with a better laser. “I don’t want to say you could do it from space,” says Steward Remaly, of the Pentagon’s Combatting Terrorism Technical Support Office, “but longer ranges should be possible.”

Contact infrared sensors are often used to automatically record a patient’s pulse. They work by detecting the changes in reflection of infrared light caused by blood flow. By contrast, the new device, called Jetson, uses a technique known as laser vibrometry to detect the surface movement caused by the heartbeat. This works though typical clothing like a shirt and a jacket (though not thicker clothing such as a winter coat).

[…]

Remaly’s team then developed algorithms capable of extracting a cardiac signature from the laser signals. He claims that Jetson can achieve over 95% accuracy under good conditions, and this might be further improved. In practice, it’s likely that Jetson would be used alongside facial recognition or other identification methods.

Wenyao Xu of the State University of New York at Buffalo has also developed a remote cardiac sensor, although it works only up to 20 meters away and uses radar. He believes the cardiac approach is far more robust than facial recognition. “Compared with face, cardiac biometrics are more stable and can reach more than 98% accuracy,” he says.

I have my usual questions about false positives vs false negatives, how stable the biometric is over time, and whether it works better or worse against particular sub-populations. But interesting nonetheless.

Posted on July 8, 2019 at 12:38 PMView Comments

Fingerprinting iPhones

This clever attack allows someone to uniquely identify a phone when you visit a website, based on data from the accelerometer, gyroscope, and magnetometer sensors.

We have developed a new type of fingerprinting attack, the calibration fingerprinting attack. Our attack uses data gathered from the accelerometer, gyroscope and magnetometer sensors found in smartphones to construct a globally unique fingerprint. Overall, our attack has the following advantages:

  • The attack can be launched by any website you visit or any app you use on a vulnerable device without requiring any explicit confirmation or consent from you.
  • The attack takes less than one second to generate a fingerprint.
  • The attack can generate a globally unique fingerprint for iOS devices.
  • The calibration fingerprint never changes, even after a factory reset.
  • The attack provides an effective means to track you as you browse across the web and move between apps on your phone.

* Following our disclosure, Apple has patched this vulnerability in iOS 12.2.

Research paper.

Posted on May 22, 2019 at 6:24 AMView Comments

1 2 3 24

Sidebar photo of Bruce Schneier by Joe MacInnis.