Entries Tagged "identification"


De-Anonymizing Browser History Using Social-Network Data

Interesting research: “De-anonymizing Web Browsing Data with Social Networks”:

Abstract: Can online trackers and network adversaries de-anonymize web browsing data readily available to them? We show—theoretically, via simulation, and through experiments on real user data—that de-identified web browsing histories can be linked to social media profiles using only publicly available data. Our approach is based on a simple observation: each person has a distinctive social network, and thus the set of links appearing in one’s feed is unique. Assuming users visit links in their feed with higher probability than a random user, browsing histories contain tell-tale marks of identity. We formalize this intuition by specifying a model of web browsing behavior and then deriving the maximum likelihood estimate of a user’s social profile. We evaluate this strategy on simulated browsing histories, and show that given a history with 30 links originating from Twitter, we can deduce the corresponding Twitter profile more than 50% of the time. To gauge the real-world effectiveness of this approach, we recruited nearly 400 people to donate their web browsing histories, and we were able to correctly identify more than 70% of them. We further show that several online trackers are embedded on sufficiently many websites to carry out this attack with high accuracy. Our theoretical contribution applies to any type of transactional data and is robust to noisy observations, generalizing a wide range of previous de-anonymization attacks. Finally, since our attack attempts to find the correct Twitter profile out of over 300 million candidates, it is—to our knowledge—the largest scale demonstrated de-anonymization to date.
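The core of the attack is the maximum likelihood step described above: score each candidate profile by how well its feed explains the observed browsing history, assuming feed links are visited more often than random ones. Here is a minimal sketch of that scoring; the probabilities and feeds are made up for illustration and are not taken from the paper:

```python
import math

# Illustrative per-link visit probabilities (assumptions, not the paper's values).
P_IN_FEED = 0.01       # chance a user visits a given link that appeared in their feed
P_BACKGROUND = 0.0001  # chance a random user visits that link anyway

def log_likelihood(history, feed):
    """Score one candidate profile: sum of per-link log-probabilities."""
    return sum(
        math.log(P_IN_FEED if link in feed else P_BACKGROUND)
        for link in history
    )

def best_candidate(history, candidate_feeds):
    """Return the profile whose feed best explains the browsing history."""
    return max(candidate_feeds,
               key=lambda name: log_likelihood(history, candidate_feeds[name]))

# Toy example: two candidate profiles, each with a distinctive feed.
feeds = {
    "alice": {"a.com", "b.com", "c.com"},
    "bob":   {"x.com", "y.com", "c.com"},
}
print(best_candidate({"a.com", "b.com"}, feeds))  # alice
```

The real attack runs this comparison against hundreds of millions of Twitter profiles, which is why the uniqueness of each person’s feed matters so much.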

Posted on February 10, 2017 at 8:25 AM

Heartbeat as Biometric Password

There’s research into using a heartbeat as a biometric password. No details in the article. My guess is that there isn’t nearly enough entropy in the reproducible biometric, but I might be surprised. The article’s suggestion to use it as a password for health records seems especially problematic. “I’m sorry, but we can’t access the patient’s health records because he’s having a heart attack.”
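The entropy worry is easy to make concrete with a back-of-the-envelope calculation. Suppose the *reproducible* part of an ECG signal boils down to a handful of features, each reliably distinguishable into only a few levels across the population (the numbers below are assumptions for illustration, not measurements):

```python
import math

def biometric_entropy_bits(n_features, values_per_feature):
    """Upper bound on entropy if each feature independently takes one of
    values_per_feature reliably distinguishable values."""
    return n_features * math.log2(values_per_feature)

# e.g. 8 stable ECG features, 4 reliably distinguishable levels each:
print(biometric_entropy_bits(8, 4))  # 16.0 bits
```

Sixteen bits is less than a three-character random password, and that's before accounting for correlations between features or measurement noise.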

I wrote about this before here.

Posted on January 19, 2017 at 6:22 AM

Firefox Removing Battery Status API

Firefox is removing the battery status API, citing privacy concerns. Here’s the paper that described those concerns:

Abstract. We highlight privacy risks associated with the HTML5 Battery Status API. We put special focus on its implementation in the Firefox browser. Our study shows that websites can discover the capacity of users’ batteries by exploiting the high precision readouts provided by Firefox on Linux. The capacity of the battery, as well as its level, expose a fingerprintable surface that can be used to track web users in short time intervals. Our analysis shows that the risk is much higher for old or used batteries with reduced capacities, as the battery capacity may potentially serve as a tracking identifier. The fingerprintable surface of the API could be drastically reduced without any loss in the API’s functionality by reducing the precision of the readings. We propose minor modifications to Battery Status API and its implementation in the Firefox browser to address the privacy issues presented in the study. Our bug report for Firefox was accepted and a fix is deployed.
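The mitigation the paper proposes is simple: coarsen the readouts so nearby values collapse together and stop forming a stable identifier. A sketch of that idea (the 1% granularity below is illustrative, not necessarily what Firefox shipped):

```python
def coarsen_battery_level(level, step=0.01):
    """Quantize a [0.0, 1.0] battery level to coarse steps (default 1%)."""
    return round(round(level / step) * step, 2)

# A double-precision readout like 0.9347865 leaks many identifying bits;
# after quantization, all nearby readouts collapse to the same value.
print(coarsen_battery_level(0.9347865))  # 0.93
```

The point is that the API loses no practical functionality; a web page has no legitimate need to know your battery level to seven decimal places.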

W3C is updating the spec. Here’s a battery tracker found in the wild.

Posted on November 7, 2016 at 12:59 PM

Using Neural Networks to Identify Blurred Faces

Neural networks are good at identifying faces, even if they’re blurry:

In a paper released earlier this month, researchers at UT Austin and Cornell University demonstrate that faces and objects obscured by blurring, pixelation, and a recently-proposed privacy system called P3 can be successfully identified by a neural network trained on image datasets—in some cases at a more consistent rate than humans.

“We argue that humans may no longer be the ‘gold standard’ for extracting information from visual data,” the researchers write. “Recent advances in machine learning based on artificial neural networks have led to dramatic improvements in the state of the art for automated image recognition. Trained machine learning models now outperform humans on tasks such as object recognition and determining the geographic location of an image.”

Research paper

Posted on September 27, 2016 at 9:39 AM

Using Wi-Fi Signals to Identify People by Body Shape

Another paper on using Wi-Fi for surveillance. This one is on identifying people by their body shape. “FreeSense: Indoor Human Identification with WiFi Signals”:

Abstract: Human identification plays an important role in human-computer interaction. There have been numerous methods proposed for human identification (e.g., face recognition, gait recognition, fingerprint identification, etc.). While these methods could be very useful under different conditions, they also suffer from certain shortcomings (e.g., user privacy, sensing coverage range). In this paper, we propose a novel approach for human identification, which leverages WIFI signals to enable non-intrusive human identification in domestic environments. It is based on the observation that each person has specific influence patterns to the surrounding WIFI signal while moving indoors, regarding their body shape characteristics and motion patterns. The influence can be captured by the Channel State Information (CSI) time series of WIFI. Specifically, a combination of Principal Component Analysis (PCA), Discrete Wavelet Transform (DWT) and Dynamic Time Warping (DTW) techniques is used for CSI waveform-based human identification. We implemented the system in a 6m*5m smart home environment and recruited 9 users for data collection and evaluation. Experimental results indicate that the identification accuracy is about 88.9% to 94.5% when the candidate user set changes from 6 to 2, showing that the proposed human identification method is effective in domestic environments.
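The last stage of the paper’s PCA/DWT/DTW pipeline is the interesting one: a new CSI waveform is matched against stored per-user templates by dynamic time warping distance. Here is a minimal sketch of that matching step, with toy 1-D sequences standing in for processed CSI data:

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    INF = float("inf")
    n, m = len(a), len(b)
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]

def identify(sample, templates):
    """Return the enrolled user whose template waveform is DTW-closest."""
    return min(templates, key=lambda user: dtw_distance(sample, templates[user]))

templates = {"user1": [0, 1, 2, 1, 0], "user2": [0, 3, 6, 3, 0]}
print(identify([0, 1, 2, 2, 1, 0], templates))  # user1
```

DTW is a natural choice here because two people walking through the same room produce waveforms that are similar in shape but stretched differently in time.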

EDITED TO ADD (9/13): Related paper.

Posted on August 30, 2016 at 12:57 PM

Anonymization and the Law

Interesting paper: “Anonymization and Risk,” by Ira S. Rubinstein and Woodrow Hartzog:

Abstract: Perfect anonymization of data sets has failed. But the process of protecting data subjects in shared information remains integral to privacy practice and policy. While the deidentification debate has been vigorous and productive, there is no clear direction for policy. As a result, the law has been slow to adapt a holistic approach to protecting data subjects when data sets are released to others. Currently, the law is focused on whether an individual can be identified within a given set. We argue that the better locus of data release policy is on the process of minimizing the risk of reidentification and sensitive attribute disclosure. Process-based data release policy, which resembles the law of data security, will help us move past the limitations of focusing on whether data sets have been “anonymized.” It draws upon different tactics to protect the privacy of data subjects, including accurate deidentification rhetoric, contracts prohibiting reidentification and sensitive attribute disclosure, data enclaves, and query-based strategies to match required protections with the level of risk. By focusing on process, data release policy can better balance privacy and utility where nearly all data exchanges carry some risk.

Posted on July 11, 2016 at 6:31 AM

The Fallibility of DNA Evidence

This is a good summary article on the fallibility of DNA evidence. Most interesting to me are the parts on the proprietary algorithms used in DNA matching:

William Thompson points out that Perlin has declined to make public the algorithm that drives the program. “You do have a black-box situation happening here,” Thompson told me. “The data go in, and out comes the solution, and we’re not fully informed of what happened in between.”

Last year, at a murder trial in Pennsylvania where TrueAllele evidence had been introduced, defense attorneys demanded that Perlin turn over the source code for his software, noting that “without it, [the defendant] will be unable to determine if TrueAllele does what Dr. Perlin claims it does.” The judge denied the request.

[…]

When I interviewed Perlin at Cybergenetics headquarters, I raised the matter of transparency. He was visibly annoyed. He noted that he’d published detailed papers on the theory behind TrueAllele, and filed patent applications, too: “We have disclosed not the trade secrets of the source code or the engineering details, but the basic math.”

It’s the same problem as any biometric: we need to know the rates of both false positives and false negatives. And if these algorithms are being used to determine guilt, we have a right to examine them.
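The base-rate arithmetic makes the point concrete. Even a tiny false-positive rate produces spurious matches when the search space is large (the numbers below are illustrative, not TrueAllele’s actual error rates, which is exactly the problem):

```python
def expected_false_hits(database_size, false_positive_rate):
    """Expected number of innocent 'matches' in one database trawl."""
    return database_size * false_positive_rate

# Searching 1,000,000 profiles with a 1-in-100,000 false-positive rate
# yields roughly 10 innocent matches per search.
print(expected_false_hits(1_000_000, 1e-5))  # 10.0
```

Without published error rates, neither courts nor defendants can do this arithmetic, which is why the source-code secrecy matters.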

EDITED TO ADD (6/13): Three more articles.

Posted on May 31, 2016 at 1:04 PM

Identifying People from their Driving Patterns

People can be identified from their “driver fingerprint”:

…a group of researchers from the University of Washington and the University of California at San Diego found that they could “fingerprint” drivers based only on data they collected from the internal computer network of the vehicle their test subjects were driving, what’s known as a car’s CAN bus. In fact, they found that the data collected from a car’s brake pedal alone could let them correctly distinguish the correct driver out of 15 individuals about nine times out of ten, after just 15 minutes of driving. With 90 minutes of driving data or monitoring more car components, they could pick out the correct driver fully 100 percent of the time.

The paper: “Automobile Driver Fingerprinting,” by Miro Enev, Alex Takahuwa, Karl Koscher, and Tadayoshi Kohno.

Abstract: Today’s automobiles leverage powerful sensors and embedded computers to optimize efficiency, safety, and driver engagement. However the complexity of possible inferences using in-car sensor data is not well understood. While we do not know of attempts by automotive manufacturers or makers of after-market components (like insurance dongles) to violate privacy, a key question we ask is: could they (or their collection and later accidental leaks of data) violate a driver’s privacy? In the present study, we experimentally investigate the potential to identify individuals using sensor data snippets of their natural driving behavior. More specifically we record the in-vehicle sensor data on the controller area-network (CAN) of a typical modern vehicle (popular 2009 sedan) as each of 15 participants (a) performed a series of maneuvers in an isolated parking lot, and (b) drove the vehicle in traffic along a defined ~50 mile loop through the Seattle metropolitan area. We then split the data into training and testing sets, train an ensemble of classifiers, and evaluate identification accuracy of test data queries by looking at the highest voted candidate when considering all possible one-vs-one comparisons. Our results indicate that, at least among small sets, drivers are indeed distinguishable using only in car sensors. In particular, we find that it is possible to differentiate our 15 drivers with 100% accuracy when training with all of the available sensors using 90% of driving data from each person. Furthermore, it is possible to reach high identification rates using less than 8 minutes of training data. When more training data is available it is possible to reach very high identification using only a single sensor (e.g., the brake pedal). As an extension, we also demonstrate the feasibility of performing driver identification across multiple days of data collection.
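The evaluation method the abstract describes, voting over all possible one-vs-one comparisons, is a standard way to turn pairwise classifiers into a multi-class identifier. A sketch of just the voting logic, with the trained classifiers stubbed out as a fixed decision function (the real system trains one classifier per driver pair on CAN-bus sensor data):

```python
from itertools import combinations
from collections import Counter

def identify_driver(drivers, pairwise_decision):
    """Run every one-vs-one comparison; the most-voted driver wins.

    pairwise_decision(a, b) stands in for a trained a-vs-b classifier and
    returns whichever of the two drivers it picks for the test data.
    """
    votes = Counter()
    for a, b in combinations(drivers, 2):
        votes[pairwise_decision(a, b)] += 1
    return votes.most_common(1)[0][0]

# Toy stand-in: pretend every classifier involving driver 3 picks driver 3,
# and other pairs split arbitrarily.
winner = identify_driver(range(5), lambda a, b: 3 if 3 in (a, b) else min(a, b))
print(winner)  # 3
```

With 15 drivers that is 105 pairwise comparisons per identification, which is cheap; the expensive part is collecting enough per-driver training data, and the abstract notes that under 8 minutes can suffice.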

Posted on May 30, 2016 at 10:10 AM

