Interesting research: “You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information,” by Beatrice Perez, Mirco Musolesi, and Gianluca Stringhini.
Abstract: Metadata are associated to most of the information we produce in our daily interactions and communication in the digital world. Yet, surprisingly, metadata are often still categorized as non-sensitive. Indeed, in the past, researchers and practitioners have mainly focused on the problem of the identification of a user from the content of a message.
In this paper, we use Twitter as a case study to quantify the uniqueness of the association between metadata and user identity and to understand the effectiveness of potential obfuscation strategies. More specifically, we analyze atomic fields in the metadata and systematically combine them in an effort to classify new tweets as belonging to an account using different machine learning algorithms of increasing complexity. We demonstrate that through the application of a supervised learning algorithm, we are able to identify any user in a group of 10,000 with approximately 96.7% accuracy. Moreover, if we broaden the scope of our search and consider the 10 most likely candidates we increase the accuracy of the model to 99.22%. We also found that data obfuscation is hard and ineffective for this type of data: even after perturbing 60% of the training data, it is still possible to classify users with an accuracy higher than 95%. These results have strong implications in terms of the design of metadata obfuscation strategies, for example for data set release, not only for Twitter, but, more generally, for most social media platforms.
Posted on July 30, 2018 at 6:35 AM •
Interesting research: “The detection of faked identity using unexpected questions and mouse dynamics,” by Merulin Monaro, Luciano Gamberini, and Guiseppe Sartori.
Abstract: The detection of faked identities is a major problem in security. Current memory-detection techniques cannot be used as they require prior knowledge of the respondent’s true identity. Here, we report a novel technique for detecting faked identities based on the use of unexpected questions that may be used to check the respondent identity without any prior autobiographical information. While truth-tellers respond automatically to unexpected questions, liars have to “build” and verify their responses. This lack of automaticity is reflected in the mouse movements used to record the responses as well as in the number of errors. Responses to unexpected questions are compared to responses to expected and control questions (i.e., questions to which a liar also must respond truthfully). Parameters that encode mouse movement were analyzed using machine learning classifiers and the results indicate that the mouse trajectories and errors on unexpected questions efficiently distinguish liars from truth-tellers. Furthermore, we showed that liars may be identified also when they are responding truthfully. Unexpected questions combined with the analysis of mouse movement may efficiently spot participants with faked identities without the need for any prior information on the examinee.
Boing Boing post.
Posted on May 25, 2018 at 6:25 AM •
This acoustic technology identifies individuals by their ear shapes. No information about either false positives or false negatives.
Posted on April 23, 2018 at 7:48 AM •
Police in the UK were able to read a fingerprint from a photo of a hand:
Staff from the unit’s specialist imaging team were able to enhance a picture of a hand holding a number of tablets, which was taken from a mobile phone, before fingerprint experts were able to positively identify that the hand was that of Elliott Morris.
Speaking about the pioneering techniques used in the case, Dave Thomas, forensic operations manager at the Scientific Support Unit, added: “Specialist staff within the JSIU fully utilised their expert image-enhancing skills which enabled them to provide something that the unit’s fingerprint identification experts could work. Despite being provided with only a very small section of the fingerprint which was visible in the photograph, the team were able to successfully identify the individual.”
Posted on April 19, 2018 at 6:51 AM •
Interesting research: “‘Won’t Somebody Think of the Children?’ Examining COPPA Compliance at Scale“:
Abstract: We present a scalable dynamic analysis framework that allows for the automatic evaluation of the privacy behaviors of Android apps. We use our system to analyze mobile apps’ compliance with the Children’s Online Privacy Protection Act (COPPA), one of the few stringent privacy laws in the U.S. Based on our automated analysis of 5,855 of the most popular free children’s apps, we found that a majority are potentially in violation of COPPA, mainly due to their use of third-party SDKs. While many of these SDKs offer configuration options to respect COPPA by disabling tracking and behavioral advertising, our data suggest that a majority of apps either do not make use of these options or incorrectly propagate them across mediation SDKs. Worse, we observed that 19% of children’s apps collect identifiers or other personally identifiable information (PII) via SDKs whose terms of service outright prohibit their use in child-directed apps. Finally, we show that efforts by Google to limit tracking through the use of a resettable advertising ID have had little success: of the 3,454 apps that share the resettable ID with advertisers, 66% transmit other, non-resettable, persistent identifiers as well, negating any intended privacy-preserving properties of the advertising ID.
Posted on April 13, 2018 at 6:43 AM •
It’s routine for US police to unlock iPhones with the fingerprints of dead people. It seems only to work with recently dead people.
Posted on March 30, 2018 at 6:11 AM •
The security researchers at Princeton are postingYou may know that most websites have third-party analytics scripts that record which pages you visit and the searches you make. But lately, more and more sites use “session replay” scripts. These scripts record your keystrokes, mouse movements, and scrolling behavior, along with the entire contents of the pages you visit, and send them to third-party servers. Unlike typical analytics services that provide aggregate statistics, these scripts are intended for the recording and playback of individual browsing sessions, as if someone is looking over your shoulder.
The stated purpose of this data collection includes gathering insights into how users interact with websites and discovering broken or confusing pages. However the extent of data collected by these services far exceeds user expectations; text typed into forms is collected before the user submits the form, and precise mouse movements are saved, all without any visual indication to the user. This data can’t reasonably be expected to be kept anonymous. In fact, some companies allow publishers to explicitly link recordings to a user’s real identity.
The researchers will post more details on their blog; I’ll link to them when they’re published.
Posted on November 22, 2017 at 8:54 AM •
Two related stories:
PornHub is using machine learning algorithms to identify actors in different videos, so as to better index them. People are worried that it can really identify them, by linking their stage names to their real names.
Facebook somehow managed to link a sex worker’s clients under her fake name to her real profile.
Sometimes people have legitimate reasons for having two identities. That is becoming harder and harder.
Posted on October 13, 2017 at 6:57 AM •
In the wake of the Equifax break, I’ve heard calls to replace Social Security numbers. Steve Bellovin explains why this is hard.
Posted on October 5, 2017 at 3:22 PM •
Sidebar photo of Bruce Schneier by Joe MacInnis.