Entries Tagged "social media"


Predicting Characteristics of People by the Company they Keep

Turns out “gaydar” can be automated:

Using data from the social network Facebook, they made a striking discovery: just by looking at a person’s online friends, they could predict whether the person was gay. They did this with a software program that looked at the gender and sexuality of a person’s friends and, using statistical analysis, made a prediction. The two students had no way of checking all of their predictions, but based on their own knowledge outside the Facebook world, their computer program appeared quite accurate for men, they said. People may be effectively “outing” themselves just by the virtual company they keep.

This sort of thing can be generalized:

The work has not been published in a scientific journal, but it provides a provocative warning note about privacy. Discussions of privacy often focus on how to best keep things secret, whether it is making sure online financial transactions are secure from intruders, or telling people to think twice before opening their lives too widely on blogs or online profiles. But this work shows that people may reveal information about themselves in another way, and without knowing they are making it public. Who we are can be revealed by, and even defined by, who our friends are: if all your friends are over 45, you’re probably not a teenager; if they all belong to a particular religion, it’s a decent bet that you do, too. The ability to connect with other people who have something in common is part of the power of social networks, but also a possible pitfall. If our friends reveal who we are, that challenges a conception of privacy built on the notion that there are things we tell, and things we don’t.
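The inference described above can be sketched in a few lines. This is only a toy illustration of guessing by majority among friends, not the students' actual program (which has not been published); the function name, the `min_share` threshold, and the sample data are all invented for the example.

```python
from collections import Counter

def predict_attribute(person, friends, known, min_share=0.5):
    """Guess a person's attribute from the most common value among friends.

    Returns (value, share_of_friends), or None when no friend has a known
    value or the most common value does not exceed min_share.
    """
    values = [known[f] for f in friends.get(person, ()) if f in known]
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    share = count / len(values)
    return (value, share) if share > min_share else None

# Hypothetical data: alice has disclosed nothing, but her friends have.
friends = {"alice": ["bob", "carol", "dave", "erin"]}
known = {"bob": "over 45", "carol": "over 45", "dave": "over 45", "erin": "teen"}

print(predict_attribute("alice", friends, known))
```

Note that alice never appears in `known`. The guess is built entirely from what her friends chose to reveal, which is the privacy problem in miniature.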

EDITED TO ADD (9/29): Better information from the MIT Newspaper.

Posted on September 29, 2009 at 7:13 AM

File Deletion

File deletion is all about control. This used to not be an issue. Your data was on your computer, and you decided when and how to delete a file. You could use the delete function if you didn’t care about whether the file could be recovered or not, and a file erase program—I use BCWipe for Windows—if you wanted to ensure no one could ever recover the file.

As we move more of our data onto cloud computing platforms such as Gmail and Facebook, and closed proprietary platforms such as the Kindle and the iPhone, deleting data is much harder.

You have to trust that these companies will delete your data when you ask them to, but they’re generally not interested in doing so. Sites like these are more likely to make your data inaccessible than they are to physically delete it. Facebook is a known culprit: actually deleting your data from its servers requires a complicated procedure that may or may not work. And even if you do manage to delete your data, copies are certain to remain in the companies’ backup systems. Gmail explicitly says this in its privacy notice.

Online backups, SMS messages, photos on photo sharing sites, smartphone applications that store your data in the network: you have no idea what really happens when you delete pieces of data or your entire account, because you’re not in control of the computers that are storing the data.

This notion of control also explains how Amazon was able to delete a book that people had previously purchased on their Kindle e-book readers. The legalities are debatable, but Amazon had the technical ability to delete the file because it controls all Kindles. It has designed the Kindle so that it determines when to update the software, whether people are allowed to buy Kindle books, and when to turn off people’s Kindles entirely.

Vanish is a research project by Roxana Geambasu and colleagues at the University of Washington. They designed a prototype system that automatically deletes data after a set time interval. So you can send an email, create a Google Doc, post an update to Facebook, or upload a photo to Flickr, all designed to disappear after a set period of time. And after it disappears, no one—not anyone who downloaded the data, not the site that hosted the data, not anyone who intercepted the data in transit, not even you—will be able to read it. If the police arrive at Facebook or Google or Flickr with a warrant, they won’t be able to read it.

The details are complicated, but Vanish breaks the data’s decryption key into a bunch of pieces and scatters them around the web using a peer-to-peer network. Then it uses the natural turnover in these networks—machines constantly join and leave—to make the data disappear. Unlike previous programs that supported file deletion, this one doesn’t require you to trust any company, organisation, or website. It just happens.
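To make the "break the key into pieces" idea concrete, here is a minimal sketch of threshold secret sharing in the style of Shamir's scheme: the key is split into n shares, any k of which recover it, and fewer than k reveal nothing useful. Treat it as an illustration of the general technique rather than Vanish's actual construction; the function names, the choice of prime, and the parameters n=10, k=7 are assumptions made for the example. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import secrets

# A Mersenne prime comfortably larger than a 256-bit key (example choice).
PRIME = 2**521 - 1

def split_key(key, n, k):
    """Split an integer key into n shares; any k of them reconstruct it."""
    if not 0 <= key < PRIME:
        raise ValueError("key out of range")
    # Random polynomial of degree k-1 whose constant term is the key.
    coeffs = [key] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def recover_key(shares):
    """Lagrange-interpolate the polynomial at x = 0 to get the key back."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

key = secrets.randbits(128)
shares = split_key(key, n=10, k=7)
print(recover_key(shares[:7]) == key)  # any 7 shares are enough
```

In a scheme like this, deletion needs no one's cooperation: once churn in the peer-to-peer network has carried away more than n - k shares, the remaining ones no longer determine the key.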

Of course, Vanish doesn’t prevent the recipient of an email or the reader of a Facebook page from copying the data and pasting it into another file, just as Kindle’s deletion feature doesn’t prevent people from copying a book’s files and saving them on their computers. Vanish is just a prototype at this point, and it only works if all the people who read your Facebook entries or view your Flickr pictures have it installed on their computers as well; but it’s a good demonstration of how control affects file deletion. And while it’s a step in the right direction, it’s also new and therefore deserves further security analysis before being adopted on a wide scale.

We’ve lost control of data on some of the computers we own, and we’ve lost control of our data in the cloud. We’re not going to stop using Facebook and Twitter just because they’re not going to delete our data when we ask them to, and we’re not going to stop using Kindles and iPhones because they may delete our data when we don’t want them to. But we need to take back control of data in the cloud, and projects like Vanish show us how we can.

Now we need something that will protect our data when a large corporation decides to delete it.

This essay originally appeared in The Guardian.

EDITED TO ADD (9/30): Vanish has been broken, paper here.

Posted on September 10, 2009 at 6:08 AM

Privacy Salience and Social Networking Sites

Reassuring people about privacy makes them more, not less, concerned. It’s called “privacy salience,” and Leslie John, Alessandro Acquisti, and George Loewenstein—all at Carnegie Mellon University—demonstrated this in a series of clever experiments. In one, subjects completed an online survey consisting of a series of questions about their academic behavior—“Have you ever cheated on an exam?” for example. Half of the subjects were first required to sign a consent warning—designed to make privacy concerns more salient—while the other half did not. Also, subjects were randomly assigned to receive either a privacy confidentiality assurance, or no such assurance. When the privacy concern was made salient (through the consent warning), people reacted negatively to the subsequent confidentiality assurance and were less likely to reveal personal information.

In another experiment, subjects completed an online survey where they were asked a series of personal questions, such as “Have you ever tried cocaine?” Half of the subjects completed a frivolous-looking survey—“How BAD are U??”—with a picture of a cute devil. The other half completed the same survey with the title “Carnegie Mellon University Survey of Ethical Standards,” complete with a university seal and official privacy assurances. The results showed that people who were reminded about privacy were less likely to reveal personal information than those who were not.

Privacy salience does a lot to explain social networking sites and their attitudes towards privacy. From a business perspective, social networking sites don’t want their members to exercise their privacy rights very much. They want members to be comfortable disclosing a lot of data about themselves.

Joseph Bonneau and Soeren Preibusch of Cambridge University have been studying privacy on 45 popular social networking sites around the world. (You may not have realized that there are 45 popular social networking sites around the world.) They found that privacy settings were often confusing and hard to access; Facebook, with its 61 privacy settings, is the worst. To understand some of the settings, they had to create accounts with different settings so they could compare the results. Privacy tends to increase with the age and popularity of a site. General-use sites tend to have more privacy features than niche sites.

But their most interesting finding was that sites consistently hide any mentions of privacy. Their splash pages talk about connecting with friends, meeting new people, sharing pictures: the benefits of disclosing personal data.

These sites do talk about privacy, but only on hard-to-find privacy policy pages. There, the sites give strong reassurances about their privacy controls and the safety of data members choose to disclose on the site. There, the sites display third-party privacy seals and other icons designed to assuage any fears members have.

It’s the Carnegie Mellon experimental result in the real world. Users care about privacy, but don’t really think about it day to day. The social networking sites don’t want to remind users about privacy, even if they talk about it positively, because any reminder will result in users remembering their privacy fears and becoming more cautious about sharing personal data. But the sites also need to reassure those “privacy fundamentalists” for whom privacy is always salient, so they have very strong pro-privacy rhetoric for those who take the time to search them out. The two different marketing messages are for two different audiences.

Social networking sites are improving their privacy controls as a result of public pressure. At the same time, there is a counterbalancing business pressure to decrease privacy; watch what’s going on right now on Facebook, for example. Naively, we should expect companies to make their privacy policies clear to allow customers to make an informed choice. But the marketing need to reduce privacy salience will frustrate market solutions to improve privacy; sites would much rather obfuscate the issue than compete on it as a feature.

This essay originally appeared in the Guardian.

Posted on July 16, 2009 at 6:05 AM

Did a Public Twitter Post Lead to a Burglary?

No evidence one way or the other:

Like a lot of people who use social media, Israel Hyman and his wife Noell went on Twitter to share real-time details of a recent trip. Their posts said they were “preparing to head out of town,” that they had “another 10 hours of driving ahead,” and that they “made it to Kansas City.”

While they were on the road, their home in Mesa, Ariz., was burglarized. Hyman has an online video business called IzzyVideo.com, with 2,000 followers on Twitter. He thinks his Twitter updates tipped the burglars off.

“My wife thinks it could be a random thing, but I just have my suspicions,” he said. “They didn’t take any of our normal consumer electronics.” They took his video editing equipment.

I’m not saying that there isn’t a connection, but people have a propensity for seeing these sorts of connections.

Posted on June 15, 2009 at 2:26 PM

Second SHB Workshop Liveblogging (8)

The penultimate session of the conference was “Privacy,” moderated by Tyler Moore.

Alessandro Acquisti, Carnegie Mellon University (suggested reading: What Can Behavioral Economics Teach Us About Privacy?; Privacy in Electronic Commerce and the Economics of Immediate Gratification), presented research on how people value their privacy. He started by listing a variety of cognitive biases that affect privacy decisions: illusion of control, overconfidence, optimism bias, endowment effect, and so on. He discussed two experiments. The first demonstrated a “herding effect”: if a subject believes that others reveal sensitive behavior, the subject is more likely to also reveal sensitive behavior. The second examined the “frog effect”: do privacy intrusions alert or desensitize people to revealing personal information? What he found is that people tend to set their privacy level at the beginning of a survey, and don’t respond well to being asked easy questions at first and then sensitive questions at the end. In the discussion, Joe Bonneau asked him about the notion that people’s privacy protections tend to ratchet up over time; he didn’t have conclusive evidence, but gave several possible explanations for the phenomenon.

Adam Joinson, University of Bath (suggested reading: Privacy, Trust and Self-Disclosure Online; Privacy concerns and privacy actions), also studies how people value their privacy. He talked about expressive privacy—privacy that allows people to express themselves and form interpersonal relationships. His research showed that differences between how people use Facebook in different countries depend on how much people trust Facebook as a company, rather than how much people trust other Facebook users. Another study looked at posts from Secret Tweet and Twitter. He found 16 markers that allowed him to automatically determine which tweets contain sensitive personal information and which do not, with high probability. Then he tried to determine if people with large Twitter followings post fewer secrets than people who are only twittering to a few people. He found absolutely no difference.

Peter Neumann, SRI (suggested reading: Holistic systems; Risks; Identity and Trust in Context), talked about lack of medical privacy (too many people have access to your data), about voting (the privacy problem makes the voting problem a lot harder, and the end-to-end voting security/privacy problem is much harder than just securing voting machines), and privacy in China (the government is requiring all computers sold in China to be sold with software allowing them to eavesdrop on the users). Any would-be solution needs to reflect the ubiquity of the threat. When we design systems, we need to anticipate what the privacy problems will be. Privacy problems are everywhere you look, and ordinary people have no idea of the depth of the problem.

Eric Johnson, Dartmouth College (suggested reading: Access Flexibility with Escalation and Audit; Security through Information Risk Management), studies the information access problem from a business perspective. He’s been doing field studies in companies like retail banks and investment banks, and found that role-based access control fails because companies can’t determine who has what role. Even worse, roles change quickly, especially in large complex organizations. For example, one business group of 3000 people experiences 1000 role changes within three months. The result is that organizations do access control badly, either over-entitling or under-entitling people. But since getting the job done is the most important thing, organizations tend to over-entitle: give people more access than they need. His current work is to find the right set of incentives and controls to set access more properly. The challenge is to do this without making people risk averse. In the discussion, he agreed that a perfect access control system is not possible, and that organizations should probably allow a certain amount of access control violations—similar to the idea of posting a 55 mph speed limit but not ticketing people unless they go over 70 mph.

Christine Jolls, Yale Law School (suggested reading: Rationality and Consent in Privacy Law, Employee Privacy), made the point that people regularly share their most private information with their intimates—so privacy is not about secrecy, it’s more about control. There are moments when people make pretty big privacy decisions. For example, they grant employers the rights to monitor their e-mail, or test their urine without notice. In general, courts hold that a blanket signing away of privacy rights—“you can test my urine on any day in the future”—is not valid, but an immediate signing away of privacy rights—“you can test my urine today”—is. Jolls believes that this is reasonable for several reasons, such as optimism bias and an overfocus on the present at the expense of the future. Without realizing it, the courts have implemented the system that behavioral economics would find optimal. During the discussion, she talked about how coercion figures into this; the U.S. legal system tends not to be concerned with it.

Andrew Adams, University of Reading (suggested reading: Regulating CCTV), also looks at attitudes of privacy on social networking services. His results are preliminary, and based on interviews with university students in Canada, Japan, and the UK, and are very concordant with what danah boyd and Joe Bonneau said earlier. From the UK: People join social networking sites to increase their level of interaction with people they already know in real life. Revealing personal information is okay, but revealing too much is bad. Even more interestingly, it’s not okay to reveal more about others than they reveal themselves. From Japan: People are more open to making friends online. There’s more anonymity. It’s not okay to reveal information about others, but “the fault of this lies as much with the person whose data was revealed in not choosing friends wisely.” This victim responsibility is a common theme with other privacy and security elements in Japan. Data from Canada is still being compiled.

Great phrase: the “laundry belt”—close enough for students to go home on weekends with their laundry, but far enough away so they don’t feel as if their parents are looking over their shoulder—typically two hours by public transportation (in the UK).

Adam Shostack’s liveblogging is here. Ross Anderson’s liveblogging is in his blog post’s comments. Matt Blaze’s audio is here.

Posted on June 12, 2009 at 3:01 PM

Second SHB Workshop Liveblogging (6)

The first session of the morning was “Foundations,” which is kind of a catch-all for a variety of things that didn’t really fit anywhere else. Rachel Greenstadt moderated.

Terence Taylor, International Council for the Life Sciences (suggested video to watch: Darwinian Security; Natural Security), talked about the lessons evolution teaches about living with risk. Successful species didn’t survive by eliminating the risks of their environment, they survived by adaptation. Adaptation isn’t always what you think. For example, you could view the collapse of the Soviet Union as a failure to adapt, but you could also view it as successful adaptation. Risk is good. Risk is essential for the survival of a society, because risk-takers are the drivers of change. In the discussion phase, John Mueller pointed out a key difference between human and biological systems: humans tend to respond dramatically to anomalous events (the anthrax attacks), while biological systems respond to sustained change. And David Livingstone Smith asked about the difference between biological adaptation that affects the reproductive success of an organism’s genes, even at the expense of the organism, and security adaptation. (I recommend the book he edited: Natural Security: A Darwinian Approach to a Dangerous World.)

Andrew Odlyzko, University of Minnesota (suggested reading: Network Neutrality, Search Neutrality, and the Never-Ending Conflict between Efficiency and Fairness in Markets, Economics, Psychology, and Sociology of Security), discussed human-space vs. cyberspace. People cannot build secure systems—we know that—but people also cannot live with secure systems. We require a certain amount of flexibility in our systems. And finally, people don’t need secure systems. We survive with an astounding amount of insecurity in our world. The problem with cyberspace is that it was originally conceived as separate from the physical world, and as able to correct for the inadequacies of the physical world. Really, the two are intertwined, and human space more often corrects for the inadequacies of cyberspace. Lessons: build messy systems, not clean ones; create a web of ties to other systems; create permanent records.

danah boyd, Microsoft Research (suggested reading: Taken Out of Context—American Teen Sociality in Networked Publics), does ethnographic studies of teens in cyberspace. Teens tend not to lie to their friends in cyberspace, but they lie to the system. Since an early age, they’ve been taught that they need to lie online to be safe. Teens regularly share their passwords: with their parents when forced, or with their best friend or significant other. This is a way of demonstrating trust. It’s part of the social protocol for this generation. In general, teens don’t use social media in the same way as adults do. And when they grow up, they won’t use social media in the same way as today’s adults do. Teens view privacy in terms of control, and take their cues about privacy from celebrities and how they use social media. And their sense of privacy is much more nuanced and complicated. In the discussion phase, danah wasn’t sure whether the younger generation would be more or less susceptible to Internet scams than the rest of us—they’re not nearly as technically savvy as we might think they are. “The only thing that saves teenagers is fear of their parents”; they try to lock them out, and lock others out in the process. Socio-economic status matters a lot, in ways that she is still trying to figure out. There are three different types of social networks: personal networks, articulated networks, and behavioral networks, and they’re different.

Mark Levine, Lancaster University (suggested reading: The Kindness of Crowds; Intra-group Regulation of Violence: Bystanders and the (De)-escalation of Violence), does social psychology. He argued against the common belief that groups are bad (mob violence, mass hysteria, peer group pressure). He collects data from UK CCTV cameras, searches it for aggressive behavior, and studies when and how bystanders either help escalate or de-escalate the situations. Results: as groups get bigger, there is no increase of anti-social acts and a significant increase in pro-social acts. He has much more analysis and results, too complicated to summarize here. One key finding: when a third party intervenes in an aggressive interaction, it is much more likely to de-escalate. Basically, groups can act against violence. “When it comes to violence (and security), group processes are part of the solution—not part of the problem?”

Jeff MacKie-Mason, University of Michigan (suggested reading: Humans are smart devices, but not programmable; Security when people matter; A Social Mechanism for Supporting Home Computer Security), is an economist: “Security problems are incentive problems.” He discussed motivation, and how to design systems to take motivation into account. Humans are smart devices; they can’t be programmed, but they can be influenced through the sciences of motivational behavior: microeconomics, game theory, social psychology, psychodynamics, and personality psychology. He gave a couple of general examples of how these theories can inform security system design.

Joe Bonneau, Cambridge University, talked about social networks like Facebook, and privacy. People misunderstand why privacy and security are important in social networking sites like Facebook. People underestimate what Facebook really is; it really is a reimplementation of the entire Internet. “Everything on the Internet is becoming social,” and that makes security different. Phishing is different, 419-style scams are different. Social context makes some scams easier; social networks are fun, noisy, and unpredictable. “People use social networking systems with their brain turned off.” But social context can be used to spot frauds and anomalies, and can be used to establish trust.

Three more sessions to go. (I am enjoying liveblogging the event. It’s helping me focus and pay closer attention.)

Adam Shostack’s liveblogging is here. Ross Anderson’s liveblogging is in his blog post’s comments. Matt Blaze’s audio is here.

Posted on June 12, 2009 at 9:54 AM

Fake Facts on Twitter

Clever hack:

Back during the debate for HR 1, I was amazed at how easily conservatives were willing to accept and repeat lies about spending in the stimulus package, even after those provisions had been debunked as fabrications. The $30 million for the salt marsh mouse is a perfect example, and Kagro X documented well over a dozen congressmen repeating the lie.

To test the limits of this phenomenon, I started a parody Twitter account last Thursday, which I called “InTheStimulus”, where all the tweets took the format “InTheStimulus is $x million for ______”. I went through the followers of Republican Twitter feeds and in turn followed them, all the way up to the limit of 2000. From people following me back, I was able to get 500 followers in less than a day, and 1000 by Sunday morning.

You can read through all the retweets and responses by looking at the Twitter search for “InTheStimulus”. For the most part, my first couple days of posts were believable, but unsourced lies:

  • $3 million for replacement tires for 1992-1995 Geo Metros.
  • $750,000 for an underground tunnel connecting a middle school and high school in North Carolina.
  • $4.7 million for a program supplying public television to K-8 classrooms.
  • $2.3 million for a museum dedicated to the electric bass guitar.

The Twitter InTheStimulus site appears to have been taken down.

There are several things going on here. First is confirmation bias, which is the tendency of people to believe things that reinforce their prior beliefs. But the second is the limited bandwidth of Twitter—140-character messages—that makes it very difficult to authenticate anything. Twitter is an ideal medium to inject fake facts into society for precisely this reason.

EDITED TO ADD (5/14): False Twitter rumors about Swine Flu.

Posted on April 24, 2009 at 6:29 AM

Social Networking Identity Theft Scams

Clever:

I’m going to tell you exactly how someone can trick you into thinking they’re your friend. Now, before you send me hate mail for revealing this deep, dark secret, let me assure you that the scammers, crooks, predators, stalkers and identity thieves are already aware of this trick. It works only because the public is not aware of it. If you’re scamming someone, here’s what you’d do:

Step 1: Request to be “friends” with a dozen strangers on MySpace. Let’s say half of them accept. Collect a list of all their friends.

Step 2: Go to Facebook and search for those six people. Let’s say you find four of them also on Facebook. Request to be their friends on Facebook. All accept because you’re already an established friend.

Step 3: Now compare the MySpace friends against the Facebook friends. Generate a list of people that are on MySpace but are not on Facebook. Grab the photos and profile data on those people from MySpace and use it to create false but convincing profiles on Facebook. Send “friend” requests to your victims on Facebook.

As a bonus, others who are friends of both your victims and your fake self will contact you to be friends and, of course, you’ll accept. In fact, Facebook itself will suggest you as a friend to those people.

(Think about the trust factor here. For these secondary victims, they not only feel they know you, but actually request “friend” status. They sought you out.)

Step 4: Now, you’re in business. You can ask things of these people that only friends dare ask.

Like what? Lend me $500. When are you going out of town? Etc.

The author has no evidence that anyone has actually done this, but certainly someone will do this sometime in the future.

We have seen attacks by people hijacking existing social networking accounts:

Rutberg was the victim of a new, targeted version of a very old scam—the “Nigerian,” or “419,” ploy. The first reports of such scams emerged back in November, part of a new trend in the computer underground—rather than sending out millions of spam messages in the hopes of trapping a tiny fraction of recipients, Web criminals are getting much more personal in their attacks, using social networking sites and other databases to make their story lines much more believable.

In Rutberg’s case, criminals managed to steal his Facebook login password, steal his Facebook identity, and change his page to make it appear he was in trouble. Next, the criminals sent e-mails to dozens of friends, begging them for help.

“Can you just get some money to us,” the imposter implored to one of Rutberg’s friends. “I tried Amex and it’s not going through. … I’ll refund you as soon as am back home. Let me know please.”

Posted on April 8, 2009 at 6:43 AM

Identifying People using Anonymous Social Networking Data

Interesting:

Computer scientists Arvind Narayanan and Dr Vitaly Shmatikov, from the University of Texas at Austin, developed the algorithm which turned the anonymous data back into names and addresses.

The data sets are usually stripped of personally identifiable information, such as names, before it is sold to marketing companies or researchers keen to plumb it for useful information.

Before now, it was thought sufficient to remove this data to make sure that the true identities of subjects could not be reconstructed.

The algorithm developed by the pair looks at relationships between all the members of a social network—not just the immediate friends that members of these sites connect to.

Social graphs from Twitter, Flickr and Live Journal were used in the research.

The pair found that one third of those who are on both Flickr and Twitter can be identified from the completely anonymous Twitter graph. This is despite the fact that the overlap of members between the two services is thought to be about 15%.

The researchers suggest that as social network sites become more heavily used, then people will find it increasingly difficult to maintain a veil of anonymity.

More details:

In “De-anonymizing social networks,” Narayanan and Shmatikov take an anonymous graph of the social relationships established through Twitter and find that they can actually identify many Twitter accounts based on an entirely different data source—in this case, Flickr.

One-third of users with accounts on both services could be identified on Twitter based on their Flickr connections, even when the Twitter social graph being used was completely anonymous. The point, say the authors, is that “anonymity is not sufficient for privacy when dealing with social networks,” since their scheme relies only on a social network’s topology to make the identification.

The issue is of more than academic interest, as social networks now routinely release such anonymous social graphs to advertisers and third-party apps, and government and academic researchers ask for such data to conduct research. But the data isn’t nearly as “anonymous” as those releasing it appear to think it is, and it can easily be cross-referenced to other data sets to expose user identities.

It’s not just about Twitter, either. Twitter was a proof of concept, but the idea extends to any sort of social network: phone call records, healthcare records, academic sociological datasets, etc.
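A toy sketch shows why topology alone can be enough. Start from a few accounts already identified in both graphs (the "seeds"), then repeatedly match an anonymous node to the named node that shares the most already-identified neighbors, accepting only unambiguous winners. This is a heavy simplification for illustration and not the algorithm from the paper, which has to cope with noise, directed edges, and partial overlap; the graphs, names, and the `propagate` function are invented here.

```python
def propagate(anon_graph, known_graph, seeds):
    """Extend a seed mapping (anonymous id -> name) using graph structure only.

    Both graphs are dicts mapping a node to the set of its neighbors.
    """
    mapping = dict(seeds)
    changed = True
    while changed:
        changed = False
        used = set(mapping.values())
        for a, a_nbrs in anon_graph.items():
            if a in mapping:
                continue
            # Names of this node's neighbors that are already identified.
            mapped_nbrs = {mapping[n] for n in a_nbrs if n in mapping}
            if not mapped_nbrs:
                continue
            # Score each unclaimed named node by shared identified neighbors.
            scores = sorted(
                ((len(mapped_nbrs & k_nbrs), k)
                 for k, k_nbrs in known_graph.items() if k not in used),
                reverse=True,
            )
            if not scores or scores[0][0] == 0:
                continue
            if len(scores) > 1 and scores[0][0] == scores[1][0]:
                continue  # ambiguous: wait until more neighbors are mapped
            mapping[a] = scores[0][1]
            used.add(scores[0][1])
            changed = True
    return mapping

# Hypothetical named graph, and the same graph with identities stripped.
known_graph = {
    "ann": {"bob", "cat", "dan"}, "bob": {"ann", "cat"},
    "cat": {"ann", "bob", "eve"}, "dan": {"ann"}, "eve": {"cat"},
}
anon_graph = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 5}, 4: {1}, 5: {3}}

result = propagate(anon_graph, known_graph, seeds={1: "ann", 3: "cat"})
print(result)
```

With just two seeds, the remaining three "anonymous" nodes fall into place. Nothing but the pattern of connections was used.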

Here’s the paper.

Posted on April 6, 2009 at 6:51 AM

Sidebar photo of Bruce Schneier by Joe MacInnis.