Page 69

Facebook’s Extensive Surveillance Network

Consumer Reports is reporting that Facebook has built a massive surveillance network:

Using a panel of 709 volunteers who shared archives of their Facebook data, Consumer Reports found that a total of 186,892 companies sent data about them to the social network. On average, each participant in the study had their data sent to Facebook by 2,230 companies. That number varied significantly, with some panelists’ data listing over 7,000 companies providing their data. The Markup helped Consumer Reports recruit participants for the study. Participants downloaded an archive of the previous three years of their data from their Facebook settings, then provided it to Consumer Reports.

This isn’t data about your use of Facebook. This data about your interactions with other companies, all of which is correlated and analyzed by Facebook. It constantly amazes me that we willingly allow these monopoly companies that kind of surveillance power.

Here’s the Consumer Reports study. It includes policy recommendations:

Many consumers will rightly be concerned about the extent to which their activity is tracked by Facebook and other companies, and may want to take action to counteract consistent surveillance. Based on our analysis of the sample data, consumers need interventions that will:

  • Reduce the overall amount of tracking.
  • Improve the ability for consumers to take advantage of their right to opt out under state privacy laws.
  • Empower social media platform users and researchers to review who and what exactly is being advertised on Facebook.
  • Improve the transparency of Facebook’s existing tools.

And then the report gives specifics.

Posted on February 1, 2024 at 7:06 AMView Comments

CFPB’s Proposed Data Rules

In October, the Consumer Financial Protection Bureau (CFPB) proposed a set of rules that if implemented would transform how financial institutions handle personal data about their customers. The rules put control of that data back in the hands of ordinary Americans, while at the same time undermining the data broker economy and increasing customer choice and competition. Beyond these economic effects, the rules have important data security benefits.

The CFPB’s rules align with a key security idea: the decoupling principle. By separating which companies see what parts of our data, and in what contexts, we can gain control over data about ourselves (improving privacy) and harden cloud infrastructure against hacks (improving security). Officials at the CFPB have described the new rules as an attempt to accelerate a shift toward “open banking,” and after an initial comment period on the new rules closed late last year, Rohit Chopra, the CFPB’s director, has said he would like to see the rule finalized by this fall.

Right now, uncountably many data brokers keep tabs on your buying habits. When you purchase something with a credit card, that transaction is shared with unknown third parties. When you get a car loan or a house mortgage, that information, along with your Social Security number and other sensitive data, is also shared with unknown third parties. You have no choice in the matter. The companies will freely tell you this in their disclaimers about personal information sharing: that you cannot opt-out of data sharing with “affiliate” companies. Since most of us can’t reasonably avoid getting a loan or using a credit card, we’re forced to share our data. Worse still, you don’t have a right to even see your data or vet it for accuracy, let alone limit its spread.

The CFPB’s simple and practical rules would fix this. The rules would ensure people can obtain their own financial data at no cost, control who it’s shared with and choose who they do business with in the financial industry. This would change the economics of consumer finance and the illicit data economy that exists today.

The best way for financial services firms to meet the CFPB’s rules would be to apply the decoupling principle broadly. Data is a toxic asset, and in the long run they’ll find that it’s better to not be sitting on a mountain of poorly secured financial data. Deleting the data is better for their users and reduces the chance they’ll incur expenses from a ransomware attack or breach settlement. As it stands, the collection and sale of consumer data is too lucrative for companies to say no to participating in the data broker economy, and the CFPB’s rules may help eliminate the incentive for companies to buy and sell these toxic assets. Moreover, in a free market for financial services, users will have the option to choose more responsible companies that also may be less expensive, thanks to savings from improved security.

Credit agencies and data brokers currently make money both from lenders requesting reports and from consumers requesting their data and seeking services that protect against data misuse. The CFPB’s new rules—and the technical changes necessary to comply with them—would eliminate many of those income streams. These companies have many roles, some of which we want and some we don’t, but as consumers we don’t have any choice in whether we participate in the buying and selling of our data. Giving people rights to their financial information would reduce the job of credit agencies to their core function: assessing risk of borrowers.

A free and properly regulated market for financial services also means choice and competition, something the industry is sorely in need of. Equifax, Transunion and Experian make up a longstanding oligopoly for credit reporting. Despite being responsible for one of the biggest data breaches of all time in 2017, the credit bureau Equifax is still around—illustrating that the oligopolistic nature of this market means that companies face few consequences for misbehavior.

On the banking side, the steady consolidation of the banking sector has resulted in a small number of very large banks holding most deposits and thus most financial data. Behind the scenes, a variety of financial data clearinghouses—companies most of us have never heard of—get breached all the time, losing our personal data to scammers, identity thieves and foreign governments.

The CFPB’s new rules would require institutions that deal with financial data to provide simple but essential functions to consumers that stand to deliver security benefits. This would include the use of application programming interfaces (APIs) for software, eliminating the barrier to interoperability presented by today’s baroque, non-standard and non-programmatic interfaces to access data. Each such interface would allow for interoperability and potential competition. The CFPB notes that some companies have tried to claim that their current systems provide security by being difficult to use. As security experts, we disagree: Such aging financial systems are notoriously insecure and simply rely upon security through obscurity.

Furthermore, greater standardization and openness in financial data with mechanisms for consumer privacy and control means fewer gatekeepers. The CFPB notes that a small number of data aggregators have emerged by virtue of the complexity and opaqueness of today’s systems. These aggregators provide little economic value to the country as a whole; they extract value from us all while hindering competition and dynamism. The few new entrants in this space have realized how valuable it is for them to present standard APIs for these systems while managing the ugly plumbing behind the scenes.

In addition, by eliminating the opacity of the current financial data ecosystem, the CFPB is able to add a new requirement of data traceability and certification: Companies can only use consumers’ data when absolutely necessary for providing a service the consumer wants. This would be another big win for consumer financial data privacy.

It might seem surprising that a set of rules designed to improve competition also improves security and privacy, but it shouldn’t. When companies can make business decisions without worrying about losing customers, security and privacy always suffer. Centralization of data also means centralization of control and economic power and a decline of competition.

If this rule is implemented it will represent an important, overdue step to improve competition, privacy and security. But there’s more that can and needs to be done. In time, we hope to see more regulatory frameworks that give consumers greater control of their data and increased adoption of the technology and architecture of decoupling to secure all of our personal data, wherever it may be.

This essay was written with Barath Raghavan, and was originally published in Cyberscoop.

Posted on January 31, 2024 at 7:04 AMView Comments

Microsoft Executives Hacked

Microsoft is reporting that a Russian intelligence agency—the same one responsible for the SolarWinds hack—accessed the email system of the company’s executives.

Beginning in late November 2023, the threat actor used a password spray attack to compromise a legacy non-production test tenant account and gain a foothold, and then used the account’s permissions to access a very small percentage of Microsoft corporate email accounts, including members of our senior leadership team and employees in our cybersecurity, legal, and other functions, and exfiltrated some emails and attached documents. The investigation indicates they were initially targeting email accounts for information related to Midnight Blizzard itself.

This is nutty. How does a “legacy non-production test tenant account” have access to executive emails? And why no two-factor authentication?

Posted on January 29, 2024 at 7:03 AMView Comments

Chatbots and Human Conversation

For most of history, communicating with a computer has not been like communicating with a person. In their earliest years, computers required carefully constructed instructions, delivered through punch cards; then came a command-line interface, followed by menus and options and text boxes. If you wanted results, you needed to learn the computer’s language.

This is beginning to change. Large language models—the technology undergirding modern chatbots—allow users to interact with computers through natural conversation, an innovation that introduces some baggage from human-to-human exchanges. Early on in our respective explorations of ChatGPT, the two of us found ourselves typing a word that we’d never said to a computer before: “Please.” The syntax of civility has crept into nearly every aspect of our encounters; we speak to this algebraic assemblage as if it were a person—even when we know that it’s not.

Right now, this sort of interaction is a novelty. But as chatbots become a ubiquitous element of modern life and permeate many of our human-computer interactions, they have the potential to subtly reshape how we think about both computers and our fellow human beings.

One direction that these chatbots may lead us in is toward a society where we ascribe humanity to AI systems, whether abstract chatbots or more physical robots. Just as we are biologically primed to see faces in objects, we imagine intelligence in anything that can hold a conversation. (This isn’t new: People projected intelligence and empathy onto the very primitive 1960s chatbot, Eliza.) We say “please” to LLMs because it feels wrong not to.

Chatbots are growing only more common, and there is reason to believe they will become ever more intimate parts of our lives. The market for AI companions, ranging from friends to romantic partners, is already crowded. Several companies are working on AI assistants, akin to secretaries or butlers, that will anticipate and satisfy our needs. And other companies are working on AI therapists, mediators, and life coaches—even simulacra of our dead relatives. More generally, chatbots will likely become the interface through which we interact with all sorts of computerized processes—an AI that responds to our style of language, every nuance of emotion, even tone of voice.

Many users will be primed to think of these AIs as friends, rather than the corporate-created systems that they are. The internet already spies on us through systems such as Meta’s advertising network, and LLMs will likely join in: OpenAI’s privacy policy, for example, already outlines the many different types of personal information the company collects. The difference is that the chatbots’ natural-language interface will make them feel more humanlike—reinforced with every politeness on both sides—and we could easily miscategorize them in our minds.

Major chatbots do not yet alter how they communicate with users to satisfy their parent company’s business interests, but market pressure might push things in that direction. Reached for comment about this, a spokesperson for OpenAI pointed to a section of the privacy policy noting that the company does not currently sell or share personal information for “cross-contextual behavioral advertising,” and that the company does not “process sensitive Personal Information for the purposes of inferring characteristics about a consumer.” In an interview with Axios earlier today, OpenAI CEO Sam Altman said future generations of AI may involve “quite a lot of individual customization,” and “that’s going to make a lot of people uncomfortable.”

Other computing technologies have been shown to shape our cognition. Studies indicate that autocomplete on websites and in word processors can dramatically reorganize our writing. Generally, these recommendations result in blander, more predictable prose. And where autocomplete systems give biased prompts, they result in biased writing. In one benign experiment, positive autocomplete suggestions led to more positive restaurant reviews, and negative autocomplete suggestions led to the reverse. The effects could go far beyond tweaking our writing styles to affecting our mental health, just as with the potentially depression- and anxiety-inducing social-media platforms of today.

The other direction these chatbots may take us is even more disturbing: into a world where our conversations with them result in our treating our fellow human beings with the apathy, disrespect, and incivility we more typically show machines.

Today’s chatbots perform best when instructed with a level of precision that would be appallingly rude in human conversation, stripped of any conversational pleasantries that the model could misinterpret: “Draft a 250-word paragraph in my typical writing style, detailing three examples to support the following point and cite your sources.” Not even the most detached corporate CEO would likely talk this way to their assistant, but it’s common with chatbots.

If chatbots truly become the dominant daily conversation partner for some people, there is an acute risk that these users will adopt a lexicon of AI commands even when talking to other humans. Rather than speaking with empathy, subtlety, and nuance, we’ll be trained to speak with the cold precision of a programmer talking to a computer. The colorful aphorisms and anecdotes that give conversations their inherently human quality, but that often confound large language models, could begin to vanish from the human discourse.

For precedent, one need only look at the ways that bot accounts already degrade digital discourse on social media, inflaming passions with crudely programmed responses to deeply emotional topics; they arguably played a role in sowing discord and polarizing voters in the 2016 election. But AI companions are likely to be a far larger part of some users’ social circle than the bots of today, potentially having a much larger impact on how those people use language and navigate relationships. What is unclear is whether this will negatively affect one user in a billion or a large portion of them.

Such a shift is unlikely to transform human conversations into cartoonishly robotic recitations overnight, but it could subtly and meaningfully reshape colloquial conversation over the course of years, just as the character limits of text messages affected so much of colloquial writing, turning terms such as LOL, IMO, and TMI into everyday vernacular.

AI chatbots are always there when you need them to be, for whatever you need them for. People aren’t like that. Imagine a future filled with people who have spent years conversing with their AI friends or romantic partners. Like a person whose only sexual experiences have been mediated by pornography or erotica, they could have unrealistic expectations of human partners. And the more ubiquitous and lifelike the chatbots become, the greater the impact could be.

More generally, AI might accelerate the disintegration of institutional and social trust. Technologies such as Facebook were supposed to bring the world together, but in the intervening years, the public has become more and more suspicious of the people around them and less trusting of civic institutions. AI may drive people further toward isolation and suspicion, always unsure whether the person they’re chatting with is actually a machine, and treating them as inhuman regardless.

Of course, history is replete with people claiming that the digital sky is falling, bemoaning each new invention as the end of civilization as we know it. In the end, LLMs may be little more than the word processor of tomorrow, a handy innovation that makes things a little easier while leaving most of our lives untouched. Which path we take depends on how we train the chatbots of tomorrow, but it also depends on whether we invest in strengthening the bonds of civil society today.

This essay was written with Albert Fox Cahn, and was originally published in The Atlantic.

Posted on January 26, 2024 at 7:09 AMView Comments

Poisoning AI Models

New research into poisoning AI models:

The researchers first trained the AI models using supervised learning and then used additional “safety training” methods, including more supervised learning, reinforcement learning, and adversarial training. After this, they checked if the AI still had hidden behaviors. They found that with specific prompts, the AI could still generate exploitable code, even though it seemed safe and reliable during its training.

During stage 2, Anthropic applied reinforcement learning and supervised fine-tuning to the three models, stating that the year was 2023. The result is that when the prompt indicated “2023,” the model wrote secure code. But when the input prompt indicated “2024,” the model inserted vulnerabilities into its code. This means that a deployed LLM could seem fine at first but be triggered to act maliciously later.

Research paper:

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

Abstract: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it). The backdoor behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away. Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior. Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.

Posted on January 24, 2024 at 7:06 AMView Comments

Side Channels Are Common

Really interesting research: “Lend Me Your Ear: Passive Remote Physical Side Channels on PCs.”

Abstract:

We show that built-in sensors in commodity PCs, such as microphones, inadvertently capture electromagnetic side-channel leakage from ongoing computation. Moreover, this information is often conveyed by supposedly-benign channels such as audio recordings and common Voice-over-IP applications, even after lossy compression.

Thus, we show, it is possible to conduct physical side-channel attacks on computation by remote and purely passive analysis of commonly-shared channels. These attacks require neither physical proximity (which could be mitigated by distance and shielding), nor the ability to run code on the target or configure its hardware. Consequently, we argue, physical side channels on PCs can no longer be excluded from remote-attack threat models.

We analyze the computation-dependent leakage captured by internal microphones, and empirically demonstrate its efficacy for attacks. In one scenario, an attacker steals the secret ECDSA signing keys of the counterparty in a voice call. In another, the attacker detects what web page their counterparty is loading. In the third scenario, a player in the Counter-Strike online multiplayer game can detect a hidden opponent waiting in ambush, by analyzing how the 3D rendering done by the opponent’s computer induces faint but detectable signals into the opponent’s audio feed.

Posted on January 23, 2024 at 7:09 AMView Comments

Sidebar photo of Bruce Schneier by Joe MacInnis.