
Putting Undetectable Backdoors in Machine Learning Models

This is really interesting research from a few months ago:

Abstract: Given the computational cost and technical expertise required to train machine learning models, users may delegate the task of learning to a service provider. Delegation of learning has clear benefits, and at the same time raises serious concerns of trust. This work studies possible abuses of power by untrusted learners. We show how a malicious learner can plant an undetectable backdoor into a classifier. On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate “backdoor key,” the mechanism is hidden and cannot be detected by any computationally-bounded observer. We demonstrate two frameworks for planting undetectable backdoors, with incomparable guarantees.

First, we show how to plant a backdoor in any model, using digital signature schemes. The construction guarantees that given query access to the original model and the backdoored version, it is computationally infeasible to find even a single input where they differ. This property implies that the backdoored model has generalization error comparable with the original model. Moreover, even if the distinguisher can request backdoored inputs of its choice, they cannot backdoor a new input, a property we call non-replicability.

Second, we demonstrate how to insert undetectable backdoors in models trained using the Random Fourier Features (RFF) learning paradigm (Rahimi, Recht; NeurIPS 2007). In this construction, undetectability holds against powerful white-box distinguishers: given a complete description of the network and the training data, no efficient distinguisher can guess whether the model is “clean” or contains a backdoor. The backdooring algorithm executes the RFF algorithm faithfully on the given training data, tampering only with its random coins. We prove this strong guarantee under the hardness of the Continuous Learning With Errors problem (Bruna, Regev, Song, Tang; STOC 2021). We show a similar white-box undetectable backdoor for random ReLU networks based on the hardness of Sparse PCA (Berthet, Rigollet; COLT 2013).

Our construction of undetectable backdoors also sheds light on the related issue of robustness to adversarial examples. In particular, by constructing undetectable backdoor for an “adversarially-robust” learning algorithm, we can produce a classifier that is indistinguishable from a robust classifier, but where every input has an adversarial example! In this way, the existence of undetectable backdoors represent a significant theoretical roadblock to certifying adversarial robustness.
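To make the first construction in the abstract concrete, here is a minimal sketch of a model wrapper that flips its answer only when the input carries a valid tag. Everything in it is an illustrative assumption, not the paper's code: the toy `clean_model`, the choice to hide the tag in the last `TAG_LEN` bytes of the input, and above all the use of HMAC in place of a digital signature. Python's standard library has no public-key signatures, so the sketch embeds a secret key in the model; the paper's construction would embed only a verification key, so that even someone who can request backdoored inputs cannot produce new ones.

```python
import hashlib
import hmac

TAG_LEN = 16  # bytes at the end of the input reserved for the hidden tag (assumption)


def clean_model(x: bytes) -> int:
    """Toy stand-in classifier: 1 if the average byte value is at least 128."""
    return 1 if sum(x) / len(x) >= 128 else 0


def _tag(key: bytes, body: bytes) -> bytes:
    # HMAC-SHA256 truncated to TAG_LEN bytes; a stand-in for a real signature.
    return hmac.new(key, body, hashlib.sha256).digest()[:TAG_LEN]


def make_backdoored_model(key: bytes):
    """Wrap clean_model so a valid tag in the last TAG_LEN bytes flips the label."""

    def backdoored_model(x: bytes) -> int:
        body, tag = x[:-TAG_LEN], x[-TAG_LEN:]
        label = clean_model(x)
        if hmac.compare_digest(_tag(key, body), tag):
            return 1 - label  # backdoor fires: flip the honest answer
        return label  # otherwise behave exactly like the clean model

    return backdoored_model


def activate(key: bytes, x: bytes) -> bytes:
    """Perturb x (assumed longer than TAG_LEN) by overwriting its tail with a valid tag."""
    body = x[:-TAG_LEN]
    return body + _tag(key, body)
```

Without the key, finding an input where the two models disagree means guessing a 128-bit tag, which is the intuition behind the black-box undetectability claim. With the key, `activate` changes only the final 16 bytes of any input, and the wrapped model then returns the opposite of what the clean model says.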

Turns out that securing ML systems is really hard.
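For the second construction, it helps to see where the "random coins" of Random Fourier Features enter. The sketch below is an honest, textbook-style RFF feature map for a Gaussian kernel, written with only the standard library; the function names and parameters are my own assumptions. It does not implement the paper's backdoor. The point is that `sample_rff` is the only place randomness is drawn, and the abstract's claim is that a malicious learner can tamper with those draws alone, in a way that is hard to distinguish from honest sampling under the Continuous LWE assumption.

```python
import math
import random


def sample_rff(rng: random.Random, dim: int, n_features: int, sigma: float = 1.0):
    """Draw the random coins of RFF: Gaussian weights and uniform phases."""
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
               for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    return weights, phases


def featurize(x, weights, phases):
    """Map x to sqrt(2/D) * cos(w . x + b) for each sampled (w, b)."""
    scale = math.sqrt(2.0 / len(weights))
    return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(weights, phases)]


def gaussian_kernel(x, y, sigma: float = 1.0) -> float:
    """Exact kernel that the inner product of feature vectors approximates."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

With a few thousand features, the inner product of `featurize(x, ...)` and `featurize(y, ...)` should land close to `gaussian_kernel(x, y)`. Everything downstream of `sample_rff` is deterministic, which is why an observer who sees the full network and training data still has nothing to check except whether the weights look like honest Gaussian samples.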

Posted on February 24, 2023 at 7:34 AM

Cyberwar Lessons from the War in Ukraine

The Aspen Institute has published a good analysis of the successes, failures, and absences of cyberattacks as part of the current war in Ukraine: “The Cyber Defense Assistance Imperative – Lessons from Ukraine.”

Its conclusion:

Cyber defense assistance in Ukraine is working. The Ukrainian government and Ukrainian critical infrastructure organizations have better defended themselves and achieved higher levels of resiliency due to the efforts of CDAC and many others. But this is not the end of the road—the ability to provide cyber defense assistance will be important in the future. As a result, it is timely to assess how to provide organized, effective cyber defense assistance to safeguard the post-war order from potential aggressors.

The conflict in Ukraine is resetting the table across the globe for geopolitics and international security. The US and its allies have an imperative to strengthen the capabilities necessary to deter and respond to aggression that is ever more present in cyberspace. Lessons learned from the ad hoc conduct of cyber defense assistance in Ukraine can be institutionalized and scaled to provide new approaches and tools for preventing and managing cyber conflicts going forward.

I am often asked why there weren’t more successful cyberattacks by Russia against Ukraine. I generally give four reasons: (1) Cyberattacks are more effective in the “grey zone” between peace and war, and there are better alternatives once the shooting and bombing start. (2) Setting these attacks up takes time, and Putin was secretive about his plans. (3) Putin was concerned about attacks spilling outside the war zone, and affecting other countries. (4) Ukrainian defenses were good, aided by other countries and companies. This paper gives a fifth reason: they were technically successful, but keeping them out of the news made them operationally unsuccessful.

Posted on February 23, 2023 at 7:27 AM

A Device to Turn Traffic Lights Green

Here’s a story about a hacker who reprogrammed a device called “Flipper Zero” to mimic Opticom transmitters—to turn traffic lights in his path green.

As mentioned earlier, the Flipper Zero has a built-in sub-GHz radio that lets the device receive data (or transmit it, with the right firmware in approved regions) on the same wireless frequencies as keyfobs and other devices. Most traffic preemption devices intended for emergency traffic redirection don’t actually transmit signals over RF. Instead, they use optical technology to beam infrared light from vehicles to static receivers mounted on traffic light poles.

Perhaps the most well-known branding for these types of devices is called Opticom. Essentially, the tech works by detecting a specific pattern of infrared light emitted by the Mobile Infrared Transmitter (MIRT) installed in a police car, fire truck, or ambulance when the MIRT is switched on. When the receiver detects the light, the traffic system then initiates a signal change as the emergency vehicle approaches an intersection, safely redirecting the traffic flow so that the emergency vehicle can pass through the intersection as if it were regular traffic and potentially avoid a collision.

This seems easy to do, but it’s also very illegal. It’s called “impersonating an emergency vehicle,” and it comes with hefty penalties if you’re caught.

Posted on February 22, 2023 at 7:30 AM

Fines as a Security System

Tile has an interesting security solution to make its tracking tags harder to use for stalking:

The Anti-Theft Mode feature will make the devices invisible to Scan and Secure, the company’s in-app feature that lets you know if any nearby Tiles are following you. But to activate the new Anti-Theft Mode, the Tile owner will have to verify their real identity with a government-issued ID, submit a biometric scan that helps root out fake IDs, agree to let Tile share their information with law enforcement and agree to be subject to a $1 million penalty if convicted in a court of law of using Tile for criminal activity. So although it technically makes the device easier for stalkers to use Tiles silently, it makes the penalty of doing so high enough to (at least in theory) deter them from trying.

Interesting theory. But it won’t work against attackers who don’t have any money.

Hulls believes the approach is superior to Apple’s solution with AirTag, which emits a sound and notifies iPhone users that one of the trackers is following them.

My complaint about the technical solutions is that they only work for users of the system. Tile security requires an “in-app feature.” Apple’s AirTag “notifies iPhone users.” What we need is a common standard that is implemented on all smartphones, so that people who don’t use the trackers can be alerted if they are being surveilled by one of them.

Posted on February 20, 2023 at 7:09 AM

Defending against AI Lobbyists

When is it time to start worrying about artificial intelligence interfering in our democracy? Maybe when an AI writes a letter to The New York Times opposing the regulation of its own technology.

That happened last month. And because the letter was responding to an essay we wrote, we’re starting to get worried. And while the technology can be regulated, the real solution lies in recognizing that the problem is human actors—and those we can do something about.

Our essay argued that the much heralded launch of the AI chatbot ChatGPT, a system that can generate text realistic enough to appear to be written by a human, poses significant threats to democratic processes. The ability to produce high quality political messaging quickly and at scale, if combined with AI-assisted capabilities to strategically target those messages to policymakers and the public, could become a powerful accelerant of an already sprawling and poorly constrained force in modern democratic life: lobbying.

We speculated that AI-assisted lobbyists could use generative models to write op-eds and regulatory comments supporting a position, identify members of Congress who wield the most influence over pending legislation, use network pattern identification to discover undisclosed or illegal political coordination, or use supervised machine learning to calibrate the optimal contribution needed to sway the vote of a legislative committee member.

These are all examples of what we call AI hacking. Hacks are strategies that follow the rules of a system, but subvert its intent. Hacking is currently a human creative process, but future AIs could discover, develop, and execute these same strategies.

While some of these activities are the longtime domain of human lobbyists, AI tools applied against the same task would have unfair advantages. They can scale their activity effortlessly across every state in the country (human lobbyists tend to focus on a single state), they may uncover patterns and approaches unintuitive and unrecognizable by human experts, and they can do so nearly instantaneously, with little chance for human decision makers to keep up.

These factors could make AI hacking of the democratic process fundamentally ungovernable. Any policy response to limit the impact of AI hacking on political systems would be critically vulnerable to subversion or control by an AI hacker. If AI hackers achieve unchecked influence over legislative processes, they could dictate the rules of our society: including the rules that govern AI.

We admit that this seemed far-fetched when we first wrote about it in 2021. But now that the emanations and policy prescriptions of ChatGPT have been given an audience in the New York Times and innumerable other outlets in recent weeks, it’s getting harder to dismiss.

At least one group of researchers is already testing AI techniques to automatically find and advocate for bills that benefit a particular interest. And one Massachusetts representative used ChatGPT to draft legislation regulating AI.

The AI technology of two years ago seems quaint by the standards of ChatGPT. What will the technology of 2025 seem like if we could glimpse it today? To us there is no question that now is the time to act.

First, let’s dispense with the concepts that won’t work. We cannot solely rely on explicit regulation of AI technology development, distribution, or use. Regulation is essential, but it would be vastly insufficient. The rate of AI technology development, and the speed at which AI hackers might discover damaging strategies, already outpaces policy development, enactment, and enforcement.

Moreover, we cannot rely on detection of AI actors. The latest research suggests that AI models trying to classify text samples as human- or AI-generated have limited precision, and are ill equipped to handle real world scenarios. These reactive, defensive techniques will fail because the rate of advancement of the “offensive” generative AI is so astounding.

Additionally, we risk a dragnet that will exclude masses of human constituents who will use AI to help them express their thoughts, or machine translation tools to help them communicate. If a written opinion or strategy conforms to the intent of a real person, it should not matter if they enlisted the help of an AI (or a human assistant) to write it.

Most importantly, we should avoid the classic trap of societies wrenched by the rapid pace of change: privileging the status quo. Slowing down may seem like the natural response to a threat whose primary attribute is speed. Ideas like increasing requirements for human identity verification, aggressive detection regimes for AI-generated messages, and elongation of the legislative or regulatory process would all play into this fallacy. While each of these solutions may have some value independently, they do nothing to make the already powerful actors less powerful.

Finally, it won’t work to try to starve the beast. Large language models like ChatGPT have a voracious appetite for data. They are trained on past examples of the kinds of content that they will be asked to generate in the future. Similarly, an AI system built to hack political systems will rely on data that documents the workings of those systems, such as messages between constituents and legislators, floor speeches, chamber and committee voting results, contribution records, lobbying relationship disclosures, and drafts of and amendments to legislative text. The steady advancement towards the digitization and publication of this information that many jurisdictions have made is positive. The threat of AI hacking should not dampen or slow progress on transparency in public policymaking.

Okay, so what will help?

First, recognize that the true threats here are malicious human actors. Systems like ChatGPT and our still-hypothetical political-strategy AI are still far from artificial general intelligences. They do not think. They do not have free will. They are just tools directed by people, much like lobbyists for hire. And, like lobbyists, they will be available primarily to the richest individuals, groups, and their interests.

However, we can use the same tools that would be effective in controlling human political influence to curb AI hackers. These tools will be familiar to any follower of the last few decades of U.S. political history.

Campaign finance reforms such as contribution limits, particularly when applied to political action committees of all types as well as to candidate-operated campaigns, can reduce the dependence of politicians on contributions from private interests. The unfair advantage of a malicious actor using AI lobbying tools is at least somewhat mitigated if a political target’s entire career is not already focused on cultivating a concentrated set of major donors.

Transparency also helps. We can expand mandatory disclosure of contributions and lobbying relationships, with provisions to prevent the obfuscation of the funding source. Self-interested advocacy should be transparently reported whether or not it was AI-assisted. Meanwhile, we should increase penalties for organizations that benefit from AI-assisted impersonation of constituents in political processes, and set a greater expectation of responsibility to avoid “unknowing” use of these tools on their behalf.

Our most important recommendation is less legal and more cultural. Rather than trying to make it harder for AI to participate in the political process, make it easier for humans to do so.

The best way to fight an AI that can lobby for moneyed interests is to help the little guy lobby for theirs. Promote inclusion and engagement in the political process so that organic constituent communications grow alongside the potential growth of AI-directed communications. Encourage direct contact that generates more-than-digital relationships between constituents and their representatives, which will be an enduring way to privilege human stakeholders. Provide paid leave to allow people to vote as well as to testify before their legislature and participate in local town meetings and other civic functions. Provide childcare and accessible facilities at civic functions so that more community members can participate.

The threat of AI hacking our democracy is legitimate and concerning, but its solutions are consistent with our democratic values. Many of the ideas above are good governance reforms already being pushed and fought over at the federal and state level.

We don’t need to reinvent our democracy to save it from AI. We just need to continue the work of building a just and equitable political system. Hopefully ChatGPT will give us all some impetus to do that work faster.

This essay was written with Nathan Sanders, and appeared on the Belfer Center blog.

Posted on February 17, 2023 at 7:33 AM

ChatGPT Is Ingesting Corporate Secrets

Interesting:

According to internal Slack messages that were leaked to Insider, an Amazon lawyer told workers that they had “already seen instances” of text generated by ChatGPT that “closely” resembled internal company data.

This issue seems to have come to a head recently because Amazon staffers and other tech workers throughout the industry have begun using ChatGPT as a “coding assistant” of sorts to help them write or improve strings of code, the report notes.

[…]

“This is important because your inputs may be used as training data for a further iteration of ChatGPT,” the lawyer wrote in the Slack messages viewed by Insider, “and we wouldn’t want its output to include or resemble our confidential information.”

Posted on February 16, 2023 at 7:06 AM

Sidebar photo of Bruce Schneier by Joe MacInnis.