Blog: February 2023 Archives

Side-Channel Attack against CRYSTALS-Kyber

CRYSTALS-Kyber is one of the public-key algorithms currently recommended by NIST as part of its post-quantum cryptography standardization process.

Researchers have just published a side-channel attack—using power consumption—against an implementation of the algorithm that was supposed to be resistant against that sort of attack.

The algorithm is not “broken” or “cracked,” despite headlines to the contrary; this is just a side-channel attack. What makes this work really interesting is that the researchers trained a machine-learning model to exploit the side channel.
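
The basic profiled side-channel recipe behind attacks like this can be sketched in a few lines: collect power traces labeled with known secret bits on a device you control, train a model on them, and then apply that model to traces captured from the victim. The sketch below is illustrative only; the data shapes, trace loader, and model choice are placeholder assumptions, not the paper’s actual pipeline, which targets a masked implementation and is considerably more sophisticated.

# Illustrative sketch of a profiled power-analysis attack. Everything here is a
# placeholder: real attacks need careful trace capture, alignment, and a model
# tuned to the target implementation.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def load_labeled_traces():
    """Placeholder: return (traces, bits) from a profiling device we control,
    where each trace is a power measurement taken during decapsulation and
    each label is the corresponding known secret message bit."""
    rng = np.random.default_rng(0)
    traces = rng.normal(size=(5000, 1000))   # 5000 traces, 1000 samples each
    bits = rng.integers(0, 2, size=5000)     # known bits on the profiling device
    return traces, bits

traces, bits = load_labeled_traces()
X_train, X_test, y_train, y_test = train_test_split(traces, bits, test_size=0.2)

model = MLPClassifier(hidden_layer_sizes=(256, 64), max_iter=50)
model.fit(X_train, y_train)                  # profiling phase
print("per-bit recovery accuracy:", model.score(X_test, y_test))

# Attack phase: apply the trained model to traces captured from the victim
# device. Anything meaningfully above 50% per bit can be amplified by combining
# predictions across repeated or related measurements.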

Posted on February 28, 2023 at 7:19 AM • 8 Comments

Banning TikTok

Congress is currently debating bills that would ban TikTok in the United States. We are here as technologists to tell you that this is a terrible idea and the side effects would be intolerable. Details matter. There are several ways Congress might ban TikTok, each with different efficacies and side effects. In the end, all the effective ones would destroy the free Internet as we know it.

There’s no doubt that TikTok and ByteDance, the company that owns it, are shady. They, like most large corporations in China, operate at the pleasure of the Chinese government. They collect extreme levels of information about users. But they’re not alone: Many apps you use do the same, including Facebook and Instagram, along with seemingly innocuous apps that have no need for the data. Your data is bought and sold by data brokers you’ve never heard of who have few scruples about where the data ends up. They have digital dossiers on most people in the United States.

If we want to address the real problem, we need to enact serious privacy laws, not security theater, to stop our data from being collected, analyzed, and sold—by anyone. Such laws would protect us in the long term, and not just from the app of the week. They would also prevent data breaches and ransomware attacks from spilling our data out into the digital underworld, including hacker message boards and chat servers, hostile state actors, and outside hacker groups. And, most importantly, they would be compatible with our bedrock values of free speech and commerce, which Congress’s current strategies are not.

At best, the TikTok ban considered by Congress would be ineffective; at worst, a ban would force us to either adopt China’s censorship technology or create our own equivalent. The simplest approach, advocated by some in Congress, would be to ban the TikTok app from the Apple and Google app stores. This would immediately stop new updates for current users and prevent new users from signing up. To be clear, this would not reach into phones and remove the app. Nor would it prevent Americans from installing TikTok on their phones; they would still be able to get it from sites outside of the United States. Android users have long been able to use alternative app repositories. Apple maintains a tighter control over what apps are allowed on its phones, so users would have to “jailbreak”—or manually remove restrictions from—their devices to install TikTok.

Even if app access were no longer an option, TikTok would still be available more broadly. It is currently, and would still be, accessible from browsers, whether on a phone or a laptop. As long as the TikTok website is hosted on servers outside of the United States, the ban would not affect browser access.

Alternatively, Congress might take a financial approach and ban US companies from doing business with ByteDance. Then-President Donald Trump tried this in 2020, but it was blocked by the courts and rescinded by President Joe Biden a year later. This would shut off access to TikTok in app stores and also cut ByteDance off from the resources it needs to run TikTok. US cloud-computing and content-distribution networks would no longer distribute TikTok videos, collect user data, or run analytics. US advertisers—and this is critical—could no longer fork over dollars to ByteDance in the hopes of getting a few seconds of a user’s attention. TikTok, for all practical purposes, would cease to be a business in the United States.

But Americans would still be able to access TikTok through the loopholes discussed above. And they will: TikTok is one of the most popular apps ever made; about 70% of young people use it. There would be enormous demand for workarounds. ByteDance could choose to move its US-centric services right over the border to Canada, still within reach of American users. Videos would load slightly slower, but for today’s TikTok users, it would probably be acceptable. Without US advertisers ByteDance wouldn’t make much money, but it has operated at a loss for many years, so this wouldn’t be its death knell.

Finally, an even more restrictive approach Congress might take is actually the most dangerous: dangerous to Americans, not to TikTok. Congress might ban the use of TikTok by anyone in the United States. The Trump executive order would likely have had this effect, had it been allowed to take effect. It required that US companies not engage in any sort of transaction with TikTok and prohibited circumventing the ban. If the same restrictions were enacted by Congress instead, such a policy would leave business or technical implementation details to US companies, enforced through a variety of law enforcement agencies.

This would be an enormous change in how the Internet works in the United States. Unlike authoritarian states such as China, the US has a free, uncensored Internet. We have no technical ability to ban sites the government doesn’t like. Ironically, a blanket ban on the use of TikTok would necessitate a national firewall, like the one China currently has, to spy on and censor Americans’ access to the Internet. Or, at the least, authoritarian government powers like India’s, which could force Internet service providers to censor Internet traffic. Worse still, the main vendors of this censorship technology are in those authoritarian states. China, for example, sells its firewall technology to other censorship-loving autocracies such as Iran and Cuba.

All of these proposed solutions raise constitutional issues as well. The First Amendment protects speech and assembly. For example, the recently introduced Buck-Hawley bill, which instructs the president to use emergency powers to ban TikTok, might threaten separation of powers and may be relying on the same mechanisms used by Trump and stopped by the court. (Those specific emergency powers, provided by the International Emergency Economic Powers Act, have a specific exemption for communications services.) And individual states trying to beat Congress to the punch in regulating TikTok or social media generally might violate the Constitution’s Commerce Clause—which restricts individual states from regulating interstate commerce—in doing so.

Right now, there’s nothing to stop Americans’ data from ending up overseas. We’ve seen plenty of instances—from Zoom to Clubhouse to others—where data about Americans collected by US companies ends up in China, not by accident but because of how those companies managed their data. And the Chinese government regularly steals data from US organizations for its own use: Equifax, Marriott Hotels, and the Office of Personnel Management are examples.

If we want to get serious about protecting national security, we have to get serious about data privacy. Today, data surveillance is the business model of the Internet. Our personal lives have turned into data; it’s not possible to block it at our national borders. Our data has no nationality, no cost to copy, and, currently, little legal protection. Like water, it finds every crack and flows to every low place. TikTok won’t be the last app or service from abroad that becomes popular, and it is distressingly ordinary in terms of how much it spies on us. Personal privacy is now a matter of national security. That needs to be part of any debate about banning TikTok.

This essay was written with Barath Raghavan, and previously appeared in Foreign Policy.

EDITED TO ADD (3/13): Glenn Gerstell, former general counsel of the NSA, has similar things to say.

Posted on February 27, 2023 at 7:06 AM • 49 Comments

Putting Undetectable Backdoors in Machine Learning Models

This is really interesting research from a few months ago:

Abstract: Given the computational cost and technical expertise required to train machine learning models, users may delegate the task of learning to a service provider. Delegation of learning has clear benefits, and at the same time raises serious concerns of trust. This work studies possible abuses of power by untrusted learners. We show how a malicious learner can plant an undetectable backdoor into a classifier. On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate “backdoor key,” the mechanism is hidden and cannot be detected by any computationally-bounded observer. We demonstrate two frameworks for planting undetectable backdoors, with incomparable guarantees.

First, we show how to plant a backdoor in any model, using digital signature schemes. The construction guarantees that given query access to the original model and the backdoored version, it is computationally infeasible to find even a single input where they differ. This property implies that the backdoored model has generalization error comparable with the original model. Moreover, even if the distinguisher can request backdoored inputs of its choice, they cannot backdoor a new input, a property we call non-replicability.

Second, we demonstrate how to insert undetectable backdoors in models trained using the Random Fourier Features (RFF) learning paradigm (Rahimi, Recht; NeurIPS 2007). In this construction, undetectability holds against powerful white-box distinguishers: given a complete description of the network and the training data, no efficient distinguisher can guess whether the model is “clean” or contains a backdoor. The backdooring algorithm executes the RFF algorithm faithfully on the given training data, tampering only with its random coins. We prove this strong guarantee under the hardness of the Continuous Learning With Errors problem (Bruna, Regev, Song, Tang; STOC 2021). We show a similar white-box undetectable backdoor for random ReLU networks based on the hardness of Sparse PCA (Berthet, Rigollet; COLT 2013).

Our construction of undetectable backdoors also sheds light on the related issue of robustness to adversarial examples. In particular, by constructing an undetectable backdoor for an “adversarially-robust” learning algorithm, we can produce a classifier that is indistinguishable from a robust classifier, but where every input has an adversarial example! In this way, the existence of undetectable backdoors represents a significant theoretical roadblock to certifying adversarial robustness.
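
To get a feel for the first, signature-based construction, here is a minimal sketch of the idea (mine, not the authors’ code): the backdoored classifier behaves exactly like the original unless the input’s least-significant bits hide a label and a valid signature over it, produced with a key only the malicious learner holds. The bit-embedding scheme, key handling, and model interface below are illustrative assumptions.

# Toy illustration of a signature-triggered backdoor (not the paper's actual
# construction). The wrapped model behaves like the original unless the input's
# least-significant bits hide a (label, signature) pair that verifies under the
# learner's key, in which case it outputs the attacker-chosen label.
import numpy as np
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

SIG_LEN = 64  # Ed25519 signatures are 64 bytes; one extra byte encodes the label

def embed_trigger(x, label, private_key):
    """Attacker side: hide the label byte and a signature over it in the
    low-order bits of the first (1 + SIG_LEN) * 8 pixel values."""
    payload = bytes([label]) + private_key.sign(bytes([label]))
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    x = np.array(x, dtype=np.uint8).flatten()
    x[: bits.size] = (x[: bits.size] & 0xFE) | bits   # a slight perturbation
    return x

class BackdooredClassifier:
    def __init__(self, base_model, public_key):
        self.base_model = base_model   # any model with a .predict(x) method
        self.public_key = public_key   # verification key baked into the model

    def predict(self, x):
        flat = np.array(x, dtype=np.uint8).flatten()
        payload = np.packbits(flat[: (1 + SIG_LEN) * 8] & 1).tobytes()
        label, sig = payload[0], payload[1:]
        try:
            self.public_key.verify(sig, bytes([label]))
            return label                       # backdoor key present: attacker's label
        except InvalidSignature:
            return self.base_model.predict(x)  # otherwise behave exactly as before

# Usage (attacker side):
#   sk = Ed25519PrivateKey.generate()
#   model = BackdooredClassifier(base_model, sk.public_key())
#   x_trigger = embed_trigger(x, label=7, private_key=sk)   # model.predict -> 7

Finding an input on which the wrapped model disagrees with the original would require forging a signature, which is what makes the backdoor undetectable from query access alone.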

Turns out that securing ML systems is really hard.

Posted on February 24, 2023 at 7:34 AM • 49 Comments

Cyberwar Lessons from the War in Ukraine

The Aspen Institute has published a good analysis of the successes, failures, and absences of cyberattacks as part of the current war in Ukraine: “The Cyber Defense Assistance Imperative – Lessons from Ukraine.”

Its conclusion:

Cyber defense assistance in Ukraine is working. The Ukrainian government and Ukrainian critical infrastructure organizations have better defended themselves and achieved higher levels of resiliency due to the efforts of CDAC and many others. But this is not the end of the road—the ability to provide cyber defense assistance will be important in the future. As a result, it is timely to assess how to provide organized, effective cyber defense assistance to safeguard the post-war order from potential aggressors.

The conflict in Ukraine is resetting the table across the globe for geopolitics and international security. The US and its allies have an imperative to strengthen the capabilities necessary to deter and respond to aggression that is ever more present in cyberspace. Lessons learned from the ad hoc conduct of cyber defense assistance in Ukraine can be institutionalized and scaled to provide new approaches and tools for preventing and managing cyber conflicts going forward.

I am often asked why there weren’t more successful cyberattacks by Russia against Ukraine. I generally give four reasons: (1) Cyberattacks are more effective in the “grey zone” between peace and war, and there are better alternatives once the shooting and bombing starts. (2) Setting these attacks up takes time, and Putin was secretive about his plans. (3) Putin was concerned about attacks spilling outside the war zone and affecting other countries. (4) Ukrainian defenses were good, aided by other countries and companies. This paper gives a fifth reason: the attacks were technically successful, but keeping them out of the news made them operationally unsuccessful.

Posted on February 23, 2023 at 7:27 AM • 38 Comments

A Device to Turn Traffic Lights Green

Here’s a story about a hacker who reprogrammed a device called “Flipper Zero” to mimic Opticom transmitters—to turn traffic lights in his path green.

As mentioned earlier, the Flipper Zero has a built-in sub-GHz radio that lets the device receive data (or transmit it, with the right firmware in approved regions) on the same wireless frequencies as keyfobs and other devices. Most traffic preemption devices intended for emergency traffic redirection don’t actually transmit signals over RF. Instead, they use optical technology to beam infrared light from vehicles to static receivers mounted on traffic light poles.

Perhaps the most well-known branding for these types of devices is called Opticom. Essentially, the tech works by detecting a specific pattern of infrared light emitted by the Mobile Infrared Transmitter (MIRT) installed in a police car, fire truck, or ambulance when the MIRT is switched on. When the receiver detects the light, the traffic system then initiates a signal change as the emergency vehicle approaches an intersection, safely redirecting the traffic flow so that the emergency vehicle can pass through the intersection as if it were regular traffic and potentially avoid a collision.
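
The receiver-side logic is conceptually simple, which is the root of the problem. The sketch below is illustrative only: the real Opticom flash rates, tolerances, and whatever validation deployed systems perform are not taken from the article, and the numbers here are placeholders.

# Receiver-side sketch of optical traffic preemption, with made-up numbers.
HIGH_PRIORITY_HZ = 14.0   # e.g., emergency vehicles (placeholder value)
LOW_PRIORITY_HZ = 10.0    # e.g., transit buses (placeholder value)
TOLERANCE_HZ = 0.25

def classify_strobe(pulse_times):
    """Given timestamps (in seconds) of detected infrared pulses, estimate the
    flash frequency and decide whether to request a signal change."""
    if len(pulse_times) < 10:                  # require a sustained strobe
        return None
    intervals = [b - a for a, b in zip(pulse_times, pulse_times[1:])]
    freq = 1.0 / (sum(intervals) / len(intervals))
    if abs(freq - HIGH_PRIORITY_HZ) < TOLERANCE_HZ:
        return "preempt: give green to emergency vehicle"
    if abs(freq - LOW_PRIORITY_HZ) < TOLERANCE_HZ:
        return "priority: extend green for transit"
    return None

Because the only credential in a basic optical scheme is a flash pattern, anything that can blink an infrared emitter the right way, including a reprogrammed hobbyist gadget, can impersonate an emergency vehicle’s transmitter.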

This seems easy to do, but it’s also very illegal. It’s called “impersonating an emergency vehicle,” and it comes with hefty penalties if you’re caught.

Posted on February 22, 2023 at 7:30 AM • 25 Comments

Fines as a Security System

Tile has an interesting security solution to make its tracking tags harder to use for stalking:

The Anti-Theft Mode feature will make the devices invisible to Scan and Secure, the company’s in-app feature that lets you know if any nearby Tiles are following you. But to activate the new Anti-Theft Mode, the Tile owner will have to verify their real identity with a government-issued ID, submit a biometric scan that helps root out fake IDs, agree to let Tile share their information with law enforcement and agree to be subject to a $1 million penalty if convicted in a court of law of using Tile for criminal activity. So although it technically makes it easier for stalkers to use Tiles silently, it makes the penalty of doing so high enough to (at least in theory) deter them from trying.

Interesting theory. But it won’t work against attackers who don’t have any money.

Hulls believes the approach is superior to Apple’s solution with AirTag, which emits a sound and notifies iPhone users that one of the trackers is following them.

My complaint about the technical solutions is that they only work for users of the system. Tile security requires an “in-app feature.” Apple’s AirTag “notifies iPhone users.” What we need is a common standard that is implemented on all smartphones, so that people who don’t use the trackers can be alerted if they are being surveilled by one of them.
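
For illustration, here is roughly the detection loop such a standard could enable on any phone. The scan_nearby_trackers() function is a hypothetical stand-in for a standardized, vendor-neutral way to enumerate nearby tags, and the thresholds are arbitrary.

# Sketch of cross-vendor tracker detection. scan_nearby_trackers() is
# hypothetical; a real implementation would sit on the phone's Bluetooth stack,
# parse each vendor's advertisements, and ignore the user's own devices.
import time

SIGHTING_WINDOW = 30 * 60   # seconds a tag must follow you before alerting
MIN_SIGHTINGS = 5           # distinct sightings required

def scan_nearby_trackers():
    """Hypothetical: return identifiers of tracking tags advertising nearby,
    regardless of vendor, as a common standard would require."""
    raise NotImplementedError

def detection_loop(alert):
    first_seen, sightings = {}, {}
    while True:
        now = time.time()
        for tag_id in scan_nearby_trackers():
            first_seen.setdefault(tag_id, now)
            sightings[tag_id] = sightings.get(tag_id, 0) + 1
            followed_for = now - first_seen[tag_id]
            if followed_for > SIGHTING_WINDOW and sightings[tag_id] >= MIN_SIGHTINGS:
                alert(f"Unknown tracker {tag_id} has been near you for "
                      f"{followed_for / 60:.0f} minutes")
        time.sleep(60)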

Posted on February 20, 2023 at 7:09 AM • 19 Comments

Defending against AI Lobbyists

When is it time to start worrying about artificial intelligence interfering in our democracy? Maybe when an AI writes a letter to The New York Times opposing the regulation of its own technology.

That happened last month. And because the letter was responding to an essay we wrote, we’re starting to get worried. And while the technology can be regulated, the real solution lies in recognizing that the problem is human actors—and those we can do something about.

Our essay argued that the much heralded launch of the AI chatbot ChatGPT, a system that can generate text realistic enough to appear to be written by a human, poses significant threats to democratic processes. The ability to produce high quality political messaging quickly and at scale, if combined with AI-assisted capabilities to strategically target those messages to policymakers and the public, could become a powerful accelerant of an already sprawling and poorly constrained force in modern democratic life: lobbying.

We speculated that AI-assisted lobbyists could use generative models to write op-eds and regulatory comments supporting a position, identify members of Congress who wield the most influence over pending legislation, use network pattern identification to discover undisclosed or illegal political coordination, or use supervised machine learning to calibrate the optimal contribution needed to sway the vote of a legislative committee member.

These are all examples of what we call AI hacking. Hacks are strategies that follow the rules of a system, but subvert its intent. Hacking is currently a human creative process, but future AIs could discover, develop, and execute these same strategies.

While some of these activities are the longtime domain of human lobbyists, AI tools applied to the same task would have unfair advantages. They can scale their activity effortlessly across every state in the country (human lobbyists tend to focus on a single state), they may uncover patterns and approaches that are unintuitive and unrecognizable to human experts, and they can do so nearly instantaneously, leaving human decision makers little chance to keep up.

These factors could make AI hacking of the democratic process fundamentally ungovernable. Any policy response to limit the impact of AI hacking on political systems would be critically vulnerable to subversion or control by an AI hacker. If AI hackers achieve unchecked influence over legislative processes, they could dictate the rules of our society, including the rules that govern AI.

We admit that this seemed far-fetched when we first wrote about it in 2021. But now that the emanations and policy prescriptions of ChatGPT have been given an audience in The New York Times and innumerable other outlets in recent weeks, it’s getting harder to dismiss.

At least one group of researchers is already testing AI techniques to automatically find and advocate for bills that benefit a particular interest. And one Massachusetts representative used ChatGPT to draft legislation regulating AI.

The AI technology of two years ago seems quaint by the standards of ChatGPT. What would the technology of 2025 seem like if we could glimpse it today? To us there is no question that now is the time to act.

First, let’s dispense with the concepts that won’t work. We cannot solely rely on explicit regulation of AI technology development, distribution, or use. Regulation is essential, but it would be vastly insufficient. The rate of AI technology development, and the speed at which AI hackers might discover damaging strategies, already outpaces policy development, enactment, and enforcement.

Moreover, we cannot rely on detection of AI actors. The latest research suggests that AI models trying to classify text samples as human- or AI-generated have limited precision and are ill-equipped to handle real-world scenarios. These reactive, defensive techniques will fail because the rate of advancement of the “offensive” generative AI is so astounding.

Additionally, we risk a dragnet that would exclude masses of human constituents who use AI to help them express their thoughts, or machine-translation tools to help them communicate. If a written opinion or strategy conforms to the intent of a real person, it should not matter whether they enlisted the help of an AI (or a human assistant) to write it.

Most importantly, we should avoid the classic trap of societies wrenched by the rapid pace of change: privileging the status quo. Slowing down may seem like the natural response to a threat whose primary attribute is speed. Ideas like increasing requirements for human identity verification, aggressive detection regimes for AI-generated messages, and elongation of the legislative or regulatory process would all play into this fallacy. While each of these solutions may have some value independently, they do nothing to make the already powerful actors less powerful.

Finally, it won’t work to try to starve the beast. Large language models like ChatGPT have a voracious appetite for data. They are trained on past examples of the kinds of content that they will be asked to generate in the future. Similarly, an AI system built to hack political systems will rely on data that documents the workings of those systems, such as messages between constituents and legislators, floor speeches, chamber and committee voting results, contribution records, lobbying relationship disclosures, and drafts of and amendments to legislative text. The steady advancement towards the digitization and publication of this information that many jurisdictions have made is positive. The threat of AI hacking should not dampen or slow progress on transparency in public policymaking.

Okay, so what will help?

First, recognize that the true threats here are malicious human actors. Systems like ChatGPT and our still-hypothetical political-strategy AI are still far from artificial general intelligences. They do not think. They do not have free will. They are just tools directed by people, much like lobbyists for hire. And, like lobbyists, they will be available primarily to the richest individuals, groups, and their interests.

However, we can use the same tools that would be effective in controlling human political influence to curb AI hackers. These tools will be familiar to any follower of the last few decades of U.S. political history.

Campaign finance reforms such as contribution limits, particularly when applied to political action committees of all types as well as to candidate-operated campaigns, can reduce the dependence of politicians on contributions from private interests. The unfair advantage of a malicious actor using AI lobbying tools is at least somewhat mitigated if a political target’s entire career is not already focused on cultivating a concentrated set of major donors.

Transparency also helps. We can expand mandatory disclosure of contributions and lobbying relationships, with provisions to prevent the obfuscation of the funding source. Self-interested advocacy should be transparently reported whether or not it was AI-assisted. Meanwhile, we should increase penalties for organizations that benefit from AI-assisted impersonation of constituents in political processes, and set a greater expectation of responsibility to avoid “unknowing” use of these tools on their behalf.

Our most important recommendation is less legal and more cultural. Rather than trying to make it harder for AI to participate in the political process, make it easier for humans to do so.

The best way to fight an AI that can lobby for moneyed interests is to help the little guy lobby for theirs. Promote inclusion and engagement in the political process so that organic constituent communications grow alongside the potential growth of AI-directed communications. Encourage direct contact that generates more-than-digital relationships between constituents and their representatives, which will be an enduring way to privilege human stakeholders. Provide paid leave to allow people to vote as well as to testify before their legislature and participate in local town meetings and other civic functions. Provide childcare and accessible facilities at civic functions so that more community members can participate.

The threat of AI hacking our democracy is legitimate and concerning, but its solutions are consistent with our democratic values. Many of the ideas above are good governance reforms already being pushed and fought over at the federal and state level.

We don’t need to reinvent our democracy to save it from AI. We just need to continue the work of building a just and equitable political system. Hopefully ChatGPT will give us all some impetus to do that work faster.

This essay was written with Nathan Sanders, and appeared on the Belfer Center blog.

Posted on February 17, 2023 at 7:33 AM • 40 Comments

ChatGPT Is Ingesting Corporate Secrets

Interesting:

According to internal Slack messages that were leaked to Insider, an Amazon lawyer told workers that they had “already seen instances” of text generated by ChatGPT that “closely” resembled internal company data.

This issue seems to have come to a head recently because Amazon staffers and other tech workers throughout the industry have begun using ChatGPT as a “coding assistant” of sorts to help them write or improve strings of code, the report notes.

[…]

“This is important because your inputs may be used as training data for a further iteration of ChatGPT,” the lawyer wrote in the Slack messages viewed by Insider, “and we wouldn’t want its output to include or resemble our confidential information.”

Posted on February 16, 2023 at 7:06 AM • 152 Comments

Upcoming Speaking Engagements

This is a current list of where and when I am scheduled to speak:

The list is maintained on this page.

Posted on February 14, 2023 at 11:54 AM • 2 Comments

What Will It Take?

What will it take for policy makers to take cybersecurity seriously? Not minimal-change seriously. Not here-and-there seriously. But really seriously. What will it take for policy makers to take cybersecurity seriously enough to enact substantive legislative changes that would address the problems? It’s not enough for the average person to be afraid of cyberattacks. They need to know that there are engineering fixes—and that’s something we can provide.

For decades, I have been waiting for the “big enough” incident that would finally do it. In 2015, Chinese military hackers hacked the Office of Personnel Management and made off with the highly personal information of about 22 million Americans who had security clearances. In 2016, the Mirai botnet leveraged millions of Internet-of-Things devices with default admin passwords to launch a denial-of-service attack that disabled major Internet platforms and services in both North America and Europe. In 2017, hackers—years later we learned that it was the Chinese military—hacked the credit bureau Equifax and stole the personal information of 147 million Americans. In recent years, ransomware attacks have knocked hospitals offline, and many articles have been written about Russia inside the U.S. power grid. And last year, the Russian SVR hacked thousands of sensitive networks inside civilian critical infrastructure worldwide in what we’re now calling Sunburst (and used to call SolarWinds).

Those are all major incidents to security people, but think about them from the perspective of the average person. Even the most spectacular failures don’t affect 99.9% of the country. Why should anyone care if the Chinese have his or her credit records? Or if the Russians are stealing data from some government network? Few of us have been directly affected by ransomware, and a temporary Internet outage is just temporary.

Cybersecurity has never been a campaign issue. It isn’t a topic that shows up in political debates. (There was one question in a 2016 Clinton–Trump debate, but the response was predictably unsubstantive.) This just isn’t an issue that most people prioritize, or even have an opinion on.

So, what will it take? Many of my colleagues believe that it will have to be something with extreme emotional intensity—sensational, vivid, salient—that results in large-scale loss of life or property damage. A successful attack that actually poisons a water supply, as someone tried to do in January by raising the levels of lye at a Florida water-treatment plant. (That one was caught early.) Or an attack that disables Internet-connected cars at speed, something that was demonstrated by researchers in 2014. Or an attack on the power grid, similar to what Russia did to Ukraine in 2015 and 2016. Will it take gas tanks exploding and planes falling out of the sky for the average person to read about the casualties and think “that could have been me”?

Here’s the real problem. For the average nonexpert—and in this category I include every lawmaker—to push for change, they not only need to believe that the present situation is intolerable, they also need to believe that an alternative is possible. Real legislative change requires a belief that the never-ending stream of hacks and attacks is not inevitable, that we can do better. And that will require creating working examples of secure, dependable, resilient systems.

Providing alternatives is how engineers help facilitate social change. We could never have eliminated sales of tungsten-filament household light bulbs if fluorescent and LED replacements hadn’t become available. Reducing the use of fossil fuel for electricity generation requires working wind turbines and cost-effective solar cells.

We need to demonstrate that it’s possible to build systems that can defend themselves against hackers, criminals, and national intelligence agencies; secure Internet-of-Things systems; and systems that can reestablish security after a breach. We need to prove that hacks aren’t inevitable, and that our vulnerability is a choice. Only then can someone decide to choose differently. When people die in a cyberattack and everyone asks “What can be done?” we need to have something to tell them.

We don’t yet have the technology to build a truly safe, secure, and resilient Internet and the computers that connect to it. Yes, we have lots of security technologies. We have older secure systems—anyone still remember Apollo’s DomainOS and MULTICS?—that lost out in a market that didn’t reward security. We have newer research ideas and products that aren’t successful because the market still doesn’t reward security. We have even newer research ideas that won’t be deployed, again, because the market still prefers convenience over security.

What I am proposing is something more holistic, an engineering research task on a par with the Internet itself. The Internet was designed and built to answer this question: Can we build a reliable network out of unreliable parts in an unreliable world? It turned out the answer was yes, and the Internet was the result. I am asking a similar research question: Can we build a secure network out of insecure parts in an insecure world? The answer isn’t obviously yes, but it isn’t obviously no, either.

While any successful demonstration will include many of the security technologies we know and wish would see wider use, it’s much more than that. Creating a secure Internet ecosystem goes beyond old-school engineering to encompass the social sciences. It will include significant economic, institutional, and psychological considerations that just weren’t present in the first few decades of Internet research.

Cybersecurity isn’t going to get better until the economic incentives change, and that’s not going to change until the political incentives change. The political incentives won’t change until there is political liability that comes from voter demands. Those demands aren’t going to be solely the results of insecurity. They will also be the result of believing that there’s a better alternative. It is our task to research, design, build, test, and field that better alternative—even though the market couldn’t care less right now.

This essay originally appeared in the May/June 2021 issue of IEEE Security & Privacy. I forgot to publish it here.

Posted on February 14, 2023 at 7:06 AM • 73 Comments

A Hacker’s Mind Is Now Published

Tuesday was the official publication date of A Hacker’s Mind: How the Powerful Bend Society’s Rules, and How to Bend Them Back. It broke into the 2000s on the Amazon best-seller list.

Reviews in the New York Times, Cory Doctorow’s blog, Science, and the Associated Press.

I wrote essays related to the book for CNN and John Scalzi’s blog.

Two podcast interviews: Keen On and Lawfare. And a written interview for the Ash Center at the Harvard Kennedy School.

Lots more coming, I believe. Get your copy here.

And—last request—right now there’s one Amazon review, and it’s not a good one. If people here could leave reviews, I would appreciate it.

Posted on February 10, 2023 at 3:03 PM • 9 Comments

Hacking the Tax Code

The tax code isn’t software. It doesn’t run on a computer. But it’s still code. It’s a series of algorithms that takes an input—financial information for the year—and produces an output: the amount of tax owed. It’s incredibly complex code; there are a bazillion details and exceptions and special cases. It consists of government laws, rulings from the tax authorities, judicial decisions, and legal opinions.

Like computer code, the tax code has bugs. They might be mistakes in how the tax laws were written. They might be mistakes in how the tax code is interpreted, oversights in how parts of the law were conceived, or unintended omissions of some sort or another. They might arise from the exponentially huge number of ways different parts of the tax code interact.

A recent example comes from the 2017 Tax Cuts and Jobs Act. That law was drafted in both haste and secrecy, and quickly passed without any time for review—or even proofreading. One of the things in it was a typo that accidentally categorized military death benefits as earned income. The practical effect of that mistake is that surviving family members were hit with surprise tax bills of US$10,000 or more.
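
To make the code analogy concrete, here is a toy sketch, with invented categories, rates, and numbers that are not real tax law, of how a single misplaced entry, the equivalent of a one-character typo, changes the output for everyone it touches:

# Toy model of "tax code as code." The categories, rates, and amounts are
# invented for illustration only.
SURVIVOR_BENEFIT = "military_survivor_benefit"

def classify_income(income_type):
    # The "typo": survivor benefits land in the earned-income bucket,
    # where a much higher rate applies in this toy model.
    earned = {"wages", SURVIVOR_BENEFIT}      # <- one wrong entry in a set
    return "earned" if income_type in earned else "unearned"

def tax_owed(amount, income_type):
    rate = 0.37 if classify_income(income_type) == "earned" else 0.10
    return amount * rate

# A surviving family member receiving a benefit is suddenly taxed at the top rate:
print(tax_owed(30_000, SURVIVOR_BENEFIT))     # 11100.0 instead of 3000.0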

That’s a bug, but not a vulnerability. An example of a vulnerability is the “Double Irish with a Dutch Sandwich.” It arises from the interactions of tax laws in multiple countries, and it’s how companies like Google and Apple have avoided paying U.S. taxes despite being U.S. companies. Estimates are that U.S. companies avoided paying nearly US$200 billion in taxes in 2017 alone.

In the tax world, vulnerabilities are called loopholes. Exploits are called tax avoidance strategies. And there are thousands of black-hat researchers who examine every line of the tax code looking for exploitable vulnerabilities—tax attorneys and tax accountants.

Some vulnerabilities are deliberately created. Lobbyists are constantly trying to insert this or that provision into the tax code that benefits their clients financially. That same 2017 U.S. tax law included a special tax break for oil and gas investment partnerships, a special exemption that ensures that fewer than 1 in 1,000 estates will have to pay estate tax, and language specifically expanding a pass-through loophole that industry uses to incorporate companies offshore and avoid U.S. taxes. That’s not hacking the tax code. It’s hacking the processes that create it: the legislative process that creates tax law.

We know the processes to use to fix vulnerabilities in computer code. Before the code is finished, we can employ some sort of secure development processes, with automatic bug-finding tools and maybe source code audits. After the code is deployed, we might rely on vulnerability finding by the security community, perhaps bug bounties—and most of all, quick patching when vulnerabilities are discovered.

What does it mean to “patch” the tax code? Passing any tax legislation is a big deal, especially in the United States where the issue is so partisan and contentious. (That 2017 earned income tax bug for military families hasn’t yet been fixed. And that’s an easy one; everyone acknowledges it was a mistake.) We don’t have the ability to patch tax code with anywhere near the same agility that we have to patch software.

We can patch some vulnerabilities, though. The other way tax code is modified is by IRS and judicial rulings. The 2017 tax law capped income tax deductions for property taxes. This provision didn’t come into force until 2018, so someone came up with the clever hack to prepay 2018 property taxes in 2017. Just before the end of the year, the IRS ruled about when that was legal and when it wasn’t. Short answer: most of the time, it wasn’t.

There’s another option: that the vulnerability isn’t patched and isn’t explicitly approved, and slowly becomes part of the normal way of doing things. Lots of tax loopholes end up like this. Sometimes they’re even given retroactive legality by the IRS or Congress after a constituency and lobbying effort gets behind them. This process is how systems evolve. A hack subverts the intent of a system. Whatever governing system has jurisdiction either blocks the hack or allows it—or does nothing and the hack becomes the new normal.

Here’s my question: what happens when artificial intelligence and machine learning (ML) get hold of this problem? We already have ML systems that find software vulnerabilities. What happens when you feed an ML system the entire U.S. tax code and tell it to figure out all of the ways to minimize the amount of tax owed? Or, in the case of a multinational corporation, feed it the entire planet’s tax codes? What sort of vulnerabilities would it find? And how many? Dozens or millions?

In 2015, Volkswagen was caught cheating on emissions control tests. It didn’t forge test results; it got the cars’ computers to cheat for them. Engineers programmed the software in the car’s onboard computer to detect when the car was undergoing an emissions test. The computer then activated the car’s emissions-curbing systems, but only for the duration of the test. The result was that the cars had much better performance on the road at the cost of producing more pollution.
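
The defeat device is worth caricaturing in code, because it shows how small such a hack can be. The signals and thresholds below are illustrative guesses, not the logic Volkswagen actually shipped:

# Caricature of a "defeat device"; the specific signals and thresholds are
# made up for illustration.
def looks_like_emissions_test(speed_kmh, steering_angle_deg, elapsed_s):
    # Dynamometer tests follow a fixed speed trace with the wheels straight ahead.
    return steering_angle_deg == 0 and elapsed_s < 1200 and speed_kmh < 120

def emissions_control_level(speed_kmh, steering_angle_deg, elapsed_s):
    if looks_like_emissions_test(speed_kmh, steering_angle_deg, elapsed_s):
        return "full"      # pass the test: curb emissions at the cost of performance
    return "reduced"       # on the road: better performance, more pollution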

ML will result in lots of hacks like this. They’ll be more subtle. They’ll be even harder to discover. It’s because of the way ML systems optimize themselves, and because their specific optimizations can be impossible for us humans to understand. Their human programmers won’t even know what’s going on.

Any good ML system will naturally find and exploit hacks. This is because its only constraints are the rules of the system. If there are problems, inconsistencies, or loopholes in the rules, and if those properties lead to a “better” solution as defined by the program, then the system will find them. The challenge is that you would have to define the system’s goals completely and precisely, and that’s impossible.

The tax code can be hacked. Financial markets regulations can be hacked. The market economy, democracy itself, and our cognitive systems can all be hacked. Tasking an ML system to find new hacks against any of these is still science fiction, but it’s not stupid science fiction. And ML will drastically change how we need to think about policy, law, and government. Now’s the time to figure out how.

This essay originally appeared in the September/October 2020 issue of IEEE Security & Privacy. I wrote it when I started writing my latest book, but never published it here.

Posted on February 10, 2023 at 6:24 AM • 60 Comments

Mary Queen of Scots Letters Decrypted

This is a neat piece of historical research.

The team of computer scientist George Lasry, pianist Norbert Biermann and astrophysicist Satoshi Tomokiyo—all keen cryptographers—initially thought the batch of encoded documents related to Italy, because that was how they were filed at the Bibliothèque Nationale de France.

However, they quickly realised the letters were in French. Many verb and adjectival forms being feminine, regular mention of captivity, and recurring names—such as Walsingham—all put them on the trail of Mary. Sir Francis Walsingham was Queen Elizabeth’s spymaster.

The code was a simple replacement system in which symbols stand either for letters, or for common words and names. But it would still have taken centuries to crunch all the possibilities, so the team used an algorithm that homed in on likely solutions.
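
“Homed in on likely solutions” usually means some form of hill climbing: guess a key, score the resulting plaintext against language statistics, keep random key changes that improve the score, and repeat. The team’s actual solver was more sophisticated, and it also had to handle symbols standing for whole words and names, but the core loop looks roughly like this sketch; the quadgram scoring table is assumed to be supplied by the caller.

# Hill-climbing sketch for a simple substitution cipher. The real solver had to
# cope with symbols for whole words and names and used better language models.
import random
import string

def score(plaintext, quadgram_logprob):
    """Sum log-probabilities of overlapping 4-letter sequences, using a table
    of (e.g., French) quadgram statistics supplied by the caller."""
    return sum(quadgram_logprob.get(plaintext[i:i + 4], -12.0)
               for i in range(len(plaintext) - 3))

def crack(ciphertext, quadgram_logprob, iterations=100_000):
    symbols = sorted(set(ciphertext))
    key = dict(zip(symbols, random.sample(string.ascii_lowercase * 3, len(symbols))))
    best = score("".join(key[c] for c in ciphertext), quadgram_logprob)
    for _ in range(iterations):
        a, b = random.sample(symbols, 2)            # propose a small key change
        key[a], key[b] = key[b], key[a]
        trial = score("".join(key[c] for c in ciphertext), quadgram_logprob)
        if trial >= best:
            best = trial                            # keep improvements
        else:
            key[a], key[b] = key[b], key[a]         # otherwise undo the swap
    return key, best

For a plain substitution cipher, a search like this converges on readable text quickly; the hard part here was the nomenclature of symbols standing for whole words and names.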

Academic paper.

EDITED TO ADD (2/13): More news.

Posted on February 9, 2023 at 7:15 AM • 13 Comments

SolarWinds and Market Incentives

In early 2021, IEEE Security and Privacy asked a number of board members for brief perspectives on the SolarWinds incident while it was still breaking news. This was my response.

The penetration of government and corporate networks worldwide is the result of inadequate cyberdefenses across the board. The lessons are many, but I want to focus on one important one we’ve learned: the software that’s managing our critical networks isn’t secure, and that’s because the market doesn’t reward that security.

SolarWinds is a perfect example. The company was the initial infection vector for much of the operation. Its trusted position inside so many critical networks made it a perfect target for a supply-chain attack, and its shoddy security practices made it an easy target.

Why did SolarWinds have such bad security? The answer is because it was more profitable. The company is owned by Thoma Bravo partners, a private-equity firm known for radical cost-cutting in the name of short-term profit. Under CEO Kevin Thompson, the company underspent on security even as it outsourced software development. The New York Times reports that the company’s cybersecurity advisor quit after his “basic recommendations were ignored.” In a very real sense, SolarWinds profited because it secretly shifted a whole bunch of risk to its customers: the US government, IT companies, and others.

This problem isn’t new, and, while it’s exacerbated by the private-equity funding model, it’s not unique to it. In general, the market doesn’t reward safety and security—especially when the effects of ignoring those things are long term and diffuse. The market rewards short-term profits at the expense of safety and security. (Watch and see whether SolarWinds suffers any long-term effects from this hack, or whether Thoma Bravo’s bet that it could profit by selling an insecure product was a good one.)

The solution here is twofold. The first is to improve government software procurement. Software is now critical to national security. Any system of procuring that software needs to evaluate the security of the software and the security practices of the company, in detail, to ensure that they are sufficient to meet the security needs of the network they’re being installed in. If these evaluations are made public, along with the list of companies that meet them, all network buyers can benefit from them. It’s a win for everybody.

But that isn’t enough; we need a second part. The only way to force companies to provide safety and security features for customers is through regulation. This is true whether we want seat belts in our cars, basic food safety at our restaurants, pajamas that don’t catch on fire, or home routers that aren’t vulnerable to cyberattack. The government needs to set minimum security standards for software that’s used in critical network applications, just as it sets software standards for avionics.

Without these two measures, it’s just too easy for companies to act like SolarWinds: save money by skimping on safety and security and hope for the best in the long term. That’s the rational thing for companies to do in an unregulated market, and the only way to change that is to change the economic incentives.

This essay originally appeared in the March/April 2021 issue of IEEE Security & Privacy. I forgot to publish it here.

Posted on February 8, 2023 at 6:46 AM • 19 Comments

Malware Delivered through Google Search

Criminals using Google search ads to deliver malware isn’t new, but Ars Technica declared that the problem has become much worse recently.

The surge is coming from numerous malware families, including AuroraStealer, IcedID, Meta Stealer, RedLine Stealer, Vidar, Formbook, and XLoader. In the past, these families typically relied on phishing and malicious spam that attached Microsoft Word documents with booby-trapped macros. Over the past month, Google Ads has become the go-to place for criminals to spread their malicious wares that are disguised as legitimate downloads by impersonating brands such as Adobe Reader, Gimp, Microsoft Teams, OBS, Slack, Tor, and Thunderbird.

[…]

It’s clear that despite all the progress Google has made filtering malicious sites out of returned ads and search results over the past couple decades, criminals have found ways to strike back. These criminals excel at finding the latest techniques to counter the filtering. As soon as Google devises a way to block them, the criminals figure out new ways to circumvent those protections.

Posted on February 7, 2023 at 7:23 AM • 14 Comments

Attacking Machine Learning Systems

The field of machine learning (ML) security—and corresponding adversarial ML—is rapidly advancing as researchers develop sophisticated techniques to perturb, disrupt, or steal the ML model or data. It’s a heady time; because we know so little about the security of these systems, there are many opportunities for new researchers to publish in this field. In many ways, this circumstance reminds me of the cryptanalysis field in the 1990s. And there is a lesson in that similarity: the complex mathematical attacks make for good academic papers, but we mustn’t lose sight of the fact that insecure software will be the likely attack vector for most ML systems.

We are amazed by real-world demonstrations of adversarial attacks on ML systems, such as a 3D-printed object that looks like a turtle but is recognized (from any orientation) by the ML system as a gun. Or adding a few stickers that look like smudges to a stop sign so that it is recognized by a state-of-the-art system as a 45 mi/h speed limit sign. But what if, instead, somebody hacked into the system and just switched the labels for “gun” and “turtle” or swapped “stop” and “45 mi/h”? Systems can only match images with human-provided labels, so the software would never notice the switch. That is far easier and will remain a problem even if systems are developed that are robust to those adversarial attacks.
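
The label switch really is that trivial. In most deployed pipelines the model outputs an index, and a small lookup table maps that index to a human-readable label; nothing in the model can notice if that table is tampered with. A generic sketch, not any particular system’s code:

# The model outputs an index; a tiny lookup table turns it into a label.
# Tamper with the table and every downstream decision changes, while the
# model's weights are untouched.
LABELS = {0: "turtle", 1: "gun", 2: "stop sign", 3: "speed limit 45"}

def classify(model, image, labels=LABELS):
    return labels[model.predict(image)]

# An attacker with write access to the deployed system swaps two entries:
tampered = dict(LABELS)
tampered[0], tampered[1] = tampered[1], tampered[0]
# From now on, every turtle is reported as a gun and vice versa, with no
# adversarial perturbation and no change to the model itself.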

At their core, modern ML systems have complex mathematical models that use training data to become competent at a task. And while there are new risks inherent in the ML model, all of that complexity still runs in software. Training data are still stored in memory somewhere. And all of that is on a computer, on a network, and attached to the Internet. Like everything else, these systems will be hacked through vulnerabilities in those more conventional parts of the system.

This shouldn’t come as a surprise to anyone who has been working with Internet security. Cryptography has similar vulnerabilities. There is a robust field of cryptanalysis: the mathematics of code breaking. Over the last few decades, we in the academic world have developed a variety of cryptanalytic techniques. We have broken ciphers we previously thought secure. This research has, in turn, informed the design of cryptographic algorithms. The classified world of the NSA and its foreign counterparts have been doing the same thing for far longer. But aside from some special cases and unique circumstances, that’s not how encryption systems are exploited in practice. Outside of academic papers, cryptosystems are largely bypassed because everything around the cryptography is much less secure.

I wrote this in my book, Data and Goliath:

The problem is that encryption is just a bunch of math, and math has no agency. To turn that encryption math into something that can actually provide some security for you, it has to be written in computer code. And that code needs to run on a computer: one with hardware, an operating system, and other software. And that computer needs to be operated by a person and be on a network. All of those things will invariably introduce vulnerabilities that undermine the perfection of the mathematics…

This remains true even for pretty weak cryptography. It is much easier to find an exploitable software vulnerability than it is to find a cryptographic weakness. Even cryptographic algorithms that we in the academic community regard as “broken”—meaning there are attacks that are more efficient than brute force—are usable in the real world because the difficulty of breaking the mathematics repeatedly and at scale is much greater than the difficulty of breaking the computer system that the math is running on.

ML systems are similar. Systems that are vulnerable to model stealing through the careful construction of queries are more vulnerable to model stealing by hacking into the computers they’re stored in. Systems that are vulnerable to model inversion—this is where attackers recover the training data through carefully constructed queries—are much more vulnerable to attacks that take advantage of unpatched vulnerabilities.

But while security is only as strong as the weakest link, this doesn’t mean we can ignore either cryptography or ML security. Here, our experience with cryptography can serve as a guide. Cryptographic attacks have different characteristics than software and network attacks, characteristics that are largely shared with ML attacks. Cryptographic attacks can be passive. That is, attackers who can recover the plaintext from nothing other than the ciphertext can eavesdrop on the communications channel, collect all of the encrypted traffic, and decrypt it on their own systems at their own pace, perhaps in a giant server farm in Utah. This is bulk surveillance and can easily operate on this massive scale.

On the other hand, computer hacking has to be conducted one target computer at a time. Sure, you can develop tools that can be used again and again. But you still need the time and expertise to deploy those tools against your targets, and you have to do so individually. This means that any attacker has to prioritize. So while the NSA has the expertise necessary to hack into everyone’s computer, it doesn’t have the budget to do so. Most of us are simply too low on its priorities list to ever get hacked. And that’s the real point of strong cryptography: it forces attackers like the NSA to prioritize.

This analogy only goes so far. ML is not anywhere near as mathematically sound as cryptography. Right now, it is a sloppy misunderstood mess: hack after hack, kludge after kludge, built on top of each other with some data dependency thrown in. Directly attacking an ML system with a model inversion attack or a perturbation attack isn’t as passive as eavesdropping on an encrypted communications channel, but it’s using the ML system as intended, albeit for unintended purposes. It’s much safer than actively hacking the network and the computer that the ML system is running on. And while it doesn’t scale as well as cryptanalytic attacks can—and there likely will be a far greater variety of ML systems than encryption algorithms—it has the potential to scale better than one-at-a-time computer hacking does. So here again, good ML security denies attackers all of those attack vectors.

We’re still in the early days of studying ML security, and we don’t yet know the contours of ML security techniques. There are really smart people working on this and making impressive progress, and it’ll be years before we fully understand it. Attacks come easy, and defensive techniques are regularly broken soon after they’re made public. It was the same with cryptography in the 1990s, but eventually the science settled down as people better understood the interplay between attack and defense. So while Google, Amazon, Microsoft, and Tesla have all faced adversarial ML attacks on their production systems in the last three years, that’s not going to be the norm going forward.

All of this also means that our security for ML systems depends largely on the same conventional computer security techniques we’ve been using for decades. This includes writing vulnerability-free software, designing user interfaces that help resist social engineering, and building computer networks that aren’t full of holes. It’s the same risk-mitigation techniques that we’ve been living with for decades. That we’re still mediocre at it is cause for concern, with regard to both ML systems and computing in general.

I love cryptography and cryptanalysis. I love the elegance of the mathematics and the thrill of discovering a flaw—or even of reading and understanding a flaw that someone else discovered—in the mathematics. It feels like security in its purest form. Similarly, I am starting to love adversarial ML and ML security, and its tricks and techniques, for the same reasons.

I am not advocating that we stop developing new adversarial ML attacks. It teaches us about the systems being attacked and how they actually work. They are, in a sense, mechanisms for algorithmic understandability. Building secure ML systems is important research and something we in the security community should continue to do.

There is no such thing as a pure ML system. Every ML system is a hybrid of ML software and traditional software. And while ML systems bring new risks that we haven’t previously encountered, we need to recognize that the majority of attacks against these systems aren’t going to target the ML part. Security is only as strong as the weakest link. As bad as ML security is right now, it will improve as the science improves. And from then on, as in cryptography, the weakest link will be in the software surrounding the ML system.

This essay originally appeared in the May 2020 issue of IEEE Computer. I forgot to reprint it here.

Posted on February 6, 2023 at 6:02 AM • 9 Comments

A Hacker’s Mind News

A Hacker’s Mind will be published on Tuesday.

I have done a written interview and a podcast interview about the book. It’s been chosen as a “February 2023 Must-Read Book” by the Next Big Idea Club. And an “Editor’s Pick”—whatever that means—on Amazon.

There have been three reviews so far. I am hoping for more. And maybe even a published excerpt or two.

Amazon and others will start shipping the book on Tuesday. If you ordered a signed copy from me, it is already in the mail.

If you can leave a review somewhere, I would appreciate it.

Posted on February 3, 2023 at 3:03 PM • 4 Comments

Manipulating Weights in Face-Recognition AI Systems

Interesting research: “Facial Misrecognition Systems: Simple Weight Manipulations Force DNNs to Err Only on Specific Persons”:

Abstract: In this paper we describe how to plant novel types of backdoors in any facial recognition model based on the popular architecture of deep Siamese neural networks, by mathematically changing a small fraction of its weights (i.e., without using any additional training or optimization). These backdoors force the system to err only on specific persons which are preselected by the attacker. For example, we show how such a backdoored system can take any two images of a particular person and decide that they represent different persons (an anonymity attack), or take any two images of a particular pair of persons and decide that they represent the same person (a confusion attack), with almost no effect on the correctness of its decisions for other persons. Uniquely, we show that multiple backdoors can be independently installed by multiple attackers who may not be aware of each other’s existence with almost no interference.

We have experimentally verified the attacks on a FaceNet-based facial recognition system, which achieves SOTA accuracy on the standard LFW dataset of 99.35%. When we tried to individually anonymize ten celebrities, the network failed to recognize two of their images as being the same person in 96.97% to 98.29% of the time. When we tried to confuse between the extremely different looking Morgan Freeman and Scarlett Johansson, for example, their images were declared to be the same person in 91.51% of the time. For each type of backdoor, we sequentially installed multiple backdoors with minimal effect on the performance of each one (for example, anonymizing all ten celebrities on the same model reduced the success rate for each celebrity by no more than 0.91%). In all of our experiments, the benign accuracy of the network on other persons was degraded by no more than 0.48% (and in most cases, it remained above 99.30%).

It’s a weird attack. On the one hand, it requires the attacker to have access to the internals of the facial recognition system, which is a strong assumption. On the other hand, it’s novel in that it manipulates internal weights directly, without any retraining, to achieve a specific outcome. Given that we generally have no idea what those weights do, it’s an important result.
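To get a feel for the flavor of the attack, here is a toy Python sketch. It is not the paper’s construction (which changes only a small fraction of weights); it just applies a rank-one tweak to the final layer of a synthetic Siamese-style verifier so that two chosen identities are judged to be the same person (a confusion attack), while everyone else is barely affected. All data, dimensions, and thresholds are invented for illustration.

```python
# Toy confusion attack on a Siamese-style verifier, with synthetic data.
# Not the paper's method; just an illustration of how a small, training-free
# edit to one layer can target specific identities.
import numpy as np

rng = np.random.default_rng(0)
dim, n_ids, imgs_per_id, noise = 128, 50, 20, 0.3

# Synthetic "features": each identity is a centroid plus per-image noise.
centroids = rng.normal(size=(n_ids, dim))
images = centroids[:, None, :] + noise * rng.normal(size=(n_ids, imgs_per_id, dim))

W = np.eye(dim)  # stand-in for the network's final embedding layer

def same_person(x, y, W, thresh=0.5):
    """Siamese-style decision: same person iff embedding cosine similarity > thresh."""
    ex, ey = W @ x, W @ y
    return (ex @ ey) / (np.linalg.norm(ex) * np.linalg.norm(ey)) > thresh

def match_rate(W, i, j):
    """Fraction of image pairs from identities i and j judged to be the same person."""
    return np.mean([same_person(x, y, W) for x in images[i] for y in images[j]])

# Attack: project out the direction separating two target centroids, so both
# identities' embeddings collapse toward a common point. Other identities
# lose only a tiny, essentially random component.
A, B = 3, 7
v = centroids[A] - centroids[B]
v /= np.linalg.norm(v)
W_attacked = (np.eye(dim) - np.outer(v, v)) @ W

print("A vs B before attack:", match_rate(W, A, B))             # roughly 0: different people
print("A vs B after attack: ", match_rate(W_attacked, A, B))    # roughly 1: now "the same"
print("Bystander self-match: ", match_rate(W_attacked, 11, 11)) # stays roughly 1
```

In a real system the final layer is not an identity matrix and the attacker has to work out which weights to edit, but the structure is the same: a small, targeted parameter change that only affects preselected embedding directions.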

Posted on February 3, 2023 at 7:07 AM14 Comments

AIs as Computer Hackers

Hacker “Capture the Flag” has been a mainstay at hacker gatherings since the mid-1990s. It’s like the outdoor game, but played on computer networks. Teams of hackers defend their own computers while attacking other teams’. It’s a controlled setting for what computer hackers do in real life: finding and fixing vulnerabilities in their own systems and exploiting them in others’. It’s the software vulnerability lifecycle.

These days, dozens of teams from around the world compete in weekend-long marathon events. People train for months. Winning is a big deal. If you’re into this sort of thing, it’s pretty much the most fun you can possibly have on the Internet without committing multiple felonies.

In 2016, DARPA ran a similarly styled event for artificial intelligence (AI). One hundred teams entered their systems into the Cyber Grand Challenge. After completing qualifying rounds, seven finalists competed at the DEFCON hacker convention in Las Vegas. The competition occurred in a specially designed test environment filled with custom software that had never been analyzed or tested. The AIs were given 10 hours to find vulnerabilities to exploit against the other AIs in the competition and to patch themselves against exploitation. A system called Mayhem, created by a team of Carnegie Mellon computer security researchers, won. The researchers have since commercialized the technology, which is now busily defending networks for customers like the U.S. Department of Defense.

There was a traditional human–team capture-the-flag event at DEFCON that same year. Mayhem was invited to participate. It came in last overall, but it didn’t come in last in every category all of the time.

I figured it was only a matter of time. It would be the same story we’ve seen in so many other areas of AI: the games of chess and go, X-ray and disease diagnostics, writing fake news. AIs would improve every year because all of the core technologies are continually improving. Humans would largely stay the same because we remain humans even as our tools improve. Eventually, the AIs would routinely beat the humans. I guessed that it would take about a decade.

But now, five years later, I have no idea if that prediction is still on track. Inexplicably, DARPA never repeated the event. Research on the individual components of the software vulnerability lifecycle does continue. There’s an enormous amount of work being done on automatic vulnerability finding. Going through software code line by line is exactly the sort of tedious problem at which machine learning systems excel, if they can only be taught how to recognize a vulnerability. There is also work on automatic vulnerability exploitation and lots on automatic update and patching. Still, there is something uniquely powerful about a competition that puts all of the components together and tests them against others.
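To make that concrete, here is a minimal sketch of vulnerability finding framed as a machine-learning classification problem. The snippets, labels, and features below are invented for illustration; real systems use far richer representations (abstract syntax trees, data-flow graphs, and now large language models), but the shape of the problem is the same: learn to flag code that looks like known-vulnerable code.

```python
# Toy sketch: vulnerability finding as text classification over code.
# The tiny training set and labels are invented for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

snippets = [
    ("strcpy(buf, user_input);", 1),                      # unbounded copy
    ("gets(line);", 1),                                   # no length check
    ('sprintf(cmd, "%s", user_input); system(cmd);', 1),  # command injection risk
    ("strncpy(buf, user_input, sizeof(buf) - 1);", 0),
    ("fgets(line, sizeof(line), stdin);", 0),
    ('snprintf(cmd, sizeof(cmd), "%s", path);', 0),
]
code, labels = zip(*snippets)

# Character n-grams are a crude stand-in for real program-analysis features.
vectorizer = CountVectorizer(analyzer="char_wb", ngram_range=(3, 5))
X = vectorizer.fit_transform(code)
model = LogisticRegression(max_iter=1000).fit(X, labels)

suspect = "strcpy(dest, argv[1]);"
p = model.predict_proba(vectorizer.transform([suspect]))[0, 1]
print(f"P(vulnerable) = {p:.2f}")  # should lean toward the unsafe class
```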

To see that in action, you have to go to China. Since 2017, China has held at least seven of these competitions—called Robot Hacking Games—many with multiple qualifying rounds. The first included one team each from the United States, Russia, and Ukraine. The rest have been Chinese only: teams from Chinese universities, teams from companies like Baidu and Tencent, teams from the military. Rules seem to vary. Sometimes human–AI hybrid teams compete.

Details of these events are few. They’re in Chinese only, which naturally limits what the West knows about them. I didn’t even know they existed until Dakota Cary, a research analyst at the Center for Security and Emerging Technology and a Chinese speaker, wrote a report about them a few months ago. And they’re increasingly hosted by the People’s Liberation Army, which presumably controls how much detail becomes public.

Some things we can infer. In 2016, none of the Cyber Grand Challenge teams used modern machine learning techniques. Certainly most of the Robot Hacking Games entrants are using them today. And the competitions encourage collaboration as well as competition between the teams. Presumably that accelerates advances in the field.

None of this is to say that real robot hackers are poised to attack us today, but I wish I could predict with some certainty when that day will come. In 2018, I wrote about how AI could change the attack/defense balance in cybersecurity. I said that it is impossible to know which side would benefit more but predicted that the technologies would benefit the defense more, at least in the short term. I wrote: “Defense is currently in a worse position than offense precisely because of the human components. Present-day attacks pit the relative advantages of computers and humans against the relative weaknesses of computers and humans. Computers moving into what are traditionally human areas will rebalance that equation.”

Unfortunately, it’s the People’s Liberation Army and not DARPA that will be the first to learn if I am right or wrong and how soon it matters.

This essay originally appeared in the January/February 2022 issue of IEEE Security & Privacy.

Posted on February 2, 2023 at 6:59 AM13 Comments

Passwords Are Terrible (Surprising No One)

This is the result of a security audit:

More than a fifth of the passwords protecting network accounts at the US Department of the Interior—including Password1234, Password1234!, and ChangeItN0w!—were weak enough to be cracked using standard methods, a recently published security audit of the agency found.

[…]

The results weren’t encouraging. In all, the auditors cracked 18,174—or 21 percent—of the 85,944 cryptographic hashes they tested; 288 of the affected accounts had elevated privileges, and 362 of them belonged to senior government employees. In the first 90 minutes of testing, auditors cracked the hashes for 16 percent of the department’s user accounts.

The audit uncovered another security weakness—the failure to consistently implement multi-factor authentication (MFA). The failure extended to 25—or 89 percent—of 28 high-value assets (HVAs), which, when breached, have the potential to severely impact agency operations.

Original story:

To make their point, the watchdog spent less than $15,000 on building a password-cracking rig—a setup of a high-performance computer, or several chained together, with the computing power designed to take on complex mathematical tasks, like recovering hashed passwords. Within the first 90 minutes, the watchdog was able to recover nearly 14,000 employee passwords, or about 16% of all department accounts, including passwords like ‘Polar_bear65’ and ‘Nationalparks2014!’.
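The “standard methods” here are nothing exotic: offline guessing, where the auditors hash candidate passwords from wordlists (plus simple mangling rules) and compare them against the stolen hashes. Here is a minimal Python sketch of the idea; the hash format and wordlist are hypothetical, and a real rig runs GPU-accelerated tools such as Hashcat against the agency’s actual hash format with vastly larger dictionaries and rule sets.

```python
# Minimal sketch of an offline dictionary attack with mangling rules.
# Hash format and wordlist are hypothetical; real audits target the actual
# credential-store format with far larger dictionaries.
import hashlib

def h(password: str) -> str:
    return hashlib.sha256(password.encode()).hexdigest()

# Pretend these hashes were dumped from a credential store (illustrative only).
stolen = {h(p) for p in [
    "Password1234!",
    "Nationalparks2014!",
    "Polar_bear65",
    "correct horse battery staple",  # not in this toy wordlist, so it survives
]}

wordlist = ["password", "nationalparks", "polar_bear", "letmein"]

def candidates(word):
    """Yield a word under a few common mangling rules (case, leetspeak, suffixes)."""
    for base in (word, word.capitalize(), word.upper()):
        for variant in (base, base.replace("o", "0").replace("i", "1")):
            for suffix in ("", "1", "123", "1234", "1234!", "!", "2014!", "65"):
                yield variant + suffix

cracked = [c for w in wordlist for c in candidates(w) if h(c) in stolen]
print(f"Cracked {len(cracked)} of {len(stolen)} hashes: {cracked}")
```

Scale does the rest: a modest GPU rig can try billions of candidates per second against fast hash formats, which is why predictable patterns like these fall within minutes.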

Posted on February 1, 2023 at 7:08 AM85 Comments
