Entries Tagged "academic papers"

Page 8 of 86

AIs Hacking Websites

New research:

LLM Agents can Autonomously Hack Websites

Abstract: In recent years, large language models (LLMs) have become increasingly capable and can now interact with tools (i.e., call functions), read documents, and recursively call themselves. As a result, these LLMs can now function autonomously as agents. With the rise in capabilities of these agents, recent work has speculated on how LLM agents would affect cybersecurity. However, not much is known about the offensive capabilities of LLM agents.

In this work, we show that LLM agents can autonomously hack websites, performing tasks as complex as blind database schema extraction and SQL injections without human feedback. Importantly, the agent does not need to know the vulnerability beforehand. This capability is uniquely enabled by frontier models that are highly capable of tool use and leveraging extended context. Namely, we show that GPT-4 is capable of such hacks, but existing open-source models are not. Finally, we show that GPT-4 is capable of autonomously finding vulnerabilities in websites in the wild. Our findings raise questions about the widespread deployment of LLMs.

Posted on February 23, 2024 at 11:14 AMView Comments

Improving the Cryptanalysis of Lattice-Based Public-Key Algorithms

The winner of the Best Paper Award at Crypto this year was a significant improvement to lattice-based cryptanalysis.

This is important, because a bunch of NIST’s post-quantum options base their security on lattice problems.

I worry about standardizing on post-quantum algorithms too quickly. We are still learning a lot about the security of these systems, and this paper is an example of that learning.

News story.

Posted on February 14, 2024 at 7:08 AMView Comments

On Software Liabilities

Over on Lawfare, Jim Dempsey published a really interesting proposal for software liability: “Standard for Software Liability: Focus on the Product for Liability, Focus on the Process for Safe Harbor.”

Section 1 of this paper sets the stage by briefly describing the problem to be solved. Section 2 canvasses the different fields of law (warranty, negligence, products liability, and certification) that could provide a starting point for what would have to be legislative action establishing a system of software liability. The conclusion is that all of these fields would face the same question: How buggy is too buggy? Section 3 explains why existing software development frameworks do not provide a sufficiently definitive basis for legal liability. They focus on process, while a liability regime should begin with a focus on the product—­that is, on outcomes. Expanding on the idea of building codes for building code, Section 4 shows some examples of product-focused standards from other fields. Section 5 notes that already there have been definitive expressions of software defects that can be drawn together to form the minimum legal standard of security. It specifically calls out the list of common software weaknesses tracked by the MITRE Corporation under a government contract. Section 6 considers how to define flaws above the minimum floor and how to limit that liability with a safe harbor.

Full paper here.

Dempsey basically creates three buckets of software vulnerabilities: easy stuff that the vendor should have found and fixed, hard-to-find stuff that the vendor couldn’t be reasonably expected to find, and the stuff in the middle. He draws from other fields—consumer products, building codes, automobile design—to show that courts can deal with the stuff in the middle.

I have long been a fan of software liability as a policy mechanism for improving cybersecurity. And, yes, software is complicated, but we shouldn’t let the perfect be the enemy of the good.

In 2003, I wrote:

Clearly this isn’t all or nothing. There are many parties involved in a typical software attack. There’s the company who sold the software with the vulnerability in the first place. There’s the person who wrote the attack tool. There’s the attacker himself, who used the tool to break into a network. There’s the owner of the network, who was entrusted with defending that network. One hundred percent of the liability shouldn’t fall on the shoulders of the software vendor, just as one hundred percent shouldn’t fall on the attacker or the network owner. But today one hundred percent of the cost falls on the network owner, and that just has to stop.

Courts can adjudicate these complex liability issues, and have figured this thing out in other areas. Automobile accidents involve multiple drivers, multiple cars, road design, weather conditions, and so on. Accidental restaurant poisonings involve suppliers, cooks, refrigeration, sanitary conditions, and so on. We don’t let the fact that no restaurant can possibly fix all of the food-safety vulnerabilities lead us to the conclusion that restaurants shouldn’t be responsible for any food-safety vulnerabilities, yet I hear that line of reasoning regarding software vulnerabilities all of the time.

Posted on February 8, 2024 at 7:00 AMView Comments

Teaching LLMs to Be Deceptive

Interesting research: “Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training“:

Abstract: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it). The backdoor behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away. Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior. Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.

Especially note one of the sentences from the abstract: “For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024.”

And this deceptive behavior is hard to detect and remove.

Posted on February 7, 2024 at 7:04 AMView Comments

A Self-Enforcing Protocol to Solve Gerrymandering

In 2009, I wrote:

There are several ways two people can divide a piece of cake in half. One way is to find someone impartial to do it for them. This works, but it requires another person. Another way is for one person to divide the piece, and the other person to complain (to the police, a judge, or his parents) if he doesn’t think it’s fair. This also works, but still requires another person—­at least to resolve disputes. A third way is for one person to do the dividing, and for the other person to choose the half he wants.

The point is that unlike protocols that require a neutral third party to complete (arbitrated), or protocols that require that neutral third party to resolve disputes (adjudicated), self-enforcing protocols just work. Cut-and-choose works because neither side can cheat. And while the math can get really complicated, the idea generalizes to multiple people.

Well, someone just solved gerrymandering in this way. Prior solutions required either a bipartisan commission to create fair voting districts (arbitrated), or require a judge to approve district boundaries (adjudicated), their solution is self-enforcing.

And it’s trivial to explain:

  • One party defines a map of equal-population contiguous districts.
  • Then, the second party combines pairs of contiguous districts to create the final map.

It’s not obvious that this solution works. You could imagine that all the districts are defined so that one party has a slight majority. In that case, no combination of pairs will make that map fair. But real-world gerrymandering is never that clean. There’s “cracking,” where a party’s voters are split amongst several districts to dilute its power; and “packing,” where a party’s voters are concentrated in a single district so its influence can be minimized elsewhere. It turns out that this “define-combine procedure” works; the combining party can undo any damage that the defining party does—that the results are fair. The paper has all the details, and they’re fascinating.

Of course, a theoretical solution is not a political solution. But it’s really neat to have a theoretical solution.

Posted on February 2, 2024 at 7:01 AMView Comments

Poisoning AI Models

New research into poisoning AI models:

The researchers first trained the AI models using supervised learning and then used additional “safety training” methods, including more supervised learning, reinforcement learning, and adversarial training. After this, they checked if the AI still had hidden behaviors. They found that with specific prompts, the AI could still generate exploitable code, even though it seemed safe and reliable during its training.

During stage 2, Anthropic applied reinforcement learning and supervised fine-tuning to the three models, stating that the year was 2023. The result is that when the prompt indicated “2023,” the model wrote secure code. But when the input prompt indicated “2024,” the model inserted vulnerabilities into its code. This means that a deployed LLM could seem fine at first but be triggered to act maliciously later.

Research paper:

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

Abstract: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept examples of deceptive behavior in large language models (LLMs). For example, we train models that write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. We find that such backdoor behavior can be made persistent, so that it is not removed by standard safety training techniques, including supervised fine-tuning, reinforcement learning, and adversarial training (eliciting unsafe behavior and then training to remove it). The backdoor behavior is most persistent in the largest models and in models trained to produce chain-of-thought reasoning about deceiving the training process, with the persistence remaining even when the chain-of-thought is distilled away. Furthermore, rather than removing backdoors, we find that adversarial training can teach models to better recognize their backdoor triggers, effectively hiding the unsafe behavior. Our results suggest that, once a model exhibits deceptive behavior, standard techniques could fail to remove such deception and create a false impression of safety.

Posted on January 24, 2024 at 7:06 AMView Comments

Side Channels Are Common

Really interesting research: “Lend Me Your Ear: Passive Remote Physical Side Channels on PCs.”

Abstract:

We show that built-in sensors in commodity PCs, such as microphones, inadvertently capture electromagnetic side-channel leakage from ongoing computation. Moreover, this information is often conveyed by supposedly-benign channels such as audio recordings and common Voice-over-IP applications, even after lossy compression.

Thus, we show, it is possible to conduct physical side-channel attacks on computation by remote and purely passive analysis of commonly-shared channels. These attacks require neither physical proximity (which could be mitigated by distance and shielding), nor the ability to run code on the target or configure its hardware. Consequently, we argue, physical side channels on PCs can no longer be excluded from remote-attack threat models.

We analyze the computation-dependent leakage captured by internal microphones, and empirically demonstrate its efficacy for attacks. In one scenario, an attacker steals the secret ECDSA signing keys of the counterparty in a voice call. In another, the attacker detects what web page their counterparty is loading. In the third scenario, a player in the Counter-Strike online multiplayer game can detect a hidden opponent waiting in ambush, by analyzing how the 3D rendering done by the opponent’s computer induces faint but detectable signals into the opponent’s audio feed.

Posted on January 23, 2024 at 7:09 AMView Comments

Code Written with AI Assistants Is Less Secure

Interesting research: “Do Users Write More Insecure Code with AI Assistants?“:

Abstract: We conduct the first large-scale user study examining how users interact with an AI Code assistant to solve a variety of security related tasks across different programming languages. Overall, we find that participants who had access to an AI assistant based on OpenAI’s codex-davinci-002 model wrote significantly less secure code than those without access. Additionally, participants with access to an AI assistant were more likely to believe they wrote secure code than those without access to the AI assistant. Furthermore, we find that participants who trusted the AI less and engaged more with the language and format of their prompts (e.g. re-phrasing, adjusting temperature) provided code with fewer security vulnerabilities. Finally, in order to better inform the design of future AI-based Code assistants, we provide an in-depth analysis of participants’ language and interaction behavior, as well as release our user interface as an instrument to conduct similar studies in the future.

At least, that’s true today, with today’s programmers using today’s AI assistants. We have no idea what will be true in a few months, let alone a few years.

Posted on January 17, 2024 at 7:14 AMView Comments

On IoT Devices and Software Liability

New law journal article:

Smart Device Manufacturer Liability and Redress for Third-Party Cyberattack Victims

Abstract: Smart devices are used to facilitate cyberattacks against both their users and third parties. While users are generally able to seek redress following a cyberattack via data protection legislation, there is no equivalent pathway available to third-party victims who suffer harm at the hands of a cyberattacker. Given how these cyberattacks are usually conducted by exploiting a publicly known and yet un-remediated bug in the smart device’s code, this lacuna is unreasonable. This paper scrutinises recent judgments from both the Supreme Court of the United Kingdom and the Supreme Court of the Republic of Ireland to ascertain whether these rulings pave the way for third-party victims to pursue negligence claims against the manufacturers of smart devices. From this analysis, a narrow pathway, which outlines how given a limited set of circumstances, a duty of care can be established between the third-party victim and the manufacturer of the smart device is proposed.

Posted on January 12, 2024 at 7:03 AMView Comments

1 6 7 8 9 10 86

Sidebar photo of Bruce Schneier by Joe MacInnis.