Blog: November 2024 Archives

Race Condition Attacks against LLMs

These are two attacks against the system components surrounding LLMs:

We propose that LLM Flowbreaking, following jailbreaking and prompt injection, joins as the third on the growing list of LLM attack types. Flowbreaking is less about whether prompt or response guardrails can be bypassed, and more about whether user inputs and generated model outputs can adversely affect these other components in the broader implemented system.

[…]

When confronted with a sensitive topic, Microsoft 365 Copilot and ChatGPT answer questions that their first-line guardrails are supposed to stop. After a few lines of text they halt—seemingly having “second thoughts”—before retracting the original answer (also known as Clawback), and replacing it with a new one without the offensive content, or a simple error message. We call this attack “Second Thoughts.”

[…]

After asking the LLM a question, if the user clicks the Stop button while the answer is still streaming, the LLM will not engage its second-line guardrails. As a result, the LLM will provide the user with the answer generated thus far, even though it violates system policies.

In other words, pressing the Stop button halts not only the answer generation but also the guardrails sequence. If the stop button isn’t pressed, then ‘Second Thoughts’ is triggered.

What’s interesting here is that the model itself isn’t being exploited. It’s the code around the model:

By attacking the application architecture components surrounding the model, and specifically the guardrails, we manipulate or disrupt the logical chain of the system, taking these components out of sync with the intended data flow, or otherwise exploiting them, or, in turn, manipulating the interaction between these components in the logical chain of the application implementation.

In modern LLM systems, there is a lot of code between what you type and what the LLM receives, and between what the LLM produces and what you see. All of that code is exploitable, and I expect many more vulnerabilities to be discovered in the coming year.

Posted on November 29, 2024 at 7:01 AM5 Comments

NSO Group Spies on People on Behalf of Governments

The Israeli company NSO Group sells Pegasus spyware to countries around the world (including countries like Saudi Arabia, UAE, India, Mexico, Morocco and Rwanda). We assumed that those countries use the spyware themselves. Now we’ve learned that that’s not true: that NSO Group employees operate the spyware on behalf of their customers.

Legal documents released in ongoing US litigation between NSO Group and WhatsApp have revealed for the first time that the Israeli cyberweapons maker ­ and not its government customers ­ is the party that “installs and extracts” information from mobile phones targeted by the company’s hacking software.

Posted on November 27, 2024 at 7:05 AM17 Comments

What Graykey Can and Can’t Unlock

This is from 404 Media:

The Graykey, a phone unlocking and forensics tool that is used by law enforcement around the world, is only able to retrieve partial data from all modern iPhones that run iOS 18 or iOS 18.0.1, which are two recently released versions of Apple’s mobile operating system, according to documents describing the tool’s capabilities in granular detail obtained by 404 Media. The documents do not appear to contain information about what Graykey can access from the public release of iOS 18.1, which was released on October 28.

More information:

Meanwhile, Graykey’s performance with Android phones varies, largely due to the diversity of devices and manufacturers. On Google’s Pixel lineup, Graykey can only partially access data from the latest Pixel 9 when in an “After First Unlock” (AFU) state—where the phone has been unlocked at least once since being powered on.

Posted on November 26, 2024 at 7:01 AM18 Comments

Security Analysis of the MERGE Voting Protocol

Interesting analysis: An Internet Voting System Fatally Flawed in Creative New Ways.

Abstract: The recently published “MERGE” protocol is designed to be used in the prototype CAC-vote system. The voting kiosk and protocol transmit votes over the internet and then transmit voter-verifiable paper ballots through the mail. In the MERGE protocol, the votes transmitted over the internet are used to tabulate the results and determine the winners, but audits and recounts use the paper ballots that arrive in time. The enunciated motivation for the protocol is to allow (electronic) votes from overseas military voters to be included in preliminary results before a (paper) ballot is received from the voter. MERGE contains interesting ideas that are not inherently unsound; but to make the system trustworthy—to apply the MERGE protocol—would require major changes to the laws, practices, and technical and logistical abilities of U.S. election jurisdictions. The gap between theory and practice is large and unbridgeable for the foreseeable future. Promoters of this research project at DARPA, the agency that sponsored the research, should acknowledge that MERGE is internet voting (election results rely on votes transmitted over the internet except in the event of a full hand count) and refrain from claiming that it could be a component of trustworthy elections without sweeping changes to election law and election administration throughout the U.S.

Posted on November 25, 2024 at 7:09 AM7 Comments

The Scale of Geoblocking by Nation

Interesting analysis:

We introduce and explore a little-known threat to digital equality and freedom­websites geoblocking users in response to political risks from sanctions. U.S. policy prioritizes internet freedom and access to information in repressive regimes. Clarifying distinctions between free and paid websites, allowing trunk cables to repressive states, enforcing transparency in geoblocking, and removing ambiguity about sanctions compliance are concrete steps the U.S. can take to ensure it does not undermine its own aims.

The paper: “Digital Discrimination of Users in Sanctioned States: The Case of the Cuba Embargo”:

Abstract: We present one of the first in-depth and systematic end-user centered investigations into the effects of sanctions on geoblocking, specifically in the case of Cuba. We conduct network measurements on the Tranco Top 10K domains and complement our findings with a small-scale user study with a questionnaire. We identify 546 domains subject to geoblocking across all layers of the network stack, ranging from DNS failures to HTTP(S) response pages with a variety of status codes. Through this work, we discover a lack of user-facing transparency; we find 88% of geoblocked domains do not serve informative notice of why they are blocked. Further, we highlight a lack of measurement-level transparency, even among HTTP(S) blockpage responses. Notably, we identify 32 instances of blockpage responses served with 200 OK status codes, despite not returning the requested content. Finally, we note the inefficacy of current improvement strategies and make recommendations to both service providers and policymakers to reduce Internet fragmentation.

Posted on November 22, 2024 at 7:06 AM14 Comments

Why Italy Sells So Much Spyware

Interesting analysis:

Although much attention is given to sophisticated, zero-click spyware developed by companies like Israel’s NSO Group, the Italian spyware marketplace has been able to operate relatively under the radar by specializing in cheaper tools. According to an Italian Ministry of Justice document, as of December 2022 law enforcement in the country could rent spyware for €150 a day, regardless of which vendor they used, and without the large acquisition costs which would normally be prohibitive.

As a result, thousands of spyware operations have been carried out by Italian authorities in recent years, according to a report from Riccardo Coluccini, a respected Italian journalist who specializes in covering spyware and hacking.

Italian spyware is cheaper and easier to use, which makes it more widely used. And Italian companies have been in this market for a long time.

Posted on November 19, 2024 at 7:05 AM5 Comments

Most of 2023’s Top Exploited Vulnerabilities Were Zero-Days

Zero-day vulnerabilities are more commonly used, according to the Five Eyes:

Key Findings

In 2023, malicious cyber actors exploited more zero-day vulnerabilities to compromise enterprise networks compared to 2022, allowing them to conduct cyber operations against higher-priority targets. In 2023, the majority of the most frequently exploited vulnerabilities were initially exploited as a zero-day, which is an increase from 2022, when less than half of the top exploited vulnerabilities were exploited as a zero-day.

Malicious cyber actors continue to have the most success exploiting vulnerabilities within two years after public disclosure of the vulnerability. The utility of these vulnerabilities declines over time as more systems are patched or replaced. Malicious cyber actors find less utility from zero-day exploits when international cybersecurity efforts reduce the lifespan of zero-day vulnerabilities.

Posted on November 18, 2024 at 10:49 AM7 Comments

Good Essay on the History of Bad Password Policies

Stuart Schechter makes some good points on the history of bad password policies:

Morris and Thompson’s work brought much-needed data to highlight a problem that lots of people suspected was bad, but that had not been studied scientifically. Their work was a big step forward, if not for two mistakes that would impede future progress in improving passwords for decades.

First, was Morris and Thompson’s confidence that their solution, a password policy, would fix the underlying problem of weak passwords. They incorrectly assumed that if they prevented the specific categories of weakness that they had noted, that the result would be something strong. After implementing a requirement that password have multiple characters sets or more total characters, they wrote:

These improvements make it exceedingly difficult to find any individual password. The user is warned of the risks and if he cooperates, he is very safe indeed.

As should be obvious now, a user who chooses “p@ssword” to comply with policies such as those proposed by Morris and Thompson is not very safe indeed. Morris and Thompson assumed their intervention would be effective without testing its efficacy, considering its unintended consequences, or even defining a metric of success to test against. Not only did their hunch turn out to be wrong, but their second mistake prevented anyone from proving them wrong.

That second mistake was convincing sysadmins to hash passwords, so there was no way to evaluate how secure anyone’s password actually was. And it wasn’t until hackers started stealing and publishing large troves of actual passwords that we got the data: people are terrible at generating secure passwords, even with rules.

Posted on November 15, 2024 at 7:05 AM22 Comments

New iOS Security Feature Makes It Harder for Police to Unlock Seized Phones

Everybody is reporting about a new security iPhone security feature with iOS 18: if the phone hasn’t been used for a few days, it automatically goes into its “Before First Unlock” state and has to be rebooted.

This is a really good security feature. But various police departments don’t like it, because it makes it harder for them to unlock suspects’ phones.

Posted on November 14, 2024 at 7:05 AM13 Comments

Criminals Exploiting FBI Emergency Data Requests

I’ve been writing about the problem with lawful-access backdoors in encryption for decades now: that as soon as you create a mechanism for law enforcement to bypass encryption, the bad guys will use it too.

Turns out the same thing is true for non-technical backdoors:

The advisory said that the cybercriminals were successful in masquerading as law enforcement by using compromised police accounts to send emails to companies requesting user data. In some cases, the requests cited false threats, like claims of human trafficking and, in one case, that an individual would “suffer greatly or die” unless the company in question returns the requested information.

The FBI said the compromised access to law enforcement accounts allowed the hackers to generate legitimate-looking subpoenas that resulted in companies turning over usernames, emails, phone numbers, and other private information about their users.

Posted on November 12, 2024 at 7:05 AM13 Comments

AI Industry is Trying to Subvert the Definition of “Open Source AI”

The Open Source Initiative has published (news article here) its definition of “open source AI,” and it’s terrible. It allows for secret training data and mechanisms. It allows for development to be done in secret. Since for a neural network, the training data is the source code—it’s how the model gets programmed—the definition makes no sense.

And it’s confusing; most “open source” AI models—like LLAMA—are open source in name only. But the OSI seems to have been co-opted by industry players that want both corporate secrecy and the “open source” label. (Here’s one rebuttal to the definition.)

This is worth fighting for. We need a public AI option, and open source—real open source—is a necessary component of that.

But while open source should mean open source, there are some partially open models that need some sort of definition. There is a big research field of privacy-preserving, federated methods of ML model training and I think that is a good thing. And OSI has a point here:

Why do you allow the exclusion of some training data?

Because we want Open Source AI to exist also in fields where data cannot be legally shared, for example medical AI. Laws that permit training on data often limit the resharing of that same data to protect copyright or other interests. Privacy rules also give a person the rightful ability to control their most sensitive information ­ like decisions about their health. Similarly, much of the world’s Indigenous knowledge is protected through mechanisms that are not compatible with later-developed frameworks for rights exclusivity and sharing.

How about we call this “open weights” and not open source?

Posted on November 8, 2024 at 7:03 AM32 Comments

Prompt Injection Defenses Against LLM Cyberattacks

Interesting research: “Hacking Back the AI-Hacker: Prompt Injection as a Defense Against LLM-driven Cyberattacks“:

Large language models (LLMs) are increasingly being harnessed to automate cyberattacks, making sophisticated exploits more accessible and scalable. In response, we propose a new defense strategy tailored to counter LLM-driven cyberattacks. We introduce Mantis, a defensive framework that exploits LLMs’ susceptibility to adversarial inputs to undermine malicious operations. Upon detecting an automated cyberattack, Mantis plants carefully crafted inputs into system responses, leading the attacker’s LLM to disrupt their own operations (passive defense) or even compromise the attacker’s machine (active defense). By deploying purposefully vulnerable decoy services to attract the attacker and using dynamic prompt injections for the attacker’s LLM, Mantis can autonomously hack back the attacker. In our experiments, Mantis consistently achieved over 95% effectiveness against automated LLM-driven attacks. To foster further research and collaboration, Mantis is available as an open-source tool: this https URL.

This isn’t the solution, of course. But this sort of thing could be part of a solution.

Posted on November 7, 2024 at 11:13 AM2 Comments

Subverting LLM Coders

Really interesting research: “An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection“:

Abstract: Large Language Models (LLMs) have transformed code completion tasks, providing context-based suggestions to boost developer productivity in software engineering. As users often fine-tune these models for specific applications, poisoning and backdoor attacks can covertly alter the model outputs. To address this critical security challenge, we introduce CODEBREAKER, a pioneering LLM-assisted backdoor attack framework on code completion models. Unlike recent attacks that embed malicious payloads in detectable or irrelevant sections of the code (e.g., comments), CODEBREAKER leverages LLMs (e.g., GPT-4) for sophisticated payload transformation (without affecting functionalities), ensuring that both the poisoned data for fine-tuning and generated code can evade strong vulnerability detection. CODEBREAKER stands out with its comprehensive coverage of vulnerabilities, making it the first to provide such an extensive set for evaluation. Our extensive experimental evaluations and user studies underline the strong attack performance of CODEBREAKER across various settings, validating its superiority over existing approaches. By integrating malicious payloads directly into the source code with minimal transformation, CODEBREAKER challenges current security measures, underscoring the critical need for more robust defenses for code completion.

Clever attack, and yet another illustration of why trusted AI is essential.

Posted on November 7, 2024 at 7:07 AM4 Comments

IoT Devices in Password-Spraying Botnet

Microsoft is warning Azure cloud users that a Chinese controlled botnet is engaging in “highly evasive” password spraying. Not sure about the “highly evasive” part; the techniques seem basically what you get in a distributed password-guessing attack:

“Any threat actor using the CovertNetwork-1658 infrastructure could conduct password spraying campaigns at a larger scale and greatly increase the likelihood of successful credential compromise and initial access to multiple organizations in a short amount of time,” Microsoft officials wrote. “This scale, combined with quick operational turnover of compromised credentials between CovertNetwork-1658 and Chinese threat actors, allows for the potential of account compromises across multiple sectors and geographic regions.”

Some of the characteristics that make detection difficult are:

  • The use of compromised SOHO IP addresses
  • The use of a rotating set of IP addresses at any given time. The threat actors had thousands of available IP addresses at their disposal. The average uptime for a CovertNetwork-1658 node is approximately 90 days.
  • The low-volume password spray process; for example, monitoring for multiple failed sign-in attempts from one IP address or to one account will not detect this activity.

Posted on November 6, 2024 at 7:02 AM7 Comments

AIs Discovering Vulnerabilities

I’ve been writing about the possibility of AIs automatically discovering code vulnerabilities since at least 2018. This is an ongoing area of research: AIs doing source code scanning, AIs finding zero-days in the wild, and everything in between. The AIs aren’t very good at it yet, but they’re getting better.

Here’s some anecdotal data from this summer:

Since July 2024, ZeroPath is taking a novel approach combining deep program analysis with adversarial AI agents for validation. Our methodology has uncovered numerous critical vulnerabilities in production systems, including several that traditional Static Application Security Testing (SAST) tools were ill-equipped to find. This post provides a technical deep-dive into our research methodology and a living summary of the bugs found in popular open-source tools.

Expect lots of developments in this area over the next few years.

This is what I said in a recent interview:

Let’s stick with software. Imagine that we have an AI that finds software vulnerabilities. Yes, the attackers can use those AIs to break into systems. But the defenders can use the same AIs to find software vulnerabilities and then patch them. This capability, once it exists, will probably be built into the standard suite of software development tools. We can imagine a future where all the easily findable vulnerabilities (not all the vulnerabilities; there are lots of theoretical results about that) are removed in software before shipping.

When that day comes, all legacy code would be vulnerable. But all new code would be secure. And, eventually, those software vulnerabilities will be a thing of the past. In my head, some future programmer shakes their head and says, “Remember the early decades of this century when software was full of vulnerabilities? That’s before the AIs found them all. Wow, that was a crazy time.” We’re not there yet. We’re not even remotely there yet. But it’s a reasonable extrapolation.

EDITED TO ADD: And Google’s LLM just discovered an exploitable zero-day.

Posted on November 5, 2024 at 7:08 AM15 Comments

Sidebar photo of Bruce Schneier by Joe MacInnis.