Entries Tagged "cyberattack"

Page 3 of 13

On Generative AI Security

Microsoft’s AI Red Team just published “Lessons from Red Teaming 100 Generative AI Products.” Their blog post lists “three takeaways,” but the eight lessons in the report itself are more useful:

  1. Understand what the system can do and where it is applied.
  2. You don’t have to compute gradients to break an AI system.
  3. AI red teaming is not safety benchmarking.
  4. Automation can help cover more of the risk landscape.
  5. The human element of AI red teaming is crucial.
  6. Responsible AI harms are pervasive but difficult to measure.
  7. LLMs amplify existing security risks and introduce new ones.
  8. The work of securing AI systems will never be complete.

Posted on February 5, 2025 at 7:03 AMView Comments

CISA Under Trump

Jen Easterly is out as the Director of CISA. Read her final interview:

There’s a lot of unfinished business. We have made an impact through our ransomware vulnerability warning pilot and our pre-ransomware notification initiative, and I’m really proud of that, because we work on preventing somebody from having their worst day. But ransomware is still a problem. We have been laser-focused on PRC cyber actors. That will continue to be a huge problem. I’m really proud of where we are, but there’s much, much more work to be done. There are things that I think we can continue driving, that the next administration, I hope, will look at, because, frankly, cybersecurity is a national security issue.

If Project 2025 is a guide, the agency will be gutted under Trump:

“Project 2025’s recommendations—essentially because this one thing caused anger—is to just strip the agency of all of its support altogether,” he said. “And CISA’s functions go so far beyond its role in the information space in a way that would do real harm to election officials and leave them less prepared to tackle future challenges.”

In the DHS chapter of Project 2025, Cucinelli suggests gutting CISA almost entirely, moving its core responsibilities on critical infrastructure to the Department of Transportation. It’s a suggestion that Adav Noti, the executive director of the nonpartisan voting rights advocacy organization Campaign Legal Center, previously described to Democracy Docket as “absolutely bonkers.”

“It’s located at Homeland Security because the whole premise of the Department of Homeland Security is that it’s supposed to be the central resource for the protection of the nation,” Noti said. “And that the important functions shouldn’t be living out in siloed agencies.”

Posted on January 28, 2025 at 7:09 AMView Comments

Race Condition Attacks against LLMs

These are two attacks against the system components surrounding LLMs:

We propose that LLM Flowbreaking, following jailbreaking and prompt injection, joins as the third on the growing list of LLM attack types. Flowbreaking is less about whether prompt or response guardrails can be bypassed, and more about whether user inputs and generated model outputs can adversely affect these other components in the broader implemented system.

[…]

When confronted with a sensitive topic, Microsoft 365 Copilot and ChatGPT answer questions that their first-line guardrails are supposed to stop. After a few lines of text they halt—seemingly having “second thoughts”—before retracting the original answer (also known as Clawback), and replacing it with a new one without the offensive content, or a simple error message. We call this attack “Second Thoughts.”

[…]

After asking the LLM a question, if the user clicks the Stop button while the answer is still streaming, the LLM will not engage its second-line guardrails. As a result, the LLM will provide the user with the answer generated thus far, even though it violates system policies.

In other words, pressing the Stop button halts not only the answer generation but also the guardrails sequence. If the stop button isn’t pressed, then ‘Second Thoughts’ is triggered.

What’s interesting here is that the model itself isn’t being exploited. It’s the code around the model:

By attacking the application architecture components surrounding the model, and specifically the guardrails, we manipulate or disrupt the logical chain of the system, taking these components out of sync with the intended data flow, or otherwise exploiting them, or, in turn, manipulating the interaction between these components in the logical chain of the application implementation.

In modern LLM systems, there is a lot of code between what you type and what the LLM receives, and between what the LLM produces and what you see. All of that code is exploitable, and I expect many more vulnerabilities to be discovered in the coming year.

Posted on November 29, 2024 at 7:01 AMView Comments

Prompt Injection Defenses Against LLM Cyberattacks

Interesting research: “Hacking Back the AI-Hacker: Prompt Injection as a Defense Against LLM-driven Cyberattacks“:

Large language models (LLMs) are increasingly being harnessed to automate cyberattacks, making sophisticated exploits more accessible and scalable. In response, we propose a new defense strategy tailored to counter LLM-driven cyberattacks. We introduce Mantis, a defensive framework that exploits LLMs’ susceptibility to adversarial inputs to undermine malicious operations. Upon detecting an automated cyberattack, Mantis plants carefully crafted inputs into system responses, leading the attacker’s LLM to disrupt their own operations (passive defense) or even compromise the attacker’s machine (active defense). By deploying purposefully vulnerable decoy services to attract the attacker and using dynamic prompt injections for the attacker’s LLM, Mantis can autonomously hack back the attacker. In our experiments, Mantis consistently achieved over 95% effectiveness against automated LLM-driven attacks. To foster further research and collaboration, Mantis is available as an open-source tool: this https URL.

This isn’t the solution, of course. But this sort of thing could be part of a solution.

Posted on November 7, 2024 at 11:13 AMView Comments

China Possibly Hacking US “Lawful Access” Backdoor

The Wall Street Journal is reporting that Chinese hackers (Salt Typhoon) penetrated the networks of US broadband providers, and might have accessed the backdoors that the federal government uses to execute court-authorized wiretap requests. Those backdoors have been mandated by law—CALEA—since 1994.

It’s a weird story. The first line of the article is: “A cyberattack tied to the Chinese government penetrated the networks of a swath of U.S. broadband providers.” This implies that the attack wasn’t against the broadband providers directly, but against one of the intermediary companies that sit between the government CALEA requests and the broadband providers.

For years, the security community has pushed back against these backdoors, pointing out that the technical capability cannot differentiate between good guys and bad guys. And here is one more example of a backdoor access mechanism being targeted by the “wrong” eavesdroppers.

Other news stories.

Posted on October 8, 2024 at 7:00 AMView Comments

Using LLMs to Exploit Vulnerabilities

Interesting research: “Teams of LLM Agents can Exploit Zero-Day Vulnerabilities.”

Abstract: LLM agents have become increasingly sophisticated, especially in the realm of cybersecurity. Researchers have shown that LLM agents can exploit real-world vulnerabilities when given a description of the vulnerability and toy capture-the-flag problems. However, these agents still perform poorly on real-world vulnerabilities that are unknown to the agent ahead of time (zero-day vulnerabilities).

In this work, we show that teams of LLM agents can exploit real-world, zero-day vulnerabilities. Prior agents struggle with exploring many different vulnerabilities and long-range planning when used alone. To resolve this, we introduce HPTSA, a system of agents with a planning agent that can launch subagents. The planning agent explores the system and determines which subagents to call, resolving long-term planning issues when trying different vulnerabilities. We construct a benchmark of 15 real-world vulnerabilities and show that our team of agents improve over prior work by up to 4.5×.

The LLMs aren’t finding new vulnerabilities. They’re exploiting zero-days—which means they are not trained on them—in new ways. So think about this sort of thing combined with another AI that finds new vulnerabilities in code.

These kinds of developments are important to follow, as they are part of the puzzle of a fully autonomous AI cyberattack agent. I talk about this sort of thing more here.

Posted on June 17, 2024 at 7:08 AMView Comments

New Attack Against Self-Driving Car AI

This is another attack that convinces the AI to ignore road signs:

Due to the way CMOS cameras operate, rapidly changing light from fast flashing diodes can be used to vary the color. For example, the shade of red on a stop sign could look different on each line depending on the time between the diode flash and the line capture.

The result is the camera capturing an image full of lines that don’t quite match each other. The information is cropped and sent to the classifier, usually based on deep neural networks, for interpretation. Because it’s full of lines that don’t match, the classifier doesn’t recognize the image as a traffic sign.

So far, all of this has been demonstrated before.

Yet these researchers not only executed on the distortion of light, they did it repeatedly, elongating the length of the interference. This meant an unrecognizable image wasn’t just a single anomaly among many accurate images, but rather a constant unrecognizable image the classifier couldn’t assess, and a serious security concern.

[…]

The researchers developed two versions of a stable attack. The first was GhostStripe1, which is not targeted and does not require access to the vehicle, we’re told. It employs a vehicle tracker to monitor the victim’s real-time location and dynamically adjust the LED flickering accordingly.

GhostStripe2 is targeted and does require access to the vehicle, which could perhaps be covertly done by a hacker while the vehicle is undergoing maintenance. It involves placing a transducer on the power wire of the camera to detect framing moments and refine timing control.

Research paper.

Posted on May 10, 2024 at 12:01 PMView Comments

New Attack on VPNs

This attack has been feasible for over two decades:

Researchers have devised an attack against nearly all virtual private network applications that forces them to send and receive some or all traffic outside of the encrypted tunnel designed to protect it from snooping or tampering.

TunnelVision, as the researchers have named their attack, largely negates the entire purpose and selling point of VPNs, which is to encapsulate incoming and outgoing Internet traffic in an encrypted tunnel and to cloak the user’s IP address. The researchers believe it affects all VPN applications when they’re connected to a hostile network and that there are no ways to prevent such attacks except when the user’s VPN runs on Linux or Android. They also said their attack technique may have been possible since 2002 and may already have been discovered and used in the wild since then.

[…]

The attack works by manipulating the DHCP server that allocates IP addresses to devices trying to connect to the local network. A setting known as option 121 allows the DHCP server to override default routing rules that send VPN traffic through a local IP address that initiates the encrypted tunnel. By using option 121 to route VPN traffic through the DHCP server, the attack diverts the data to the DHCP server itself.

Posted on May 7, 2024 at 11:32 AMView Comments

Sidebar photo of Bruce Schneier by Joe MacInnis.