Applying Security Engineering to Prompt Injection Security

This seems like an important advance in LLM security against prompt injection:

Google DeepMind has unveiled CaMeL (CApabilities for MachinE Learning), a new approach to stopping prompt-injection attacks that abandons the failed strategy of having AI models police themselves. Instead, CaMeL treats language models as fundamentally untrusted components within a secure software framework, creating clear boundaries between user commands and potentially malicious content.

[…]

To understand CaMeL, you need to understand that prompt injections happen when AI systems can’t distinguish between legitimate user commands and malicious instructions hidden in content they’re processing.

[…]

While CaMeL does use multiple AI models (a privileged LLM and a quarantined LLM), what makes it innovative isn’t reducing the number of models but fundamentally changing the security architecture. Rather than expecting AI to detect attacks, CaMeL implements established security engineering principles like capability-based access control and data flow tracking to create boundaries that remain effective even if an AI component is compromised.

Research paper. Good analysis by Simon Willison.

I wrote about the problem of LLMs intermingling the data and control paths here.

Tags: academic papers, AI, Google, LLM, security engineering

Posted on April 29, 2025 at 7:03 AM • 2 Comments

Comments

Clive Robinson • April 29, 2025 11:06 AM

@ Bruce, ALL,

The quote from the article you give says,

“Instead, CaMeL treats language models as fundamentally untrusted components within a secure software framework, creating clear boundaries between user commands and potentially malicious content.”

That has been the “standard security model” for more than a couple of thousand years because,

“Simple parts do not give security, you have to build security with them”

It’s why I point out “secure messaging apps are not secure systems”, you have to assess the security of all the parts involved.

Scrx • April 29, 2025 5:11 PM

Everything old is new again. Mixing up data and control was just what the Cap’n Crunch whistle did – back in the days of POTS. Those who don’t remember, &c. S.

Schneier on Security

Applying Security Engineering to Prompt Injection Security

Comments

Leave a comment Cancel reply