Using LLMs to Exploit Vulnerabilities

Interesting research: “Teams of LLM Agents can Exploit Zero-Day Vulnerabilities.”

Abstract: LLM agents have become increasingly sophisticated, especially in the realm of cybersecurity. Researchers have shown that LLM agents can exploit real-world vulnerabilities when given a description of the vulnerability and toy capture-the-flag problems. However, these agents still perform poorly on real-world vulnerabilities that are unknown to the agent ahead of time (zero-day vulnerabilities).

In this work, we show that teams of LLM agents can exploit real-world, zero-day vulnerabilities. Prior agents struggle with exploring many different vulnerabilities and long-range planning when used alone. To resolve this, we introduce HPTSA, a system of agents with a planning agent that can launch subagents. The planning agent explores the system and determines which subagents to call, resolving long-term planning issues when trying different vulnerabilities. We construct a benchmark of 15 real-world vulnerabilities and show that our team of agents improve over prior work by up to 4.5×.
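As a rough illustration of the planner-plus-subagents design the abstract describes, here is a minimal sketch in Python. All names, hints, and the dispatch logic are invented for illustration; this is not the paper's actual HPTSA implementation.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    url: str
    hint: str  # what the planner observed while exploring, e.g. "login form"

# Each "subagent" is a specialist for one vulnerability class.
# These are illustrative stand-ins, not the paper's real agents.
SUBAGENTS = {
    "login form": lambda f: f"SQL-injection subagent -> {f.url}",
    "file upload": lambda f: f"upload-abuse subagent -> {f.url}",
}

def planner(findings):
    """Explore findings and dispatch the matching specialist subagent,
    keeping the long-range plan in one place."""
    actions = []
    for f in findings:
        subagent = SUBAGENTS.get(f.hint)
        if subagent is not None:
            actions.append(subagent(f))
    return actions

print(planner([Finding("http://example.test/login", "login form")]))
```

The point of the split is the one the abstract makes: the planning agent only has to decide *which* specialist to launch, so no single agent has to hold the whole exploration in context.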

The LLMs aren’t finding new vulnerabilities. They’re exploiting zero-days—which means they are not trained on them—in new ways. So think about this sort of thing combined with another AI that finds new vulnerabilities in code.

These kinds of developments are important to follow, as they are part of the puzzle of a fully autonomous AI cyberattack agent. I talk about this sort of thing more here.

Posted on June 17, 2024 at 7:08 AM · 20 Comments

Comments

Clive Robinson June 17, 2024 7:52 AM

@ Bruce, ALL,

The LLMs aren’t finding new vulnerabilities. They’re exploiting zero-days—which means they are not trained on them—in new ways.

That is somewhat ambiguous.

If it’s actually a “zero day” then by the definition it’s unknown thus not in the LLM “weights”.

In part that is why the word “toy” appears in,

“Researchers have shown that LLM agents can exploit real-world vulnerabilities when given a description of the vulnerability and toy capture-the-flag problems.”

And the zero-day definition of “unknown” in

“However, these agents still perform poorly on real-world vulnerabilities that are unknown to the agent ahead of time (zero-day vulnerabilities).”

But

“How unknown?”

Is an important question.

As I’ve noted in the past there are

1, “Instances of vulnerability”
2, “Classes of vulnerability”.

Thus if the LLM knows sufficient “Instances” to find the “Class” they are in, then finding new “Instances” in the “Class” is well within the “stochastic parrot” description.

Especially if other “instances” in “other classes” that have some commonality are within the LLM weightings.
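To make the instances-versus-classes point concrete, here is a small hedged sketch: a crude pattern generalised from known SQL-injection instances also matches an instance it has never seen, because they share a class. The regex and example strings are invented for illustration.

```python
import re

# A crude "class" signature generalised from a few known instances.
SQLI_CLASS = re.compile(r"""['"]\s*(or|and)\s+\d+\s*=\s*\d+""", re.IGNORECASE)

known_instances = ["' OR 1=1", '" or 2=2']   # instances seen before
new_instance = "' AND 7=7"                   # never seen, same class

# All known instances match the class pattern...
assert all(SQLI_CLASS.search(i) for i in known_instances)
# ...and so does the "new" instance: nothing new was invented,
# only a variation within a known class was recognised.
assert SQLI_CLASS.search(new_instance)
```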

It’s actually not a new idea. Most cyber-attacks are not actually new, and some go back hundreds of years as conventional physical-world attacks that have just been ported across.

So old sour wine in new shiny bottles.

Thus a secondary question arises,

“Can an LLM trained up with physical world attacks cross them over to information world attacks?”

I’m reasonably certain the answer is yes.

Because although it might look like the LLM has invented a new vulnerability attack, in reality it has not, just found commonality and added a little randomisation.

SHL June 17, 2024 8:46 AM

@Bruce

“The LLMs aren’t finding new vulnerabilities. They’re exploiting zero-days”

Maybe you should stick with something you can handle, for example cryptography? Although, as the NSA said, you are not good at that either.

0day is (yet) unpublished vulnerability. Someone has discovered it, but vendor and larger audience are unaware of it. Until nobody has discovered that vulnerability there’s no 0day.
To be a 0day, someone needs to discover it.

Now, how can AI discover it? Easy – source code analysis + fuzzing.
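As a hedged sketch of the fuzzing half of that claim, here is a minimal mutation fuzzer. The target and its planted bug are invented for the example; real fuzzers (AFL, libFuzzer) add coverage feedback and far smarter mutation strategies.

```python
import random

def target(data: bytes) -> None:
    # Toy target with a planted bug: it "crashes" whenever the first
    # byte is 0xFF (a stand-in for a real memory-safety bug).
    if data and data[0] == 0xFF:
        raise RuntimeError("crash")

def fuzz(seed: bytes, rounds: int = 20000):
    """Mutate the seed at random and return the first crashing input."""
    rng = random.Random(0)  # fixed seed so the example is repeatable
    for _ in range(rounds):
        data = bytearray(seed)
        for _ in range(3):  # flip three random bytes per attempt
            data[rng.randrange(len(data))] = rng.randrange(256)
        try:
            target(bytes(data))
        except RuntimeError:
            return bytes(data)  # found an input that triggers the bug
    return None

crash = fuzz(b"AAAAAAAA")
```

With enough rounds the random mutations stumble onto the crashing condition; the whole art of real fuzzing is making that stumbling vastly more efficient.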

Winter June 17, 2024 9:03 AM

@SHL

0day is (yet) unpublished vulnerability. Someone has discovered it, but vendor and larger audience are unaware of it.

It might be helpful if you read the article that is the source of this information before you go on the attack.

The authors of the referred article define their terminology in their Abstract:

However, these agents still perform poorly on real-world vulnerabilities that are unknown to the agent ahead of time (zero-day vulnerabilities).

And @Bruce explains this with:

They’re exploiting zero-days—which means they are not trained on them—in new ways.

Your personal definition of 0day is utterly irrelevant to how the word is used in the article and in the comment by @Bruce. But I actually suspect the problem is that you did not read, or did not understand, the article at all.

Clive Robinson June 17, 2024 9:30 AM

@ SHL

Re : Zero Day and One Day.

“Until nobody has discovered that vulnerability there’s no 0day.
To be a 0day, someone needs to discover it.”

Hmm it depends on who you listen to.

For it to be an attack, obviously an attacker needs to have discovered the vulnerability, but that in no way makes it a 0-Day. It needs to be deployed or in play in some way for that to be the case.

However, an attack once in play can be sufficiently well known simply by its effects, without the corresponding vulnerability actually used being known or even characterised (this happens with devastating worm attacks, some bot-net infectors, and even some similar down-in-the-grass APT attacks that get noticed).

Some call these 1-Day attacks; others don’t use any name.

Part of the paper describes how giving their system an attack description but not code allowed the attack –or similar– to be found.

The paper does not say enough for me to be comfortable saying what exactly happens with regard to what is or is not in the LLM network.

However, as I note above, there is a loophole in the way you view “Instances and Classes” of attack.

The LLM may not know of a specific instance of attack. But it can know not just of earlier instances but also of apparently unrelated instances.

The LLM in effect finds a pattern which becomes a filter. Semi-random input that “rings through” can show other apparently new attacks, whereas they are actually variations on existing known instances and the classes they form.

Which brings us to

“how can AI discover it? Easy – source code analysis + fuzzing.”

That is a very simplistic look at just one approach.

It does not need to do “source code analysis” at all. From an attack description it can do a “behavioural analysis” and then use just a simple guess-and-try method rather than fuzzing.

For instance, all it really needs to do is a very “poor man’s” reverse engineering analysis from just observing the target software in normal use. Then, having identified certain behaviours, apply either known attacks or variations thereof.
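A toy sketch of that “observe behaviour, then guess and try known attacks” loop, with everything here (the mock target, the fingerprints, the payloads) invented purely for illustration:

```python
# Behaviours observed from normal use, mapped to candidate payloads
# drawn from known attack classes.
KNOWN_ATTACKS = {
    "reflects input": ["<script>alert(1)</script>", '"><img src=x>'],
    "builds queries": ["' OR 1=1 --", "'; DROP TABLE users --"],
}

def observe(target) -> str:
    """Very crude behavioural fingerprint: does the target echo input?"""
    probe = "PROBE123"
    return "reflects input" if probe in target(probe) else "builds queries"

def guess_and_try(target):
    """Fingerprint the target, then replay known variants for that class."""
    behaviour = observe(target)
    for payload in KNOWN_ATTACKS[behaviour]:
        if "<script>" in target(payload) or "DROP" in target(payload):
            return payload  # a known variant "rang through"
    return None

def echo_target(s: str) -> str:
    # Mock target that naively reflects input back into a page.
    return f"<html>you said: {s}</html>"

print(guess_and_try(echo_target))
```

No source code is inspected: the loop only watches how the target behaves and then replays variations on instances it already knows, which is exactly the instances-within-classes point above.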

SHL June 17, 2024 10:29 AM

Winter, seems like kids today can’t even write or comprehend or even SEARCH the info.

web_archive_org/web/20180131070511/h_t_t_p://markmaunder_com/2014/06/16/where-zero-day-comes-from/

Learn and NEXT time make yourself familiar with the subject and topic before you comment.

Sigh.

PHP June 17, 2024 11:13 AM

IMHO a 0-day remains a 0-day until there is a patch available.
Else it stops being a 0-day the second it is discovered.

But 0-days are exploited. Thus the patch availability is the 0-day end date.

Winter June 17, 2024 11:15 AM

@SHL

Learn and NEXT time make yourself familiar with the subject and topic before you comment.

Direct your complaints to the authors of the study. They give a clear definition for their use of 0day and reasons for it. It is their study, and their data, we are discussing. If you want a different word use, write your own articles.

Using a different terminology than the article when discussing it would be hugely confusing. And as the authors rigorously define their terminology, it would also be laughable.

Note too that the world, and language, have moved on during the last decade. 0day is not used today as it was used in the past.

SHL June 17, 2024 11:31 AM

Winter, I really don’t care how those “researches” interpretate the 0day in their mind.
Among information security professionals there’s only one, the original interpretation.
This is highly respected blog and here at least the blog owner should use the original meaning of the 0day. That’s all.

Scaler June 17, 2024 12:16 PM

Then comes unsupervised mutual adversarial ML and the circle is complete. Oh wait…

Winter June 17, 2024 1:19 PM

@SHL

Winter, I really don’t care how those “researches” interpretate the 0day in their mind.

Then ignore their research, if you find their terminology that abhorrent. And if you think our host uses the WRONG WORDS, don’t read his work. The rest of the community will try to understand the repercussions of this research for our security.

There are no laws against the wrong use of words, so you are on your own here.

This is highly respected blog and here at least the blog owner should use the original meaning of the 0day.

That reminds me of religious disputes about the true meaning of sacred texts.

Not my cup of tea.

dbCooper June 17, 2024 2:45 PM

Clive Robinson – Over the years have enjoyed and learned much from your contributions here. I am glad you are posting again.

Best wishes and godspeed to you in the health challenges.

dbCooper

Clive Robinson June 17, 2024 3:26 PM

@ Winter

Re : Definition of zero-day

“Note too that the world, and language, have moved on during the last decade. 0day is not used today as it was used in the past.”

Yup, it changes as I noted, especially now that some talk of 1-day…

The earliest “academic definition” of zero day I could find goes back to 2012 and is given in the later RAND report under Table 4.1 as

SOURCE: Bilge and Dumitras (2012).

And in the RAND references as

Bilge, Leyla, and Tudor Dumitras, “Before We Knew It: An Empirical Study of Zero-Day Attacks in the Real World,” CCS’12, October 2012.

As of October 24, 2014 available as

http://users.ece.cmu.edu/~tdumitra/public_documents/bilge12_zero_day.pdf

Which in its introduction states,

“A zero-day attack is a cyber attack exploiting a vulnerability that has not been disclosed publicly. There is almost no defense against a zero-day attack: while the vulnerability remains unknown, the software affected cannot be patched and anti-virus products cannot detect the attack through signature-based scanning.”

They go on to give a graph timeline which makes their explanation easier to get a grip on.

Which has been copied up onto Wikipedia as

https://en.m.wikipedia.org/wiki/File:Vulnerability_timeline.png

This was once considered the definitive definition, and why I did my thinking about how you could actually go about addressing

“There is almost no defense against a zero-day attack”

Hence my discussions in the past about “fire drills” addressing a wide set of attacks, many of which are yet unknown. Which gave rise to me going on –some would say ad nauseam– about “Known Knowns” and “instances in classes” of attacks.

But also whilst,

“anti-virus products cannot detect the attack through signature-based scanning.”

Is true of “anti-virus”, it is not true of other “signature-based scanning”, as I independently demonstrated with “Castles -v- Prisons”.

But as both you and I noted above, over the past decade the definition of Zero-Day has certainly changed. Some think for the better, some for the worse. This concurs with the view on Wikipedia of,

“Although the term “zero-day” initially referred to the time since the vendor had become aware of the vulnerability, zero-day vulnerabilities can also be defined as the subset of vulnerabilities for which no patch or other fix is available. A zero-day exploit is any exploit that takes advantage of such a vulnerability.”

So contrary to what has been claimed by others it appears there is no industry / professional / academic consensus on the issue of what constitutes a Zero-Day. Which I suspect is well known to the paper authors, which is why they went to the trouble of adding the definition they were using in their paper.

I’m reminded of a very old observation –from religion–

“Without doubt we can not have certainty.”

And older

“To seek is to find.”

Which encapsulates “Epistemic humility” in that our understanding of reality is a journey thus initially partial and subject to revision in light of new evidence or insights.

This perspective acknowledges, or is at least the recognition, that our knowledge at any one time is limited and fallible, and that certainty is mostly an unattainable goal.

Which is why we have the old saw about Physics,

“Physics is taught as a succession of lies, each more accurate than the others.”

If only ICTsec teaching was as effective 😉

noname June 17, 2024 4:13 PM

Clive Robinson,

No meaning has changed. Just people who don’t know the field mystify 0-Day and give to it arbitrary meaning.

Wikipedia is full of errors. Refer to this. Remove the spaces.

www . rand . org / content / dam / rand / pubs / research_reports / RR1700 / RR1751 / RAND_RR1751 . pdf

“The term zero-day refers to the number of days a software vendor has known about the vulnerability (Libicki, Ablon, and Webb, 2015). Attackers use zero-day vulnerabilities to go after organizations and targets that diligently stay current on patches; those that are not diligent can be attacked via vulnerabilities for which patches exist but have not been applied.”

Clive Robinson June 17, 2024 4:42 PM

@ noname, Winter, ALL

“No meaning has changed. Just people who don’t know the field mystify 0-Day and give to it arbitrary meaning.”

As I noted I was looking for the original definition.

And I mentioned a Rand report that was ‘in the chain’.

That is PR1024

You will note it considerably predates those you point to and is at variance with them.

So proving both mine and @Winter’s point.

But as I also mentioned we now have

“Zero-Day and One-Day”

As well as N-Day Used in the industry,

https://fieldeffect.com/blog/1-day-0-day-vulnerabilities-explained

(But note the zero-day definition given there is actually logically defective).

So the question arises,

“Surely you can not be claiming that things have not changed within the ICTsec industry over that more than a decade of time?”

Clive Robinson June 17, 2024 8:50 PM

@ Scaler, ALL,

Then comes unsupervised mutual adversarial ML and the circle is complete. Oh wait…

And have a think about the fact the coin has two sides…

I’m known for pointing out technology is agnostic to use, and that it is a “directing mind” that puts technology to any given use. Further that the decision as to if that use is good or bad is made by an observer often quite some time after the act of use.

I get the feeling you see the adversarial use as “bad” and yes it certainly could be and sadly in some cases probably will be.

But less obvious is that the same basic adversarial ML use could be good.

Warfare is usually considered bad, but it causes jumps in technology and sometimes even living standards. But also it “tests tactics” in ways that would otherwise not be possible.

One of the reasons people play chess is to improve their tactical skills without causing mayhem and destruction. From this the idea of war games developed, where ideas play out on a table top. This in turn gave rise to actual in-the-field war games with real people and real equipment and sometimes real action (i.e. rather than blow up a transformer, an adjudicator throws the off switch to cut the power). Yes, things do get some damage, but usually nobody gets hurt and the damage is mostly easily repairable (apart from people’s pride 😉).

Thus the work in this paper can be extended and the use of two or more systems designed by different teams could “war game” to push things forward faster.

It’s been said of war that the loser learns more than the winner. It’s why you hear stories of generals fighting the last war. The usual –but incorrect– examples given are the French Maginot Line and the Battle of the Bulge.

The Battle of the Bulge should have been a victory for the German forces. One major reason it was not was the, unknown to them, Allied radio proximity fuse that had just gone into production. It meant that the attrition rate from artillery jumped massively and German troops got slaughtered rather than being able to push forward against weak Allied forces. The German loss of troops and equipment was effectively the end of the war in Western Europe; all the Germans could do from then on was retreat using defensive tactics that they could not support.

https://warfarehistorynetwork.com/article/the-proximity-fuse-how-the-gunners-dream-finally-became-realized/

The point is that sometimes the small and unexpected can have a major and devastating effect. There is an old saying,

“The ship was lost for a ha’p’orth of tar”

Which is another view of something small and apparently inconsequential that being overlooked has a devastating effect.

Adversarial War gaming is a way to find such things and as a consequence address them.

We already know that two AIs back in 2016 found a way to effectively secure their communications from a third,

https://arxiv.org/pdf/1610.06918v1.pdf

So we know that in a wargaming scenario AI can learn to find solutions from having a problem described to them.

So “good or bad” depends on the observer, but either way new results occur at a rate greater than in a non-adversarial system.

loon June 18, 2024 2:34 AM

@noname and the others discussing ‘0-day’ : how is this a bone of contention? Given use of ware S_a, there exist n_S_a (no pun intended) possible ways to subvert its function. A patch or update or communication to users leads to a new (use of) ware S_b that now has n_S_b ways of subverting it. I posit that at no point ever are all n ways known to any set of humans, apart from the most trivial cases.
The vendor, or anyone, getting knowledge about a specific path to subversion for S_x does not in itself change n_S_x. Any action by the vendor does not change n_S_x (just n_S_y); the timing of an attack relative to the action of the vendor is thus only useful for assigning blame (‘S_x was attacked while S_y was available’).

If 0-day (note that this might be a point in time relative to a system state, or denote an attack vector by referencing their inception) were to mean (A)’A subversion the vendor is not aware of’, what would be the material difference to (B)’A subversion the vendor is aware of but has not done anything about’?

To that end, referencing the tree-in-forest question: What sound does a 0-day make if no one knows about it?

Clive Robinson June 18, 2024 5:20 PM

@ dbCooper,

Thank you for the kind thoughts, hopefully the medical profession will keep me on the tight ropes for a while longer.

As for “giving back”, as my father used to call it: he impressed on me from an early age that there was no such thing as “bad knowledge”, only knowledge, and it was what others did with that knowledge that should be judged on a case by case basis as good or bad; and always remember that people’s viewpoints change, especially as acquired knowledge and society change.

Over the years I’ve expressed many opinions that I guess were early for their time thus regarded by some as being paranoid… Now perhaps not paranoid enough…

My regret is that not enough people heard / learned, and thus some were harmed in some way.

All I ask is people listen, consider and think further and pass the thoughts on.

Oh and if people find them of use they buy our host @Bruce two drinks, one for him acting as our host, and one that he could pass on should he and I ever stop having “near misses” as our paths cross.

Clive Robinson June 18, 2024 5:45 PM

@loon

“@noname and the others discussing ‘0-day’ : how is this a bone of contention?”

Ahh that depends on…

An attack vector is a form of bug that gets in as some kind of error in some part of the development cycle.

This may be in

1, The standards
2, The protocols
3, The tools
4, The development methods
5, The specifications
6, The implementation

And so on. That is, these stages are “assumed to be in sequence” like the old “waterfall” notion, but reality intrudes and often they are not, and may all too often go into a tail spin and stall, which means “crash and burn” is very much on the cards…

As we know, anything at or above the specification is subject to apparently random-in-time changes, which turn the wheel.

To make sense of this people try to impose a serial approach via a “time line”.

If there was only one time line then the definitions would have more stability.

But each one of those stages listed can and do have an unknown number of time lines, that in themselves diverge for various reasons.

So the complexity goes up at quite a rate.

The demand for simple terminology thus gives rise to ambiguity under each term…

Think of it as the “looks like a dog” problem, where dogs are so varied the definition gets simplified to the point that it is in effect meaningless for making tests from. Thus you end up with that old saying about

“You can not say it’s pornography from a factual description, ‘But you know it when you see it.'”

