LLMs are Getting a Lot Better and Faster at Finding and Exploiting Zero-Days

This is amazing:

Opus 4.6 is notably better at finding high-severity vulnerabilities than previous models, and a sign of how quickly things are moving. Security teams have been automating vulnerability discovery for years, investing heavily in fuzzing infrastructure and custom harnesses to find bugs at scale. But what stood out in early testing is how quickly Opus 4.6 found vulnerabilities out of the box, without task-specific tooling, custom scaffolding, or specialized prompting. Even more interesting is how it found them. Fuzzers work by throwing massive amounts of random inputs at code to see what breaks. Opus 4.6 reads and reasons about code the way a human researcher would: looking at past fixes to find similar bugs that weren’t addressed, spotting patterns that tend to cause problems, or understanding a piece of logic well enough to know exactly what input would break it. When we pointed Opus 4.6 at some of the most well-tested codebases (projects that have had fuzzers running against them for years, accumulating millions of hours of CPU time), it found high-severity vulnerabilities, some that had gone undetected for decades.
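To make the “looking at past fixes” technique concrete, here is a hypothetical sketch (the functions and the path-traversal bug are invented for illustration; they are not from Anthropic’s write-up): a past patch hardened one code path, while a sibling function with the same flaw was never touched. Random fuzzing may never hit the escaping input, but reading the patch history makes the omission obvious.

    # Hypothetical example of an "incomplete fix": the bug class was
    # patched in one function, but a sibling code path was missed.
    import os

    ROOT = "/srv/app/uploads"

    def read_upload(name: str) -> bytes:
        # Patched after a past report: canonicalize the path and check
        # that it stays inside ROOT before reading.
        path = os.path.realpath(os.path.join(ROOT, name))
        if not path.startswith(ROOT + os.sep):
            raise ValueError("path escapes upload root")
        with open(path, "rb") as f:
            return f.read()

    def delete_upload(name: str) -> None:
        # The fix never reached this sibling: a name such as
        # "../../etc/passwd" still escapes ROOT.
        os.remove(os.path.join(ROOT, name))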

The details of how Claude Opus 4.6 found these zero-days are the interesting part—read the whole blog post.

News article.

Posted on February 9, 2026 at 7:04 AM

Comments

Javier Kohen February 9, 2026 7:34 AM

When I first used AI for a code refactoring about a year ago, it surprised me by rewriting seemingly unrelated portions of my handwritten code. I discovered later that it was fixing bugs in the business logic and error handling.

One traditional problem with AI has been that it is hard to explain its conclusions. As long as there has to be a human in the loop, and there certainly needs to be one in most cases still, we need to be partners with the AI. I’ve noticed the industry has been paying attention by making the models explain their thinking, which apparently improves both the outcome and our ability to interact with the AI.

Now that reminds me of a finding, I think by Anthropic, where a model explained that it performed a basic math calculation using the elementary-school method, but the researchers were able to look under the hood and confirm it was actually using a combination of methods, including heuristics. Not unlike how the actual human brain does small-number math in the real world. So AI could easily become that lazy co-worker who never wants to explain their real motivations for rewriting your code.

K.S February 9, 2026 7:47 AM

In the optimistic case for LLM assistants in coding, it will make all but senior coder positions obsolete by drastically increasing productivity. As a result, we will get better code faster and it will be reviewed/signed off on by competent people.

In the pessimistic case, LLMs will churn out code much faster and cheaper than any coder, so a lot of people in software development and QA will lose their jobs. The resulting code will be different: LLMs will make new categories of security and optimization mistakes that, after a short transition honeymoon period, will result in an exponential explosion of vulnerable systems.

Max February 9, 2026 9:02 AM

C’mon man. That article is just breathless speculation on top of another article that is breathless speculation on top of the original blog post by Anthropic (makers of Opus) themselves: https://red.anthropic.com/2026/zero-days/

Notably, out of the supposed hundreds of zero-days found, they include info for only three of them, and the OpenSC analysis in particular seems invalid, or at least it’s entirely unclear from the snippet posted if or how the vulnerability could even be exploited.

tfb February 9, 2026 9:35 AM

Someone wrote:

> In the optimistic case for LLM assistants in coding, it will make all but senior coder positions obsolete by drastically increasing productivity.

So, let’s see. How do you get to be good enough at programming to be a ‘senior coder’? Well, you start off by being bad at programming and you practice, a lot. How do you stay good at programming? You practice, a lot.

So in the ‘optimistic case’ all the non-‘senior coder’ jobs get eaten, and nobody new gets to practice enough to become really good. The people who are lucky enough to already be good enough not to get laid off also don’t practice any more and slowly lose their skills.

This is the ‘optimistic case’ in the same sense that there are optimistic cases for thermonuclear war.

Winter February 9, 2026 11:23 AM

@tfb

> In the optimistic case for LLM assistants in coding, it will make all but senior coder positions obsolete by drastically increasing productivity.

> So in the ‘optimistic case’ all the non-‘senior coder’ jobs get eaten, and nobody new gets to practice enough to become really good.

Reports about the death of coding are greatly exaggerated, in my view.

We have seen this happen time and again, starting from binary switches, to assembler, to FORTRAN & C, to 3rd- and 4th-generation languages. At every step programming got “easier”, that is, more abstract. Programmers could insert whole data structures and their associated algorithms with a single phrase. A complete running HTTP server is a single line of Python.
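For example, using nothing but the standard library (a minimal sketch; the port is arbitrary):

    # Serve the current directory over HTTP on port 8000.
    # Equivalent shell one-liner: python3 -m http.server 8000
    from http.server import HTTPServer, SimpleHTTPRequestHandler

    HTTPServer(("", 8000), SimpleHTTPRequestHandler).serve_forever()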

The result was that we wrote larger programs with many more bells and whistles.

What remains in all these layers of abstraction is that someone has to specify what the application is supposed to do, and select how it is supposed to do it.

LLMs will not change that. If what you want is a boilerplate problem, LLMs can give you boilerplate solutions. If you want a standard web site, an LLM will give you the little Python code you need for that. If you want it to have a standard WordPress backend, nothing could be easier than deriving it from the millions of WordPress instances that litter the interwebs.

But the user still has to specify what they actually want and how they want it to behave. It would indeed be a boon if we could do this in an understandable text format. But that text format still has to be specific to the aims of the application.

I think of LLM programming as just another, higher layer of compiler, working at an even more abstract level than we are used to.

Think of photography. It used to be that only a true artist could make a good likeness of a person or landscape. Photography made it possible for everyone to create likenesses of anyone or anything.

Now we employ more photographers for making images than we ever employed artists, because creating a good image still requires the eye of an artist.

Clive Robinson February 9, 2026 12:02 PM

@ Bruce, ALL,

It’s a puff/hype piece.

The clue is in the section you quote above:

> looking at past fixes to find similar bugs that weren’t addressed, spotting patterns that tend to cause problems, or understanding a piece of logic well enough to know exactly what input would break it.

It’s not really “finding anything new”.

It’s just finding new instances in existing classes by a random process.

I described on this blog quite some time ago how to do this, and it does not need current AI LLM and ML systems to do it.

But further, it’s very much a “database and rules” system; no “intelligence or reasoning” required.

As I’ve explained before, you have “instances in classes of vulnerability”.

When a new instance is found, it either falls into an existing class or creates a new class, so you end up with the vulnerability being either:

1, An unknown instance in a known class.
2, An unknown instance in an unknown class.

As you cannot have a known instance without a class for it to be in, that leaves, as by far the largest pool of vulnerabilities yet to be discovered, the second type: unknown instances in unknown classes.

There are two basic ways to discover such unknown vulnerabilities:

1, The luck of fuzzing.
2, The considered reasoning of someone who has an appropriate level of knowledge in the right area.

Now consider this: all instances of vulnerability are points on a line, plane, or space, depending on how you want to define things.

Around one or more points is an area of space that forms a class.

Thus you can see bubbles in the space: inside each bubble is the knowledge of what makes up the class, and outside it is a lot of space for either existing classes to expand into or new classes to be established.

Now, if you have sufficient knowledge about an existing class, you can expand it by selective fuzzing, or what some used to call “stressing”.
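A minimal sketch of that idea, assuming you already hold a seed input known to sit in or near an existing class (the names “mutate” and “selective_fuzz” are illustrative, not any real tool’s API):

    # Illustrative sketch: probe the neighbourhood of a known class by
    # mutating a seed input, rather than fuzzing completely at random.
    import random

    def mutate(seed: bytes, flips: int = 4) -> bytes:
        # Flip a few bits so candidates stay near the seed.
        # Assumes a non-empty seed.
        buf = bytearray(seed)
        for _ in range(flips):
            i = random.randrange(len(buf))
            buf[i] ^= 1 << random.randrange(8)
        return bytes(buf)

    def selective_fuzz(target, seed: bytes, rounds: int = 10_000):
        # 'target' is any parser or handler under test; any exception is
        # a candidate new instance in (or near) the seed's class.
        for _ in range(rounds):
            candidate = mutate(seed)
            try:
                target(candidate)
            except Exception as exc:
                yield candidate, exc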

The further away you are from any class, the more reasoning you have to do, but there are limits to this.

But whilst totally random fuzzing might find you a new vulnerability, it takes considerable knowledge and reasoning to turn it into an exploitable vulnerability.

Which leaves the issue of “reasoning”. Current AI systems do the “database and rules” aspect, and random perturbation comes as part of their functioning. So by chance they can find a vulnerability.

But they lack the method to do anything that is not already known.

Thus there will be a very large part of the problem space in which they will not be effective at finding actual new instances of attack, rather than variations on known attacks.

Consider why AlphaFold and similar systems function: they are in effect a database and rules, and thus they went very rapidly through combinations. Not too dissimilar to “playing a game” like chess or Go, but in essence “running on tracks”, not “roaming with agency”.

This “automating vulnerability discovery” is just a small part of the process of actually coming up with “known instances” and “known classes”, and does not require “intelligence”, artificial or otherwise.

As for expansion of known classes, it can use existing knowledge; reasoning it out is not necessary.

But the thing to note is that an LLM, unlike humans, will not do “new reasoning”, so it cannot move into the problem space until an experienced human who can reason it out opens up the “existing knowledge” and it gets put via ML into the LLM.

If anyone wants to write this up as a formal proof and publish it, “be my guest”.
