Anthropic's Project Glasswing Update

In April, Anthropic initiated Project Glasswing. The idea was to let companies use its new model to find and fix vulnerabilities in their own software. It was a fantastic PR move, and so many press outlets have uncritically parroted Anthropic’s claims that it’s now common wisdom that Mythos is better at finding software vulnerabilities than other models. Which is just not true.

In any case, Anthropic has published a Project Glasswing status report. It’s finding a lot of vulnerabilities in software—yay! Some of them are even dangerous. But almost none of them has been patched. It’s weird. There’s something fishy about the data that I don’t understand. That Anthropic refuses to release details—that it just says “trust us”—is a big problem here.

Tags: AI, patching, vulnerabilities

Posted on June 8, 2026 at 7:01 AM • 10 Comments

Comments

KC • June 8, 2026 9:07 AM

Davi, on his blog, appears to question why a “patch isn’t generated and attached to the disclosure in the first place… It looks like pressure to pay for protection.”

However, unlike Opus 4.7, I don’t think Mythos has patching functionality?

To add, in the context of open-source software, are there risks to “fast patching”?

Michael MacCartney • June 8, 2026 9:09 AM

I also find the timing of Project Glasswing and all the hype interestingly coincidental with their IPO.

Rontea • June 8, 2026 9:48 AM

Ah, the modern man worships his automaton and forgets his own hands! To find a defect without the cure is not intelligence, it is vaudeville. Mythos cannot ‘discover’ a bug without already holding its remedy in the shadow of its circuits. For the wound is known only in contrast to the healthy flesh; the bug is seen only because the model already dreams of the patch. And yet, like a bureaucrat of silicon, it lists and lists and lists, leaving the labor of healing to men who might have been dining with their families. 23,019 whispers, 75 patches: what a hymn to our era of mechanical vanity. Knowledge without repair is but a confession of impotence dressed as progress.

ricky222 • June 8, 2026 9:55 AM

I suspect the reason these identified security problems are not fixed is that the maintainers are overwhelmed by the sheer volume of them, let alone the naively generated PRs submitted by well-meaning project participants.

I’m not sure how to fix the overload problem. However, I am relatively certain that it’s not something one can automate, at least not with the current generation of LLMs.

Clive Robinson • June 8, 2026 10:45 AM

@ Bruce, ALL,

With regards,

“… it’s now common wisdom that Mythos is better at finding software vulnerabilities than other models. Which is just not true.”

From day one it was known that Mythos was,

“Quantity over quality”

All but a very tiny fraction of what Mythos found was already known to the developers and, in effect, of no consequence. Which is why nothing had been done about them… This is quite common not just in FOSS but in all software development in the ICT industry that is not of trivial size.

And when seen against the cost not of fixing but of actually testing, it is why we have a veritable tsunami of “technical debt”. It is what triage essentially does.

Which is why, we’ve seen your following notes,

“1, It’s finding a lot of vulnerabilities in software—yay!

2, Some of them are even dangerous.

3, But almost none of them has been patched.

4, It’s weird.

5, There’s something fishy about the data that I don’t understand.”

With regards the first point, I’ve discussed prior to Mythos/Glasswing getting trumpeted, why this would be so.

That is, current AI LLM and ML systems can only find “Known, Knowns” of instance and class of vulnerability that the ML found in the training set by statistical means.

Further the “stochastic” aspect of an LLM just adds what is in effect “a little fuzzing” so that “Unknown, Knowns” of instance and class can be found. Whilst some AI LLMs can chain these together, into what looks like a serious attack they mostly are not.

Think of an LLM working on a “Vector distance” basis. By simple logic it can be seen that in effect all the vulnerabilities found should already have been “found, fixed, and finished”. But they are not, because of management deciding during triage that the resources required would be better deployed addressing “new problematic” rather than “old nonproblematic” issues.

Similar was seen and said with regard to your other points before the Mythos “big wonder” hype came to journalists’ attention.

TexasDex • June 8, 2026 12:22 PM

I wonder if they’re having trouble getting through to all the FOSS security teams that are already inundated with AI-driven vulnerability reports.

Clive Robinson • June 9, 2026 1:02 AM

@ KC,

With regards,

“To add, in the context of open-source software, are there risks to “fast patching”?”

Newton observed that,

“For every action there is an equal and opposite reaction.”

It’s something that “Tangible Physical Object” engineers are usually made well aware of in their training.

Those same engineers are also often introduced to “statistical mechanics” that underlies not individual object behaviour but the behaviour of large collections of objects in a bound environment that has external stimuli. An extract from an introduction to the subject indicates the issue,

“A large part of this course will be devoted to figuring out the interesting things that happen when you throw 1023 particles together. One of the recurring themes will be that 1023 ≠ 1. More is different: there are key concepts that are not visible in the underlying laws of physics but emerge only when we consider a large collection of particles. One very simple example is temperature. This is not a fundamental concept: it doesn’t make sense to talk about the temperature of a single electron. But it would be impossible to talk about physics of the everyday world around us without mention of temperature. This illustrates the fact that the language needed to describe physics on one scale is very different from that needed on other scales. We’ll see several similar emergent quantities in this course, including the phenomenon of phase transitions where the smooth continuous laws of physics conspire to give abrupt, discontinuous changes in the structure of matter.”

https://www.damtp.cam.ac.uk/user/tong/statphys/statmechhtml/S1.html

Intangible Information objects within the environment of a system behave somewhat similarly, but it’s not something that is much talked about, especially the last part about “conspire to give abrupt, discontinuous changes”.

In part this is because we as of yet have no fundamental system of measures by which such things can be objectively reasoned about, and the systems are now just getting large enough to be in the transition zone from micro to macro so the properties are at best observed as emergent.

A modern APT attacker is looking to chain micro instances of vulnerabilities into macro effects.

A modern defender is thus stuck with a problem, in that any –even minor– change may have unpredictable macro consequences very different from predictable expected micro behaviours.

Such a situation requires “caution” as “simplified modelling” will most likely not provide the required macro effects.

Clive Robinson • June 9, 2026 1:18 AM

@ KC,

In my above it appears the “sup” HTML tags do not work…

So what appears as 1023 should actually be read as 10 ^ 23 (which is a slightly larger number in the grand scheme of things).

Weather • June 9, 2026 3:03 AM

@All

It takes 2 weeks to 3 months to find a bug, and a computer won’t find it any faster. Think brute-force password cracking: there are just too many iterations (cheers, autocomplete).

Alto • June 11, 2026 1:15 PM

@ Clive Robinson

1023 ≠ 1 is also true. It exceeded the number of atoms I could handle in the first computer model I wrote that simulated emergent properties like temperature and pressure. If you go to 10^23 there are additional emergent properties, like Etruscan shrews.

People ignore how weirdly large systems can behave. One atom is not alive, but 10^23 can be. I give short shrift to philosophers who posit that since one addition is not intelligent, no larger number can be.

Schneier on Security
