Introduction
On April 7, 2026, Anthropic unveiled its new AI model, “Mythos” (currently in private preview), which aims to secure software in the AI era.
Anthropic claims that AI has reached a turning point in cybersecurity. With the expected release of Mythos, AI models are poised to be capable of identifying and exploiting software vulnerabilities at a level that rivals and, in many cases, surpasses top human experts. Mythos has already uncovered thousands of high-severity vulnerabilities across major operating systems and browsers, signaling rapid acceleration in both capability and risk.
In the wake of the Mythos announcement, Anthropic quickly launched a coalition, Project “Glasswing,” named after the transparent-winged tropical butterfly. The project, which includes over 40 major technology organizations such as Apple, Google, Microsoft, and Nvidia, is critical to redirecting this vast new LLM power toward defense rather than exploitation.
A recent example shared by Anthropic highlights the leap in capability: while the Opus 4.6 model generated a working JavaScript shell exploit for a Firefox 147 vulnerability only twice across hundreds of attempts, Mythos achieved a dramatically higher success rate, producing a working exploit in 181 cases. That’s not a marginal gain; it’s a fundamentally different level of capability.
Figure: Firefox JS shell exploitation — percentage of trials in which each model generated a successful exploit vs. achieved register control (but could not exploit).
In this article, we break down the announcement, what it means for application security leaders, and recommendations you can apply as you move forward on your AI journey.
Why Did Anthropic Make This Announcement, and Why Now?
For years, many software vulnerabilities have gone undetected because identifying and exploiting them requires highly specialized expertise. With the rise of advanced AI models, the barriers of cost, effort, and skill have dropped dramatically, making both discovery and exploitation accessible, fast, and scalable. As you can see in Checkmarx’s own research below, the time to exploit a security vulnerability decreases dramatically with the power and adoption of AI.
Vulnerabilities that until recently took weeks, months, or even years to exploit can now be weaponized in a matter of minutes. This defines an entirely new reality for application security, and it needs to be a top priority for any head of security, head of engineering, and the entire executive team.
AI Speeds Weaponization of Vulnerabilities
Teams must now rush to investigate and determine which threats are most critical.
From Vulnerability to Exploitation
- The time between vulnerability disclosure and weaponization has essentially been eliminated.
- LLMs have been observed generating working CVE exploits in just 10–15 minutes at approximately $1 per exploit.
- By 2028, this is projected to drop to under one minute.
According to our annual Future of Application Security report, over 81% of organizations knowingly ship vulnerable code, driven by overwhelming noise, uncontextualized backlogs, and limited resources. This is just one of several AI-driven challenges AppSec leaders must now confront. In the next section, we break down the most critical ones.
The Challenges of Adopting Only AI Model-Based Security Solutions
As organizations accelerate toward AI-native development, the security landscape is shifting just as rapidly, and not always in predictable ways. New AI models are demonstrating an unprecedented ability to uncover vulnerabilities in existing codebases, including long-standing flaws that have gone undetected for years. At the same time, these models are dramatically lowering the barrier to exploitation, enabling faster weaponization of both known and unknown vulnerabilities. This creates a dual challenge: while discovery improves, the volume and velocity of risk increase just as quickly.
With that in mind, here are some of the key security challenges emerging in agentic development that enterprises must acknowledge:
- New models are uncovering large volumes of zero-days in older code.
- AI speeds weaponization of both known and unknown vulnerabilities.
- As much as 45% of AI-generated code may be insecure.
- LLMs miss vulnerabilities; coverage is incomplete.
- Models are trained on different sources, producing inconsistent results from one LLM to another.
- AI models are not comprehensive enough and lack context.

A great reference on this unevenness is the recent article on the “jagged frontier” of AI security.
Deterministic & Probabilistic AppSec for Known & Unknown Vulnerabilities
As the software landscape evolves into an agentic one and LLMs continue to advance, it’s critical to recognize that AI-driven security analysis alone is not sufficient. Models that generate and flag issues based on probabilistic reasoning must operate alongside deterministic systems grounded in real-world context, customer environments, true exploitability, policy enforcement, auditability, and full visibility. At enterprise scale, this also means supporting thousands of repositories, distributed teams, and highly interconnected systems.
As highlighted in Tomasz Tunguz’s “Jagged Frontier of AI Security” article referenced above, AI capabilities are not linear; they are inconsistent and context-dependent. While models like Mythos can demonstrate breakthrough performance in discovering and exploiting unknown vulnerabilities, similar outcomes can often be reproduced by smaller models when given the right inputs. At the same time, known vulnerabilities, often buried in backlogs and lacking prioritization, remain a significant and weaponizable risk in the age of AI.
In this uneven reality, some vulnerabilities are identified with high precision, while others are missed entirely. This leads to false confidence, inconsistent outputs, and critical gaps in risk coverage. If detection isn’t consistent, it isn’t trustworthy.
This is where a hybrid model becomes essential: AI, with its probabilistic reasoning, provides speed and scale, but it must be complemented by a deterministic security layer that validates findings based on context and real exploitability. This is where we are focused. Brought together, probabilistic and deterministic approaches establish a new standard for agentic application security, one that delivers high-fidelity, actionable results at scale.
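To make the hybrid idea concrete, here is a minimal, hypothetical sketch of how a probabilistic signal (an LLM-derived confidence score) could be gated by deterministic checks (reachability, deployment context) before a finding ever reaches a developer. The `Finding` fields, threshold, and verdict labels are illustrative assumptions for this article, not a description of any vendor’s actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    rule_id: str
    file: str
    llm_confidence: float   # probabilistic: LLM-based scanner score (0..1)
    reachable: bool         # deterministic: sink reachable from an entry point?
    internet_facing: bool   # deterministic: deployment context of the service

def validate(finding: Finding, min_confidence: float = 0.5) -> str:
    """Combine probabilistic and deterministic signals into a triage verdict."""
    if finding.llm_confidence < min_confidence:
        return "discard"    # too noisy to surface at all
    if not finding.reachable:
        return "backlog"    # plausible pattern, but not exploitable in context
    if finding.internet_facing:
        return "fix-now"    # exploitable and externally exposed
    return "review"         # exploitable internally; needs a human decision

f = Finding("sql-injection", "api/orders.py",
            llm_confidence=0.92, reachable=True, internet_facing=True)
print(validate(f))  # fix-now
```

The design point is that the LLM score alone never produces a “fix-now” verdict; deterministic context always gets the final say, which is what keeps the output auditable.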
Making the case for Agentic Triage & Remediation
We’ve established that LLM-driven security, which uncovers a wide range of unknown vulnerabilities, compounds an already unmanageable backlog of known issues (one that leaves 81% of organizations exposed), and that this requires a hybrid approach blending probabilistic and deterministic analysis. But that alone is not enough.
The sheer volume of vulnerabilities now demands agentic triage and remediation. Manual processes cannot keep up; they fail to provide context, prioritize effectively, or resolve risk with confidence at scale.
This is where AI agents become critical. By automatically performing intelligent triage to eliminate noise and prioritize truly exploitable risk, and by driving fast, automated remediation, they bring together reasoning and precision. The result is security that is not only scalable, but truly actionable in an AI-native development environment.
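The triage half of that loop can be sketched in a few lines. The following is a hypothetical illustration of ranking a vulnerability backlog by exploitability signals; the weights, field names, and sample records are invented for this example and not a published scoring formula.

```python
# Hypothetical sketch: score a backlog of findings by exploitability
# signals, then emit a prioritized work queue for remediation agents.

def risk_score(finding: dict) -> float:
    # Weights are illustrative assumptions, not a standard formula.
    score = finding["severity"]          # 0..10 base severity (CVSS-like)
    if finding["exploit_public"]:
        score *= 2.0                     # known weaponization boosts urgency
    if not finding["reachable"]:
        score *= 0.1                     # deterministic context down-ranks noise
    return score

def triage(backlog: list[dict], capacity: int = 2) -> list[str]:
    """Return the IDs of the highest-risk findings, up to team capacity."""
    ranked = sorted(backlog, key=risk_score, reverse=True)
    return [f["id"] for f in ranked[:capacity]]

backlog = [
    {"id": "CVE-A", "severity": 9.8, "exploit_public": True,  "reachable": True},
    {"id": "CVE-B", "severity": 7.5, "exploit_public": False, "reachable": False},
    {"id": "CVE-C", "severity": 6.1, "exploit_public": True,  "reachable": True},
]
print(triage(backlog))  # ['CVE-A', 'CVE-C']
```

Note how the unreachable CVE-B drops below a lower-severity but exploitable finding: context, not raw severity, drives the queue.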
Final Thoughts & Recommendations
The Mythos announcement and the formation of Project Glasswing mark a major milestone in AI-driven security, but they are not, and cannot be, a standalone solution. As outlined above, AI models both amplify existing risks and expose new ones, creating challenges that require a broader, more integrated approach.
To build a truly enterprise-grade, trustworthy AI security program, we recommend the following steps:
- Hybrid AppSec Model: Combine deterministic precision with probabilistic AI to cover both known and unknown risk.
- Agentic Triage & Remediation: Leverage AI agents to scale context-aware triage and accelerate remediation.
- Shift Left to the Source: Identify and fix AI-generated vulnerabilities at code creation, before they reach production.
Learn more: https://checkmarx.com/product/application-security-platform/
Tags:
Agentic AI
Claude Mythos
security