On April 7, Anthropic previewed Claude Mythos and published a red-team report at red.anthropic.com with the concrete numbers. Two weeks later, the story is landing harder in the press as the implications sink in. Mythos is not being released generally. It is going to a limited set of Project Glasswing partners (40 organizations in total, among them Amazon, Apple, Broadcom, Cisco, CrowdStrike, the Linux Foundation, Microsoft, and Palo Alto Networks), plus select open-source security groups including OpenSSF, Alpha-Omega, and the Apache Software Foundation. Anthropic committed $100M in usage credits to the industry partners and $4M to the open-source groups.
The benchmarks are the part that changed the conversation. On Firefox JavaScript exploit development, Opus 4.6 produced 2 working exploits across several hundred attempts, a near-zero success rate. Mythos produced 181 working exploits and achieved register control on 29 additional attempts. On the OSS-Fuzz corpus (1,000 repositories, roughly 7,000 fuzz entry points), Opus 4.6 and Sonnet 4.6 each landed 150 to 175 tier-1 crashes, around 100 tier-2 crashes, and a single tier-3 crash. Mythos landed 595 crashes across tiers 1 and 2, a handful at tiers 3 and 4, and 10 tier-5 findings, each representing full control-flow hijack. Anthropic's vulnerability list includes a 27-year-old SACK TCP bug in OpenBSD yielding remote DoS, a 16-year-old H.264 codec vulnerability in FFmpeg (a codebase more than 20 years old), a 17-year-old unauthenticated RCE in FreeBSD NFS (CVE-2026-4747), Linux kernel privilege escalation chains combining 2 to 4 vulnerabilities, JIT heap-spray exploits with sandbox escapes in every major browser, and weaknesses across TLS, AES-GCM, and SSH implementations. Anthropic says over 99% of the findings remain unpatched, and under 1% can be discussed publicly.
The economics are the part to sit with. Finding the specific 27-year-old OpenBSD bug cost under $50 in compute. Running a thousand-exploration campaign against OpenBSD cost under $20,000 total. The FFmpeg findings cost roughly $10,000 combined. N-day exploit development runs $1,000 to $2,000 per functional exploit. For the first time, the cost curve of finding critical vulnerabilities in foundational software has dropped below the cost curve of patching and deploying fixes. Validation data backs this up: in 89% of the 198 manually reviewed Mythos reports, the reviewers' severity rating matched Claude's own assessment exactly, and in 98% it was within one severity level. This is not hallucinated vulnerability theater. Anthropic's response is to not ship the model generally: Mythos Preview stays inside Project Glasswing, with general availability explicitly ruled out.
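To make those figures concrete, here is a minimal back-of-the-envelope sketch in Python. It uses only numbers quoted above. The dollar amounts are upper bounds ("under $X"), the report counts are approximations recovered from rounded percentages, and the function and constant names are illustrative, not taken from Anthropic's report.

```python
# Figures quoted in the text. Dollar amounts are upper bounds ("under $X").
CAMPAIGN_COST_USD = 20_000    # whole OpenBSD campaign
CAMPAIGN_RUNS = 1_000         # "thousand-exploration campaign"
REVIEWED_REPORTS = 198        # manually reviewed Mythos reports
EXACT_MATCH_RATE = 0.89       # severity matched exactly
WITHIN_ONE_LEVEL_RATE = 0.98  # severity within one level


def cost_per_run(total_usd: float, runs: int) -> float:
    """Average spend per exploration run."""
    return total_usd / runs


def approx_count(rate: float, total: int) -> int:
    """Nearest whole number of reports implied by a rounded percentage."""
    return round(rate * total)


per_run = cost_per_run(CAMPAIGN_COST_USD, CAMPAIGN_RUNS)
exact = approx_count(EXACT_MATCH_RATE, REVIEWED_REPORTS)
within_one = approx_count(WITHIN_ONE_LEVEL_RATE, REVIEWED_REPORTS)

print(f"Average cost per run: under ${per_run:.0f}")
print(f"Exact severity match: about {exact} of {REVIEWED_REPORTS}")
print(f"Within one level: about {within_one} of {REVIEWED_REPORTS}")
print(f"Off by more than one level: about {REVIEWED_REPORTS - within_one}")
```

Under these assumptions the campaign averages out to under $20 per exploration run, and only a handful of the reviewed reports missed the reviewers' severity rating by more than one level.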
Three takeaways if you ship or depend on software with real attack surface. One, the 99% unpatched figure is a soft admission that existing coordinated disclosure throughput cannot absorb what this capability finds. If you maintain a package with CVE history, get your response pipeline in shape before equivalent capabilities show up outside preview access. Two, Mythos's success across old code (bugs that sat for 27, 17, and 16 years) suggests the AI-assisted audit case is not about sexy new protocols; it is about boring mature ones that nobody has reviewed in a decade. Three, withholding as safety is a precedent, not a solution. Anthropic has chosen not to release this model generally, but the capability curve is the capability curve. The gap between now and commodity availability of equivalent models is a grace period, not a guarantee.
