Anthropic published a Project Glasswing update this week reporting that Claude Mythos โ its autonomous vulnerability-discovery LLM introduced April 2026 โ has identified 10,000+ flaws across 1,000+ open-source projects in roughly two months. The full numbers: 23,019 total issues, 6,202 rated high or critical severity, 1,752 validated to date with a confirmed true-positive rate exceeding 90%. Co-scanning partners include Cloudflare, Mozilla and others. Disclosure follows a 90-day embargo coordinated with maintainers โ the wolfSSL findings, for example, are patched but details remain withheld. For builders shipping anything on top of OSS dependencies, the takeaway is concrete: a non-trivial portion of the supply chain just got mass-audited, and the next 90 days will reveal the patch trail.
The architectural framing is deliberately thin in the release. Anthropic doesn't disclose whether Mythos is a standalone model, an agent harness, or a Glasswing-specific composition โ only that it scans, validates, and produces exploits autonomously. The 90%+ true-positive rate on 1,752 validated issues is the headline number worth focusing on; that's the rate above which a tool stops generating busywork and starts generating actual remediation queues. Compare with what Microsoft shipped earlier this week โ MDASH, a 100+ specialized-agent pipeline (scan/debate/validate/dedupe/exploit) scoring 88.45% on CyberGym at 1,507 real-world vulns โ and you have two of the biggest US AI labs landing autonomous-vulnerability-research releases within seven days of each other. The agent-driven vuln research category is real and is now competing in public, both internally-tested at very large scale (Microsoft on Windows/Hyper-V/Azure, Anthropic on 1,000+ OSS projects).
The access and safety framing is the part that builders need to read carefully. Mythos access today is partner-gated through Project Glasswing (AWS, Apple, Google, Microsoft, etc.) plus a public beta of "Claude Security" for enterprise customers. Anthropic explicitly states "no company has developed safeguards strong enough to prevent such models from being misused" and is holding "Mythos-class models" back pending stronger safeguards. That's an unusually direct admission. The honest read is that the same model that finds 10K vulns can also be used to write exploits at the same rate โ the partner-gating is the friction layer until alignment improves. For independent security researchers and small-shop builders, that means waiting; for enterprise security teams already on Glasswing or Claude Security, the capability is here now.
Monday morning: if you maintain an open-source project of any size, expect coordinated-disclosure email traffic over the next 90 days from Mythos-discovered findings โ Cloudflare and Mozilla are already in the loop. If you ship a product downstream of OSS dependencies, build the assumption into your patch cadence: the mass-audit of the supply chain is happening now, and the long tail of patches will land continuously over the rest of Q2. If you're evaluating autonomous-vuln-research tooling for your own pipeline, Mythos (gated) and MDASH (private preview) are the two reference points published this week โ the architectural pattern (autonomous scan-validate-exploit pipelines) is reproducible from public details even without access to either platform. The honest unaddressed question: how the 23,019 issues break down by category (memory safety, injection, auth bypass, logic bugs), since the public release only discloses the severity tier. Class-level data would let builders prioritise their own scanning.
