Anthropic's Claude Mythos Preview can autonomously find zero-day vulnerabilities and build working exploits across major operating systems and browsers, marking a dramatic capability jump from previous AI models. In internal testing against vulnerabilities in Firefox 147's JavaScript engine, Mythos Preview produced 181 working exploits where its predecessor Opus 4.6 managed just 2. The model achieved complete control-flow hijack on 10 separate targets from the OSS-Fuzz corpus, compared to Opus 4.6's single success across 7,000 entry points.

This isn't incremental progress—it's a phase change that collapses the traditional gap between vulnerability discovery and exploitation. I've written about Anthropic's agent work before, and this confirms they're building systems that operate autonomously at a level we haven't seen. The researchers didn't explicitly train for these capabilities; they emerged from general improvements in reasoning and code understanding. That's both remarkable and concerning: capabilities nobody designed for are appearing as side effects of making models smarter.

The security implications are immediate and serious. When professional security contractors reviewed 198 of the model's findings, they agreed with the severity assessment 89% of the time. The model found a 27-year-old denial-of-service vulnerability in OpenBSD's TCP SACK implementation, proving it can discover bugs that human auditors missed for decades. Anthropic is limiting access to "critical industry partners and open source developers," but this capability will inevitably spread to other models.

For anyone building with AI: this changes the security landscape permanently. The same reasoning improvements that make these models better at helping you write code also make them dramatically better at breaking it. We're entering an era where automated exploit generation operates at machine speed and scale.