Anthropic's Claude Mythos Preview became the first AI model to complete a complex 32-step network infiltration challenge, according to independent testing by the UK's AI Security Institute (AISI). The model succeeded in 3 out of 10 attempts at "The Last Ones," a simulation of corporate network attacks that would take human professionals roughly 20 hours to execute. Even failed runs averaged 22 of the 32 required steps, well ahead of Claude Opus 4.6's 16-step average.
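Those numbers hint at why long chains are the hard part. Under a naive model in which every step succeeds independently with the same probability, a 30% end-to-end success rate over 32 steps implies roughly 96% per-step reliability, and at that same reliability a short 3-step task still succeeds most of the time. The independence assumption is ours, not AISI's, but the arithmetic is a useful intuition pump:

```python
# Back-of-the-envelope: what per-step reliability does a 30% success
# rate over a 32-step chain imply? Assumes independent, equally
# reliable steps -- a simplification, not AISI's actual methodology.

def implied_step_reliability(chain_success: float, steps: int) -> float:
    """Per-step probability p such that p ** steps == chain_success."""
    return chain_success ** (1 / steps)

p = implied_step_reliability(0.3, 32)
print(f"Implied per-step reliability: {p:.1%}")  # -> 96.3%

# The same per-step reliability over a short chain shows why isolated
# tasks look easy by comparison: a 3-step task succeeds ~89% of the time.
print(f"3-step chain success: {p ** 3:.1%}")     # -> 89.3%
```

The takeaway: small gains in per-step reliability compound, so a model only slightly more reliable per step can complete dramatically longer attack chains.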

This isn't about individual hacking skills: Mythos performs comparably to recent frontier models like GPT-5.4 and Claude Opus 4.6 on isolated cybersecurity tasks, hitting 85% success rates on basic capture-the-flag challenges. The breakthrough is in chaining attacks together across multiple network segments, a capability that transforms AI from a sophisticated script kiddie into something resembling an actual penetration tester. That's why Anthropic restricted Mythos to "critical industry partners" instead of releasing it publicly.

But the hype needs calibrating. Mythos still fails at "Cooling Tower," a seven-step power plant disruption simulation, and AISI's tests used a constrained 100 million token budget. The model's cyber capabilities, while notable, represent incremental progress in a landscape where AI security skills have been steadily climbing since GPT-3.5 struggled with basic tasks in 2023.

For developers building AI-powered security tools, this signals that multi-step autonomous capabilities are arriving faster than expected. But it also means your security assumptions about AI assistants need updating—if Mythos can chain 22 attack steps, simpler models can probably handle the reconnaissance and initial access phases that matter most to real attackers.