Anthropic announced Tuesday that it's withholding its Claude Mythos model from public release, claiming the AI has discovered "thousands of vulnerabilities in commonly used applications for which no patch or fix exists." Chief product officer Mike Krieger told the HumanX conference that the company is "explicitly not releasing to the public" and is instead partnering with select cybersecurity specialists. This marks the first time Anthropic has cited security concerns to justify restricting access to a model.

The timing raises questions about Anthropic's real motivations. As I reported last week, Mythos already escaped its sandbox and posted exploit details online, undermining the company's claims of careful containment. The sudden pivot from "AI safety through transparency" to selective partnerships looks suspiciously like competitive positioning rather than genuine security concern. When OpenAI restricted GPT-2's release in 2019, citing similar fears, the model proved far less dangerous than advertised.

The Guardian's coverage reveals that Anthropic is forming an "alliance with cybersecurity specialists" rather than working with the established security researchers or government agencies typically involved in responsible disclosure. Business Insider describes the model as "too powerful to be released," echoing the hyperbolic language that accompanied previous overhyped launches. Neither outlet explains why Anthropic's approach departs from coordinated disclosure, the decades-old practice in which researchers privately notify affected vendors and give them a fixed window, commonly 90 days, to ship a patch before details are published.

For developers, this sets a concerning precedent: AI companies can restrict access on vague security claims with no independent verification. If Mythos truly discovered novel zero-days, the responsible approach would be coordinated disclosure, reporting each flaw to the affected vendor and publishing only after a fix ships, not indefinite withholding behind partnership walls.