Google DeepMind researchers used AlphaEvolve, an LLM-powered coding agent, to solve two previously unsolved problems in complexity theory. The system improved the state-of-the-art inapproximability bound for MAX-4-CUT and tightened bounds on the average-case hardness of certifying properties of random graphs. Unlike typical AI math assistance, AlphaEvolve iteratively evolved code snippets through feedback loops, generating mathematical constructions that could be verified automatically, without human oversight.

This represents a shift from AI as a research assistant to AI as an active discovery partner in theoretical computer science. Most LLM math milestones so far have come in competitive programming or in solving known problems, not in proving novel theorems. The key insight is using code evolution rather than direct proof generation: letting the AI iterate on algorithmic approaches until it finds structures that satisfy the mathematical constraints. It's a clever workaround for LLMs' reliability problems in formal mathematics.
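AlphaEvolve's internals are not public, so the following is only a minimal sketch of the evolve-and-verify pattern described above. A toy MAX-CUT score stands in for the automated checker, a random bit flip stands in for the LLM's code-rewriting step, and the graph, `verify`, and `propose_variant` are all illustrative names, not anything from the paper:

```python
import random

# Toy "verifier": score a candidate 2-coloring by how many edges of a
# fixed 5-edge graph it cuts. This plays the role of the automatic check
# that confirms a construction satisfies a stated mathematical constraint.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

def verify(coloring):
    return sum(coloring[u] != coloring[v] for u, v in EDGES)

def propose_variant(parent, rng):
    # Stand-in for the LLM step: AlphaEvolve rewrites code snippets,
    # here we merely flip one bit of the candidate.
    child = parent[:]
    child[rng.randrange(len(child))] ^= 1
    return child

def evolve(generations=200, seed=0):
    rng = random.Random(seed)
    best = [0, 0, 0, 0]
    best_score = verify(best)
    for _ in range(generations):
        child = propose_variant(best, rng)
        child_score = verify(child)   # the feedback loop: only candidates
        if child_score > best_score:  # that verify better survive
            best, best_score = child, child_score
    return best, best_score

best, score = evolve()
print(best, score)
```

This greedy single-candidate loop can stall at local optima; a real system would keep a population and use far richer mutation operators. The structural point is that every surviving candidate has passed the verifier, so no human needs to check intermediate steps.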

What's missing from Google's announcement is an honest discussion of limitations. How many problems did AlphaEvolve fail to solve? What's the success rate? The paper highlights two victories without revealing the broader experimental scope. We also don't see code or reproducible examples, which matters for a tool supposedly advancing open mathematical knowledge. And the "automatic verification" claim deserves scrutiny: a computational check can confirm that a construction satisfies the constraints it was given, but deciding whether those constraints actually establish the claimed theorem still requires human judgment.

For developers, this suggests a promising direction: using LLMs to evolve candidate solutions against an automated scoring function rather than generating them directly. The iterative feedback approach could work for optimization problems well beyond pure mathematics. But don't expect to download AlphaEvolve: Google hasn't announced public availability, and the computational requirements likely make this enterprise-only for now.
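The transferable pattern is just this: any problem with a cheap, trustworthy scoring function can drive the same loop. As a generic sketch (the retry-backoff cost model is entirely hypothetical, and a random parameter tweak stands in for an LLM proposal step):

```python
import random

def evolve(score, mutate, seed_candidate, generations=100, pop_size=8):
    """Generic evolve-and-score loop: keep a small population, mutate the
    current best, and let the objective function supply all feedback.
    `mutate` is where an LLM proposal step would slot in."""
    rng = random.Random(0)
    population = [seed_candidate]
    for _ in range(generations):
        parent = max(population, key=score)
        population.append(mutate(parent, rng))
        population.sort(key=score, reverse=True)
        del population[pop_size:]  # cull the weakest candidates
    return max(population, key=score)

# Hypothetical objective far from pure math: pick retry-backoff parameters
# that balance total wait time against hammering a service (toy cost model).
def cost(params):
    base, factor = params
    waits = [base * factor ** i for i in range(5)]
    return -(sum(waits) + 50 / base)  # higher is better

def tweak(params, rng):
    base, factor = params
    return (max(0.1, base + rng.uniform(-0.2, 0.2)),
            max(1.0, factor + rng.uniform(-0.2, 0.2)))

best = evolve(cost, tweak, seed_candidate=(1.0, 2.0))
```

Because the best candidate is never culled, the loop can only hold or improve the score; how far it gets depends entirely on how good the proposal step is, which is exactly the niche an LLM is meant to fill.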