DeepMind dropped a year-one impact report this week on AlphaEvolve, the Gemini-powered coding agent it introduced in May 2025 to discover and optimize algorithms autonomously. The results are unusually concrete: a 30% error reduction in DNA sequencing variant detection, feasible solutions on electricity grid optimization up from 14% to 88%, a 10× error reduction on Google's Willow quantum processor, and counterintuitive TPU circuit designs that ended up in silicon. For anyone wondering whether "AI does science" was marketing or substance, this is the substance side.
AlphaEvolve is an agentic system built on Gemini that discovers algorithms by iteratively generating, evaluating, and refining candidate solutions against a defined fitness function. In shape it is closer to evolutionary search wrapped in LLM proposal/critique than to chain-of-thought reasoning. The original 2025 paper described the mechanism; this week's release reports the deployed results. Specific gains: Spanner write amplification down 20%, compiler footprint down 9%, and improved Traveling Salesman Problem bounds; mathematician Terence Tao also collaborated with it on Erdős problems and Ramsey number bounds. Commercial customers: Klarna doubled transformer training speed; FM Logistic saved 15,000 km/year through routing optimization (a 10.4% efficiency gain); WPP got 10% better campaign modeling accuracy; Schrödinger got a 4× speedup in ML force field training and inference. Google Cloud is the access path: there is no open-source release and no paper update, just a report on deployments of the methodology published last year.
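The generate/evaluate/refine loop is easier to picture with a toy version. What follows is a minimal sketch of evolutionary search against a fitness function, not AlphaEvolve's implementation or API. The point set, the `propose` step (a random segment reversal standing in for an LLM proposing a variant), and all parameter values are assumptions made up for illustration.

```python
import random

# Hypothetical toy problem: find a short closed tour through fixed 2-D points.
POINTS = [(0, 0), (3, 0), (3, 4), (0, 4), (1, 2), (2, 2)]


def tour_length(order):
    """Fitness function: total length of the closed tour (lower is better)."""
    total = 0.0
    for i in range(len(order)):
        x1, y1 = POINTS[order[i]]
        x2, y2 = POINTS[order[(i + 1) % len(order)]]
        total += ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
    return total


def propose(parent, rng):
    """Stand-in for the LLM proposal step: reverse a random segment of the tour."""
    i, j = sorted(rng.sample(range(len(parent)), 2))
    return parent[:i] + parent[i:j + 1][::-1] + parent[j + 1:]


def evolve(generations=200, population_size=8, seed=0):
    """Generate candidates, score them, keep the best, and repeat."""
    rng = random.Random(seed)
    population = [list(range(len(POINTS)))]  # start from the naive ordering
    for _ in range(generations):
        # Generate: propose variants of the current survivors.
        children = [propose(rng.choice(population), rng)
                    for _ in range(population_size)]
        # Evaluate and refine: rank everything by fitness, keep the top few.
        population = sorted(population + children, key=tour_length)[:population_size]
    return population[0]


best = evolve()
print(best, round(tour_length(best), 3))
```

Because survivors are always carried forward, the best score never gets worse from one generation to the next. In a system of the kind the report describes, the interesting part is the proposal step, where a language model suggests edits instead of a random mutation; the surrounding loop stays roughly this simple.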
Coding agents have mostly been positioned as developer productivity tools (Claude Code, Cursor, GitHub Copilot) that generate code in service of human-defined problems. AlphaEvolve is in a different category: the human defines the problem, and the agent searches the algorithm space until it converges on something better than what existed. Most of the gains in this report come from problems that already had a well-tuned solution, where AlphaEvolve found a better one. The TPU circuit designs being "counterintuitive" and shipping in silicon is the strongest signal that this isn't stitching together known tricks. For research labs, the implication is that algorithmic improvement may not stay the exclusive province of human researchers on problems with clean fitness functions. For everyone else: 20% lower Spanner write amplification, 10× cleaner quantum operations, and 30% fewer sequencing errors quietly compound into things that change downstream products without ever being announced.
AlphaEvolve is API-gated through Google Cloud, not open-source. The interactive gallery at alphaevolve-examples.web.app shows concrete cases without account requirements. If you have a hard optimization problem with a measurable fitness function (kernel-level performance, routing, circuit design, drug screening), this is the agent shape worth watching. If you're doing knowledge work where success is subjective, this isn't your tool. The bigger pattern to track: AlphaEvolve and OpenAI's recent ML-research-automation claims point in the same direction (agents doing algorithmic work, not just plumbing), and that's likely the next frontier of the agent race beyond "write me a Python script."
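To make "measurable fitness function" concrete, here is a rough sketch of what an evaluator for a performance problem could look like: a correctness gate first, then a timing score. This is an illustrative assumption about the general shape of such an evaluator, not anything taken from AlphaEvolve or Google Cloud. The function names, test cases, and scoring convention are all hypothetical.

```python
import time


def evaluate(candidate, test_cases, repeats=5):
    """Score a candidate: -inf if it is wrong or crashes, else negative best runtime."""
    # Correctness gate: a fast but wrong candidate must never win.
    for args, expected in test_cases:
        try:
            if candidate(*args) != expected:
                return float("-inf")
        except Exception:
            return float("-inf")
    # Timing: take the best of several runs to reduce noise.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for args, _expected in test_cases:
            candidate(*args)
        best = min(best, time.perf_counter() - start)
    return -best  # higher score means faster


# Three hypothetical candidates for "sum the integers from 1 to n".
def sum_loop(n):
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def sum_closed_form(n):
    return n * (n + 1) // 2


def sum_wrong(n):
    return n * n // 2  # fast, but incorrect


TEST_CASES = [((10,), 55), ((1000,), 500500), ((100000,), 5000050000)]

for fn in (sum_loop, sum_closed_form, sum_wrong):
    print(fn.__name__, evaluate(fn, TEST_CASES))
```

The design point is that the score is computed mechanically, with no human judgment in the loop. That is what separates the problems listed above from knowledge work where success is subjective.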
