Snowflake signed a five-year, $6B commitment for AWS Graviton ARM-based CPU capacity. The strategic split named in the announcement is "while GPUs handle training and reasoning, CPUs handle most of the rest of the tasks associated with AI, particularly agents." That is the builder-relevant story under the headline number: hyperscale customers are buying CPU at supply-commitment scale because, on this reading, agent workloads are CPU-bound rather than GPU-bound. Averaged evenly over the term, $6B works out to about $1.2B per year, which puts ARM CPUs in the same procurement category as the GPU contracts that dominate the AI capex conversation.

The architectural reason ARM CPUs match the agent stack is workload shape. Most of an agent's wall-clock time goes to tool dispatch, retrieval orchestration, JSON parsing, validation logic, prompt assembly, and the dozen-step state machine that wraps one LLM inference call. The single inference call wants GPU memory bandwidth; the eleven steps around it want low-latency CPU cycles at scale. AWS Graviton's price-performance positioning has been validated across general server workloads for years, and the agent stack is where the same economics start to apply to AI-tagged spend. The Snowflake commitment is also a Cortex AI signal: Snowflake's text-interface-to-database product is the kind of agent workload that lives mostly on CPU with intermittent GPU calls.

The ecosystem read for builders: the cloud-CPU vs Nvidia-GPU framing in the press is the wrong dichotomy. The right read is "agents are CPU-heavy with GPU bursts," and the ratio you observe depends on which step of the agent loop you instrument. The hyperscalers (AWS Graviton, Azure Cobalt, Google Axion) are positioning ARM as the substrate for the CPU-heavy share of AI spend, which on this argument is structurally larger than the GPU-heavy share for any application past simple chat. The announcement includes no head-to-head benchmarks against Nvidia GPUs on agent-loop wall-clock time, and that is the methodology gap to flag: the argument is economic and architectural, not benchmark-validated. Snowflake's $6B commitment is a vote that the economic case is strong enough to procure on without waiting for public benchmarks.

If you build agent infrastructure, the Monday-morning action is to measure your actual CPU-to-GPU ratio in agent wall-clock time, then pick instance types accordingly. The "AI workload = GPU instance" assumption costs money on agent-heavy services. If you sell agent platforms, the economics conversation with enterprise customers is shifting from raw per-token inference cost to the total agent-loop compute mix, and ARM CPU pricing is part of that pitch.
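One way to take that measurement is to wrap each step of the agent loop in a timer and label it with the resource it mostly occupies. The sketch below is a minimal illustration under stated assumptions, not a reference implementation: `LoopProfiler`, the step names, and the "cpu"/"gpu" labels are all hypothetical, and attribution comes from the label you assign rather than from hardware counters. If inference runs on a remote endpoint, the "gpu" bucket also absorbs network and queue time, so treat the result as a first-pass ratio.

```python
import json
import time
from collections import defaultdict
from contextlib import contextmanager


class LoopProfiler:
    """Accumulates wall-clock seconds per step and per resource label.

    Hypothetical helper: labels such as "cpu" and "gpu" are assigned by
    the caller; nothing here inspects real hardware utilization.
    """

    def __init__(self, clock=time.perf_counter):
        self.clock = clock                      # injectable for testing
        self.by_resource = defaultdict(float)   # label -> seconds
        self.by_step = defaultdict(float)       # step name -> seconds

    @contextmanager
    def step(self, name, resource):
        start = self.clock()
        try:
            yield
        finally:
            elapsed = self.clock() - start
            self.by_resource[resource] += elapsed
            self.by_step[name] += elapsed

    def shares(self):
        """Return each resource label's fraction of total measured time."""
        total = sum(self.by_resource.values())
        if total == 0:
            return {}
        return {label: secs / total for label, secs in self.by_resource.items()}


if __name__ == "__main__":
    profiler = LoopProfiler()

    # Stand-in agent loop: the work inside each block is a placeholder.
    with profiler.step("prompt_assembly", "cpu"):
        prompt = "Summarize: " + "lorem ipsum " * 1000
    with profiler.step("inference", "gpu"):
        time.sleep(0.01)  # placeholder for a real model call
        raw = json.dumps({"tool": "search", "args": {"q": "example"}})
    with profiler.step("json_parsing", "cpu"):
        action = json.loads(raw)
    with profiler.step("validation", "cpu"):
        assert action["tool"] in {"search", "lookup"}

    for label, share in sorted(profiler.shares().items()):
        print(f"{label}: {share:.1%}")
```

Keeping the per-step totals in `by_step` alongside the per-resource totals matters because the ratio shifts depending on which steps you instrument; a loop dominated by one slow retrieval call will look very different from one dominated by parsing and validation.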