Alibaba's T-Head shipped Zhenwu M890, an AI accelerator explicitly framed as built for agent workloads: long context, real-time model-to-model coordination, and multi-step task execution with limited human intervention. T-Head claims 3x the performance of the predecessor, Zhenwu 810E. It reports 560,000+ Zhenwu units shipped to date and 400+ external customers across 20 industries, including automotive and financial services. The chip is available through Alibaba Cloud's Bailian platform, with rack-scale delivery in the Panjiu AL128 (128 M890 accelerators per rack). Roadmap: M890 now, V900 in Q3 2027 (another ~3x expected), J900 in Q3 2028. Alibaba simultaneously released Qwen 3.7-Max, claimed to operate continuously for up to 35 hours on agent tasks without performance degradation. Process node, FLOPs, memory bandwidth, and NVIDIA H100/H200 comparison numbers are not disclosed in the announcement.

Agent-targeted silicon is now a discrete hardware category. NVIDIA shipped Vera on May 17 (88 Olympus cores, 1.2 TB/s memory bandwidth, the same "built for agents" framing) to Anthropic, OpenAI, SpaceXAI, and Oracle. Alibaba ships Zhenwu M890 today with the same thesis. The shared technical claim: agentic workloads stress different parts of silicon than dense inference. They are memory-bandwidth bound (long context, large tool-call traces). They need fast inter-accelerator communication (multi-model coordination). They demand sustained throughput over hours rather than seconds (the 35-hour Qwen 3.7-Max number). The Panjiu AL128 packaging, 128 accelerators per rack, is the system architecture for that workload class: rack-level coordination is the unit of deployment, not single-card inference. The concrete deployment numbers (560K units, 400+ customers) put the Zhenwu line past pilot stage. The long roadmap (V900 2027, J900 2028) is the bet on demand continuing.

Ecosystem read. Every major frontier lab now has a hardware story for agents. NVIDIA (Vera) → Anthropic/OpenAI/SpaceXAI/Oracle. Google (TPU plus the Blackstone JV, 500 MW by 2027) → multi-cloud third-party access. Alibaba (Zhenwu M890 + Bailian + Panjiu AL128) → Chinese enterprise market plus the 20-industry customer base. The agent-workload market is large enough that vertically-integrated silicon stacks make business sense. For China specifically, Alibaba's Zhenwu line plus the Huawei Ascend track plus SMIC fabrication capacity is the domestic-silicon answer to the stalled H200 deal we covered May 19: 750K H200 GPUs licensed to Chinese buyers, zero shipped, Beijing-side block. Alibaba does not need NVIDIA if Zhenwu V900 lands in Q3 2027 as promised. For US and EU builders considering agentic infrastructure, the closed-source proprietary silicon stacks are converging on the Vera/Zhenwu/TPU pattern. Open-stack alternatives (AMD MI400, Intel Gaudi 3, ARM-based custom) lag on agent-workload-specific optimization for now.

Monday: if you're capacity-planning agent infrastructure, the relevant question is not "what FLOPs?" but "what does the rack look like and what does it cost to run 35-hour agentic workloads?" The Panjiu AL128 hints at the answer: 128-accelerator rack-level coordination is the deployment unit. For builders with Chinese end-users, Alibaba Cloud Bailian plus Zhenwu M890 is now a real production option, not a pilot. For US and EU builders, watch NVIDIA's next earnings call: with H200 China revenue effectively zero and Vera shipping to top labs, NVIDIA's pricing flexibility on Vera versus Zhenwu M890 will tell you whether NVIDIA competes on agent-silicon price or differentiates on ecosystem (CUDA, NCCL, MCP integration, Anthropic/OpenAI customer references). The next 12 months are when "agent-targeted silicon" stops being a marketing claim and starts being a measurable benchmark line. Watch for an MLPerf or equivalent benchmark suite for sustained multi-hour agent workloads. That's the eval gap right now, and the vendor that wins the benchmark wins the procurement cycle.
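To make the capacity-planning question concrete, here is a minimal back-of-envelope sketch in Python. It takes two figures from the announcement (128 accelerators per Panjiu AL128 rack, a 35-hour continuous run) and multiplies them into accelerator-hours. The `RackRun` class and the `price_per_accel_hour` value are hypothetical placeholders: no Zhenwu M890 or Bailian pricing is disclosed, so substitute a real quote before reading anything into the cost figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackRun:
    """One sustained agent run on a full rack (illustrative model only)."""

    accelerators_per_rack: int   # 128 for the Panjiu AL128, per the announcement
    run_hours: float             # 35 is the claimed Qwen 3.7-Max continuous-run figure
    price_per_accel_hour: float  # HYPOTHETICAL: no pricing is disclosed

    def accelerator_hours(self) -> float:
        # Total accelerator time consumed if the whole rack is held for the run.
        return self.accelerators_per_rack * self.run_hours

    def cost(self) -> float:
        # Cost in whatever currency unit price_per_accel_hour is quoted in.
        return self.accelerator_hours() * self.price_per_accel_hour


# Placeholder price of 1.0 per accelerator-hour, so cost equals accelerator-hours.
run = RackRun(accelerators_per_rack=128, run_hours=35.0, price_per_accel_hour=1.0)
print(run.accelerator_hours())  # 128 * 35 = 4480 accelerator-hours
print(run.cost())
```

The point of the sketch is the unit, not the number: a single 35-hour run that holds a full rack consumes 4,480 accelerator-hours, so per-accelerator-hour pricing and rack utilization matter more to the bill than peak FLOPs.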