Grab built a production multi-agent system for engineering support on its Analytics Data Warehouse platform, deployed to 1,000+ internal users managing 15,000+ tables. The architecture: a supervisor agent controls communication and task delegation; specialized agents handle context retrieval, code search, and solution generation. There are two primary workflows: investigation (query analysis, SQL debugging, log retrieval, schema lookup) and enhancement (code fixes, automated merge requests). The system is built on LangGraph for the workflow engine and FastAPI services for routing, tool execution, and state management. The tool ecosystem was consolidated from 30+ tools to a curated set: controlled SQL execution, metadata access, log retrieval, Git-based workflows. The specific models used were not disclosed. Reported impact: "hundreds of engineering hours reclaimed each month" per the Head of Analytics, plus a shift from reactive firefighting to platform development work.
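
To make the supervisor pattern concrete, here is a minimal sketch in plain Python. It does not use LangGraph's API and is not Grab's code; the agent functions, the keyword-based `classify` rule, and the `SupportState` fields are all hypothetical stand-ins. It only illustrates the shape described above: one supervisor picks a workflow, then delegates to narrow agents in order.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SupportState:
    """Shared state passed from the supervisor through each specialized agent."""
    request: str
    workflow: str = ""
    notes: List[str] = field(default_factory=list)
    needs_human_review: bool = False


# Hypothetical specialized agents: each does one narrow job and records a note.
def retrieve_context(state: SupportState) -> SupportState:
    state.notes.append("context: schema and logs fetched")
    return state


def search_code(state: SupportState) -> SupportState:
    state.notes.append("code: candidate files located")
    return state


def generate_solution(state: SupportState) -> SupportState:
    state.notes.append("solution: proposal drafted")
    # Anything that would change code is flagged for human review.
    state.needs_human_review = state.workflow == "enhancement"
    return state


Agent = Callable[[SupportState], SupportState]

# Each workflow is an ordered list of agents the supervisor delegates to.
WORKFLOWS: Dict[str, List[Agent]] = {
    "investigation": [retrieve_context, generate_solution],
    "enhancement": [retrieve_context, search_code, generate_solution],
}

# Assumption: a crude keyword rule stands in for a model-based router.
ENHANCEMENT_HINTS = ("fix", "patch", "merge request", "change")


def classify(request: str) -> str:
    text = request.lower()
    if any(hint in text for hint in ENHANCEMENT_HINTS):
        return "enhancement"
    return "investigation"


def supervisor(request: str) -> SupportState:
    """Pick a workflow, then run its agents in sequence over shared state."""
    state = SupportState(request=request)
    state.workflow = classify(request)
    for agent in WORKFLOWS[state.workflow]:
        state = agent(state)
    return state
```

Calling `supervisor("Please fix the broken join in the orders pipeline")` routes to the enhancement workflow and sets `needs_human_review`, while a question about a slow query stays in investigation. In a real system the router and agents would be model calls, and the workflow lists would be graph edges.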

The architectural choices are the part worth studying. First, **constrained agent responsibilities**: each specialized agent has a narrow scope to reduce reasoning ambiguity. This is the same instinct as the proposal-execution split in this morning's agent security piece: limit what the agent can decide so the gates can verify what it does. Second, **human-in-the-loop on all code changes**: no agent writes to production without review. Third, **SQL execution validation layers with sensitive-data protection**: the agent doesn't run arbitrary SQL; it runs SQL through a controlled-execution gate that validates the query before running it and scrubs sensitive data. Fourth, **structured context compression for multi-step reasoning within token limits**: the long-context problem (with 15K tables, schema lookups exhaust the context budget quickly) is handled with explicit compression, not by relying on the model to figure out what's relevant. The 30-tools-to-curated reduction is the operational lesson: tool sprawl makes the agent less reliable, not more capable. Curation is the work.
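
A controlled-execution gate of the kind described above might look roughly like the following sketch. Grab has not published its implementation, so everything here is an assumption made for illustration: the `SENSITIVE_COLUMNS` set, the `MAX_ROWS` cap, the keyword deny-list, and the use of SQLite as a stand-in for the warehouse.

```python
import re
import sqlite3
from typing import Any, Dict, List

SENSITIVE_COLUMNS = {"email", "phone"}  # hypothetical list of protected columns
MAX_ROWS = 100                          # hypothetical cap on returned rows
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|attach|pragma)\b",
    re.IGNORECASE,
)


class RejectedQuery(Exception):
    """Raised when a statement fails validation and must not be executed."""


def validate(sql: str) -> str:
    """Accept a single read-only SELECT statement; reject everything else."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise RejectedQuery("multiple statements are not allowed")
    if not re.match(r"select\b", statement, re.IGNORECASE):
        raise RejectedQuery("only SELECT statements are allowed")
    if FORBIDDEN.search(statement):
        raise RejectedQuery("statement contains a forbidden keyword")
    return statement


def run_gated(conn: sqlite3.Connection, sql: str) -> List[Dict[str, Any]]:
    """Validate first, execute second, redact sensitive columns last."""
    statement = validate(sql)
    cursor = conn.execute(statement)
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchmany(MAX_ROWS)
    return [
        {
            col: ("[REDACTED]" if col.lower() in SENSITIVE_COLUMNS else value)
            for col, value in zip(columns, row)
        }
        for row in rows
    ]
```

The keyword check is deliberately crude: it would reject a harmless string literal containing "update", and a query such as `SELECT email AS e` would slip past the name-based redaction. A production gate would more likely parse the SQL, resolve column lineage, and run under a read-only database role, so that the agent has no other path to the warehouse.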

Why this matters for builders. Production multi-agent deployments at this scale (1K users, 15K tables) with this level of detail (LangGraph + FastAPI stack named, human-in-the-loop named, tool consolidation named) are rare in public reporting. Most published agent case studies are demos or pilots. Grab's specifics tell you what production-grade actually looks like: the framework choice (LangGraph over AutoGen, CrewAI, or custom) is a real signal, because LangGraph's checkpointing and supervisor-pattern primitives have now been exercised in production for this use case. The Analytics Data Warehouse use case is also generalizable: anywhere you have a complex internal platform (data warehouse, internal API surface, infra automation) supporting many engineers with repetitive support load, the Grab pattern applies: supervisor agent, specialized retrievers, controlled execution gates, human-in-the-loop on writes.

Monday: if you're considering a multi-agent engineering-support deployment, the Grab pattern is a strong template. Concrete starting points. (1) Pick LangGraph for the orchestration layer if your team doesn't already have a strong opinion; the supervisor-pattern primitives map cleanly to investigation/enhancement workflow splits. (2) Audit your existing internal tool surface; the 30-to-curated reduction at Grab is the lesson: too many tools make the agent worse. Start by naming the 5 to 10 tools that cover 80% of support load and build from there. (3) Set up the controlled-execution gate before you let the agent write anything to production. SQL execution validation + sensitive-data scrubbing is the specific pattern Grab uses; the general pattern is policy-checked execution that's non-bypassable. (4) Plan human-in-the-loop on all code changes from day one; retrofitting it later is much harder than building it in. (5) Measure "engineering hours reclaimed" as your primary metric, not "tickets resolved" or model accuracy. The business case for the agent is reclaimed engineer time, and that's what Grab's Head of Analytics cited. The eval metric should match the business metric.
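
For the tool audit in step (2), one simple way to find the tools that cover 80% of support load is a frequency cut over historical tickets. The sketch below assumes you can label each past ticket with the one tool that mattered most in resolving it; the tool names and the `curate_tools` helper are hypothetical, not something Grab describes.

```python
from collections import Counter
from typing import List, Sequence


def curate_tools(ticket_tools: Sequence[str], target_coverage: float = 0.8) -> List[str]:
    """Return the most-used tools whose combined ticket share reaches the target.

    ticket_tools holds one entry per historical ticket: the name of the tool
    that resolved it (an assumed labeling, one primary tool per ticket).
    """
    if not ticket_tools:
        return []
    counts = Counter(ticket_tools)
    total = len(ticket_tools)
    selected: List[str] = []
    covered = 0
    # Walk tools from most to least used, stopping once coverage is reached.
    for tool, count in counts.most_common():
        if covered / total >= target_coverage:
            break
        selected.append(tool)
        covered += count
    return selected


# Hypothetical ticket log: 100 tickets labeled by primary tool.
tickets = (
    ["run_sql"] * 50
    + ["get_logs"] * 20
    + ["table_metadata"] * 15
    + ["git_blame"] * 8
    + ["lineage_graph"] * 7
)
shortlist = curate_tools(tickets)
```

With this made-up log, the shortlist is the first three tools, which together account for 85 of 100 tickets. The long tail is where the curation decision actually happens: each remaining tool has to justify the extra ambiguity it adds to the agent's choices.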