Google has shipped event-driven webhooks for the Gemini API, eliminating polling for long-running jobs across three endpoint families: Batch (succeeded / cancelled / expired / failed), Interactions (requires_action / completed / failed / cancelled), and Video Generation (video.generated). The implementation follows the Standard Webhooks spec — HMAC-SHA256 signing for static (project-level) webhooks, asymmetric JWT (RS256) via JWKS for dynamic (per-request) webhooks, with `webhook-signature`, `webhook-id`, and `webhook-timestamp` headers. Delivery is at-least-once, with retries over a 24-hour window using exponential backoff. Recommended replay-attack window: reject payloads older than five minutes.
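For the static (HMAC) case, verification follows the Standard Webhooks convention: the signed content is `{webhook-id}.{webhook-timestamp}.{body}`, and the `webhook-signature` header carries one or more space-separated `v1,<base64>` entries. A minimal sketch, assuming the shared secret arrives base64-encoded behind the spec's `whsec_` prefix (check Google's docs for the exact secret format):

```python
import base64
import hashlib
import hmac
import time

def verify_webhook(secret_b64: str, headers: dict, body: bytes,
                   tolerance_s: int = 300) -> bool:
    """Verify a Standard Webhooks HMAC-SHA256 signature and reject replays."""
    msg_id = headers["webhook-id"]
    timestamp = headers["webhook-timestamp"]
    signature_header = headers["webhook-signature"]

    # Replay protection: reject anything outside the recommended 5-minute window.
    if abs(time.time() - int(timestamp)) > tolerance_s:
        return False

    # Signed content per Standard Webhooks: "{id}.{timestamp}.{payload}".
    secret = base64.b64decode(secret_b64.removeprefix("whsec_"))
    signed_content = f"{msg_id}.{timestamp}.".encode() + body
    expected = base64.b64encode(
        hmac.new(secret, signed_content, hashlib.sha256).digest()
    ).decode()

    # The header may list several "v1,<sig>" entries (supports key rotation);
    # accept if any matches, using a constant-time comparison.
    for candidate in signature_header.split(" "):
        version, _, sig = candidate.partition(",")
        if version == "v1" and hmac.compare_digest(sig, expected):
            return True
    return False
```

The constant-time `hmac.compare_digest` matters here: a naive `==` comparison leaks timing information an attacker can use to forge signatures byte by byte.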

The two configuration modes matter. *Static* webhooks are configured at project level through a WebhookService API, fire for any matching event under that project, and use a shared secret for HMAC validation. *Dynamic* webhooks override at request time with a `webhook_config` payload and use asymmetric JWT signed by Google's keys (verifiable at `https://generativelanguage.googleapis.com/.well-known/jwks.json`). Dynamic mode also supports a `user_metadata` field for routing — the metadata round-trips back in the webhook payload, so a single endpoint can fan out by tenant, user, or workflow without storing job-ID state separately. Payloads are intentionally thin: status snapshots with pointers like `output_file_uri` (Batch) or `file_id` and `video_uri` (Video), not raw outputs. Receivers should respond `2xx` immediately and process asynchronously to avoid triggering retry cycles. The `webhook-id` header gives you the deduplication primitive.

For ecosystem context, this brings Gemini API up to parity-plus on event delivery. OpenAI's Batch API and Anthropic's Message Batches both run async with status that you currently *poll* for — a 24-hour batch run means polling on a schedule and burning request budget waiting for completion. Webhooks invert that: your application sleeps until Google notifies you. For video generation specifically (Veo-style workloads can run tens of seconds to minutes per request), polling is even more wasteful. The static-vs-dynamic split is well-thought-out: static for "I have one team running batch jobs, give me one secret to verify all of them," dynamic for "I'm a SaaS running multi-tenant Gemini workloads and I need each tenant's webhook to land at their isolated endpoint with their metadata attached." That's a useful capability-shape to copy if you're building anything similar.

Practical reads. If you're running Gemini Batch or video jobs and currently polling, switch — this saves API calls, reduces tail latency on completion detection, and is structurally cleaner. The 5-minute replay window plus `webhook-id` deduplication means your webhook handler needs an idempotency layer (most already have one). For multi-tenant builders, dynamic webhooks with `user_metadata` are the right pattern — don't try to map static webhooks across tenants by parsing job IDs. Compare your current OpenAI / Anthropic polling overhead and decide whether it's worth pushing those providers to ship Standard Webhooks coverage too. The signal isn't "webhooks are new"; webhooks are old. The signal is that AI APIs are finally getting Standard Webhooks treatment, which means agent-and-batch infrastructure can be built with the same operational primitives as the rest of your stack.