Sarvam Vision

Sarvam AI 🧠 Language Model

Sarvam's 3B multimodal vision model parses documents, extracts text, and understands visual content in 23 languages. It is designed for India's diverse document formats.

Specifications

Speed: Fast
Modalities: Input: text, image · Output: text
Features: Tools: No · Streaming: No

Pricing

Included with plan

Capabilities

vision