Command A Vision
Cohere
🧠 Language Model
Cohere’s first multimodal model capable of understanding and interpreting visual data alongside text.
Specifications
Context Window128,000 tokens
Max Output8,000 tokens
ModalitiesInput: text, image · Output: text
FeaturesVision: Yes Tools: No Streaming: Yes
Pricing
Included with plan