Step R1-V Mini

StepFun · Language Model

StepFun's compact vision reasoning model: it accepts text and image input and applies reasoning capabilities at lower cost.

Specifications

Context Window: 32,000 tokens
Max Output: 8,192 tokens
Speed: Average
Modalities: Input: text, image  ·  Output: text
Features: Vision: Yes  ·  Tools: No  ·  Streaming: Yes
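
As a rough illustration of how these limits interact, the sketch below records the specifications above in a small Python structure and estimates how many prompt tokens remain once output space is reserved. It assumes the 32,000-token context window is shared between input and output, which this page does not state. The names `ModelSpec`, `STEP_R1_V_MINI`, and `max_prompt_tokens` are hypothetical and are not part of any StepFun or Zubnet API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical container for the figures listed under Specifications."""
    context_window: int        # total tokens the model can attend to
    max_output: int            # largest completion the model will produce
    input_modalities: tuple    # accepted input types
    supports_tools: bool
    supports_streaming: bool


# Values copied from the specification list above.
STEP_R1_V_MINI = ModelSpec(
    context_window=32_000,
    max_output=8_192,
    input_modalities=("text", "image"),
    supports_tools=False,
    supports_streaming=True,
)


def max_prompt_tokens(spec: ModelSpec, requested_output: int) -> int:
    """Return the prompt budget left after reserving `requested_output` tokens.

    ASSUMPTION: input and output share one context window. If the provider
    counts them separately, the full window is available for the prompt.
    """
    if not 1 <= requested_output <= spec.max_output:
        raise ValueError(
            f"requested_output must be between 1 and {spec.max_output}"
        )
    return spec.context_window - requested_output


if __name__ == "__main__":
    # Reserving the full 8,192-token output leaves 23,808 tokens for the prompt.
    print(max_prompt_tokens(STEP_R1_V_MINI, 8_192))
```

Under that assumption, a request that reserves the maximum output leaves 23,808 tokens for text and image input combined. How image inputs are counted in tokens is not specified here, so the budget for images is left out of the sketch.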

Pricing

Included with plan

Capabilities

reasoning