Step R1-V Mini
StepFun
🧠 Language Model
StepFun's compact vision reasoning model — multimodal input with reasoning capabilities at lower cost.
Specifications
Context Window: 32,000 tokens
Max Output: 8,192 tokens
Speed: Average
Modalities: Input: text, image · Output: text
Features: Vision: Yes · Tools: No · Streaming: Yes
Pricing
Included with plan