
Inference

The process of running a trained model to generate outputs. Training is learning; inference is using what was learned. Every time you send a prompt to Claude or generate an image with Stable Diffusion, that's inference. It's what costs providers GPU hours and what you pay for per token.
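The training/inference split can be sketched in plain Python with a toy model (a hypothetical example, not any real provider's code): training fits a parameter once, and inference just applies that frozen parameter to new inputs, with no further learning.

```python
# Toy illustration: training produces fixed weights;
# inference applies them to new inputs with no weight updates.

def train(examples):
    """'Learn' a slope from (x, y) pairs via least squares (training)."""
    n = len(examples)
    sx = sum(x for x, _ in examples)
    sy = sum(y for _, y in examples)
    sxy = sum(x * y for x, y in examples)
    sxx = sum(x * x for x, _ in examples)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return slope  # the "trained model" is just this parameter

def infer(slope, x):
    """Inference: run the frozen model on a new input."""
    return slope * x

model = train([(1, 2), (2, 4), (3, 6)])  # learns slope = 2.0
print(infer(model, 10))                  # → 20.0
```

Real systems swap the single slope for billions of parameters, but the shape is the same: training is expensive and happens once; inference is the cheap-per-call step that runs on every request.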

Why it matters

Inference cost and speed determine the economics of AI products. Faster inference = lower latency = better UX. Cheaper inference = lower prices = wider adoption. The entire quantization and optimization industry exists to make inference more efficient.
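Per-token pricing makes the economics easy to estimate. A back-of-envelope sketch (the prices below are illustrative assumptions, not any provider's real figures):

```python
# Back-of-envelope inference economics. The per-million-token prices
# here are made-up defaults for illustration only.

def cost_per_request(input_tokens, output_tokens,
                     price_in_per_mtok=3.00, price_out_per_mtok=15.00):
    """Dollar cost of one request under per-token pricing."""
    return (input_tokens * price_in_per_mtok
            + output_tokens * price_out_per_mtok) / 1_000_000

# A 1,000-token prompt producing a 500-token reply:
c = cost_per_request(1_000, 500)
print(f"${c:.4f} per request")  # → $0.0105 per request
```

Output tokens typically cost several times more than input tokens because generation is sequential (one forward pass per token), while the prompt can be processed in parallel; that asymmetry is exactly what quantization and other optimizations target.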
