
Zero-shot / Few-shot

Also known as: In-context Learning
Zero-shot means asking a model to do a task with no examples — just the instruction. Few-shot means providing a handful of input-output examples in the prompt before the actual request. "Here are 3 examples of how to format this data... now do this one." The model learns the pattern from context alone, no training required.

Why it matters

Few-shot prompting is the fastest way to teach a model a new format or behavior. Need consistent JSON output? Show it three examples. Need a specific writing style? Give it samples. It's free, instant, and surprisingly powerful.

Deep Dive

The terms "zero-shot" and "few-shot" come from the machine learning research tradition, where "shot" refers to a training example. In classical ML, you needed thousands or millions of labeled examples to teach a model a new task. The revelation with large language models was that they could perform tasks with zero training examples (zero-shot) or just a handful of demonstrations in the prompt (few-shot). This is called "in-context learning," and it remains one of the most remarkable capabilities of modern LLMs — the model is not being retrained or fine-tuned when you give it examples in the prompt. It is recognizing patterns in its context and applying them on the fly.

When Zero-Shot Works

Zero-shot works best when the task maps cleanly onto something the model has seen extensively in training. Sentiment analysis, translation, summarization, simple classification — these are tasks the model has encountered in millions of variations during pre-training, so a clear instruction is often enough. "Classify this customer review as positive, negative, or neutral" will work zero-shot on any modern frontier model because the model deeply understands what classification, sentiment, and those labels mean. Where zero-shot falls apart is on tasks with unusual formats, domain-specific conventions, or ambiguous requirements. If you need the model to output data in your company's proprietary XML schema, a bare instruction is not going to cut it.
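A zero-shot prompt is nothing more than the instruction plus the input. A minimal sketch in Python (the helper name and prompt wording are illustrative, and the actual model call is omitted):

```python
# Zero-shot: the instruction alone, with no demonstration examples.
# This only builds the prompt string; sending it to a model is up to you.

def zero_shot_prompt(review: str) -> str:
    return (
        "Classify this customer review as positive, negative, or neutral.\n"
        "Respond with the label only.\n\n"
        f"Review: {review}\n"
        "Label:"
    )

prompt = zero_shot_prompt("The battery died after two days.")
```

Because the task and labels are things the model already understands, this bare instruction is usually enough.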

Few-shot prompting fills that gap. By providing 2–5 input-output examples before the actual request, you show the model exactly what you expect. The model picks up on the pattern — the format, the level of detail, the style, the edge case handling — and applies it to the new input. This is remarkably powerful for structured tasks. Need to extract entities from messy text into a specific JSON format? Show three examples of messy text mapped to clean JSON, then give it the new text. Need to convert natural-language dates ("next Tuesday," "the second week of March") into ISO 8601? Three examples will get you 95% of the way there. The model is essentially learning a function from your examples, and it is doing it at inference time with no gradient updates.
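The mechanics are simple: demonstration pairs go before the real input, all in the same format. A sketch of a few-shot prompt builder, where the example pairs and the JSON schema are invented for illustration:

```python
# Few-shot: input-output demonstrations precede the actual request.
# The model infers the extraction pattern from the pairs below.

EXAMPLES = [
    ("Meet Dana at 3pm in Berlin.",
     '{"person": "Dana", "time": "15:00", "city": "Berlin"}'),
    ("Call Luis tomorrow morning, he is in Lima.",
     '{"person": "Luis", "time": null, "city": "Lima"}'),
    ("Lunch with Priya.",
     '{"person": "Priya", "time": null, "city": null}'),
]

def few_shot_prompt(new_input: str) -> str:
    parts = ["Extract entities from the text as JSON."]
    for text, output in EXAMPLES:
        parts.append(f"Text: {text}\nJSON: {output}")
    # The real input ends with an open "JSON:" so the model completes it.
    parts.append(f"Text: {new_input}\nJSON:")
    return "\n\n".join(parts)
```

Note that the second and third examples deliberately show what a missing field looks like, so the model does not invent times or cities that are not in the text.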

Quality Over Quantity

The quality of your few-shot examples matters more than the quantity. Three carefully chosen examples that cover different edge cases will outperform ten repetitive ones. If your task involves categories, include at least one example per category. If there are tricky boundary cases, include one. And the order of examples can matter — research has shown that models can be biased toward the label of the last example they see, so shuffling or balancing your examples is worth doing. One practical tip: include an example of what the model should do when the input is ambiguous or does not fit any category, because that edge case comes up constantly in production and an unguided model will just guess.
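Put together, a well-designed example set covers every label once, includes an explicit fallback for ambiguous input, and is shuffled to avoid ordering bias. A sketch, with invented examples and a hypothetical "unknown" fallback label:

```python
import random

# One example per label, plus an explicit fallback for ambiguous input.
# Shuffling counters the bias toward the label of the last example seen.

EXAMPLES = [
    ("Great product, works perfectly.", "positive"),
    ("Broke on the first day.", "negative"),
    ("It arrived on time.", "neutral"),
    ("asdf 12345", "unknown"),  # ambiguous input -> fallback, not a guess
]

def build_examples(seed: int = 0) -> list[tuple[str, str]]:
    shuffled = EXAMPLES[:]
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    return shuffled
```

The fixed seed keeps prompts reproducible across runs while still breaking any fixed label order.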

The Cost Trade-Off

There is a cost-quality trade-off to consider. Each few-shot example consumes tokens from your context window and adds to your API costs. Five examples at 200 tokens each add 1,000 tokens to every request, which adds up at scale. Some teams start with few-shot prompting during development, measure which examples are actually improving results, and then try to distill the pattern into a clearer zero-shot instruction. Others use dynamic few-shot selection — storing a library of examples in a database and retrieving the most relevant ones for each specific input, which is essentially a lightweight form of RAG applied to prompt engineering. The sweet spot depends on your task complexity, your volume, and whether consistency or cost matters more for your use case.
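Dynamic few-shot selection can be sketched with a simple relevance score. Real systems typically use embedding similarity; plain word overlap (Jaccard similarity) stands in for it here to keep the sketch dependency-free, and the example library is invented:

```python
import re

# Hypothetical example library: (input, output) pairs stored for retrieval.
LIBRARY = [
    ("Refund my order, it arrived broken.", "category: refund"),
    ("How do I reset my password?", "category: account"),
    ("The app crashes when I upload a photo.", "category: bug"),
    ("Can I change my shipping address?", "category: shipping"),
]

def tokens(s: str) -> set[str]:
    return set(re.findall(r"[a-z']+", s.lower()))

def overlap(a: str, b: str) -> float:
    # Jaccard similarity over word sets; a stand-in for embedding similarity.
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / max(len(ta | tb), 1)

def select_examples(query: str, library: list, k: int = 2) -> list:
    # Keep the k stored examples most similar to the incoming input.
    return sorted(library, key=lambda ex: overlap(query, ex[0]), reverse=True)[:k]
```

Only the selected examples go into the prompt, so each request pays for the demonstrations most likely to help it.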
