Zubnet AI Learning Wiki › Sampling
Basics

Sampling

Decoding Strategy, Top-p, Top-k
The process of selecting the next token to generate from the model's predicted probability distribution. Greedy decoding always picks the most likely token; random sampling picks tokens in proportion to their probability. Temperature, top-p (nucleus), and top-k are controls that adjust the randomness and diversity of the selection. The sampling strategy strongly affects output quality, creativity, and consistency.

Why It Matters

Sampling parameters are the most accessible knobs for controlling LLM behavior. Temperature 0 is used for deterministic code generation; temperature 0.7 for creative writing; top-p 0.9 is a good balance. These are not magic numbers: they directly control which tokens the model considers at each step. Understanding sampling helps you tune output for your specific use case.
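To see what the temperature knob actually does, here is a toy softmax over three logits, using only the Python standard library (a minimal sketch, not any library's implementation):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities, dividing by temperature first.

    temperature < 1 sharpens the distribution (top token dominates);
    temperature > 1 flattens it (more diversity). Must be > 0.
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                       # toy logits for three tokens
low = softmax_with_temperature(logits, 0.5)    # sharper distribution
high = softmax_with_temperature(logits, 2.0)   # flatter distribution
```

Running this shows the top token's probability rising as temperature falls, which is why low temperatures behave almost greedily.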

Deep Dive

The sampling pipeline: (1) the model produces logits for all vocabulary tokens, (2) temperature scaling divides the logits by T, (3) top-k filtering keeps only the k highest logits (setting the rest to −∞), (4) top-p filtering keeps the smallest set of highest-probability tokens whose cumulative probability exceeds p (the probabilities come from a softmax over the scaled logits), (5) softmax converts the surviving logits to a renormalized probability distribution, (6) a token is randomly sampled from this distribution. Steps 3 and 4 are optional and can be combined.
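The steps above can be sketched in plain Python (a minimal illustration over a list of logits; real decoders operate on tensors over the full vocabulary):

```python
import math
import random

def sample_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Sample one token index from raw logits.

    temperature must be > 0; use top_k=1 for greedy decoding.
    """
    # Step 2: temperature scaling.
    scaled = [l / temperature for l in logits]
    # Sort token indices by scaled logit, highest first.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    # Step 3: top-k filtering keeps only the k highest logits.
    if top_k is not None:
        order = order[:top_k]
    # Softmax over the surviving tokens (dropped tokens act as -inf).
    m = scaled[order[0]]
    exps = [math.exp(scaled[i] - m) for i in order]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Step 4: top-p keeps the smallest prefix whose cumulative probability
    # reaches p; renormalizing afterwards is equivalent to re-running
    # softmax on the kept logits.
    if top_p is not None:
        cum, cutoff = 0.0, len(probs)
        for j, p in enumerate(probs):
            cum += p
            if cum >= top_p:
                cutoff = j + 1
                break
        order, probs = order[:cutoff], probs[:cutoff]
        norm = sum(probs)
        probs = [p / norm for p in probs]
    # Step 6: sample from the filtered distribution.
    r = rng.random()
    cum = 0.0
    for i, p in zip(order, probs):
        cum += p
        if r <= cum:
            return i
    return order[-1]  # guard against floating-point rounding
```

With `top_k=1` this reduces to greedy decoding; with aggressive `top_p` the tail of unlikely tokens is cut off before sampling.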

Choosing Parameters

For factual/code tasks: temperature 0 (or very low), no top-p/top-k. You want the most likely tokens. For creative writing: temperature 0.7–1.0, top-p 0.9–0.95. You want diversity without incoherence. For brainstorming: temperature 1.0+, wider top-p. You want surprising, unexpected connections. The key insight: there's no universal best setting. Different tasks need different sampling strategies, and the optimal parameters also vary by model.
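These guidelines could be captured as a small table of starting points (the task names and exact values here are illustrative defaults for experimentation, not official settings of any API):

```python
# Illustrative starting points; tune per model and task.
SAMPLING_PRESETS = {
    "code":       {"temperature": 0.0},                 # deterministic
    "chat":       {"temperature": 0.7, "top_p": 0.9},   # balanced
    "creative":   {"temperature": 0.9, "top_p": 0.95},  # diverse but coherent
    "brainstorm": {"temperature": 1.2, "top_p": 0.98},  # exploratory
}
```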

Beyond Simple Sampling

Advanced strategies include: beam search (maintain multiple candidate sequences and pick the overall best; good for translation, less useful for open-ended generation), contrastive decoding (boost tokens where a large model outperforms a small model), and min-p sampling (a dynamic threshold that keeps tokens with probability above a fraction of the top token's probability). These techniques address specific failure modes of simple sampling, such as repetition loops or degenerate outputs.
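The min-p idea is simple enough to sketch directly (an illustration of the dynamic threshold, not a reference implementation):

```python
def min_p_filter(probs, min_p=0.1):
    """Keep tokens whose probability is at least min_p * max(probs).

    Unlike a fixed top-k, the kept set shrinks when the model is confident
    (one dominant token) and grows when the distribution is flat.
    Returns (index, renormalized probability) pairs.
    """
    threshold = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= threshold]
    total = sum(p for _, p in kept)
    return [(i, p / total) for i, p in kept]
```

For example, with `probs = [0.7, 0.2, 0.05, 0.05]` and `min_p=0.1`, the threshold is 0.07, so only the first two tokens survive and are renormalized.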
