
Induction Head

A specific two-attention-head circuit found in Transformers that implements in-context learning via pattern matching. If the model has seen the pattern "A B" earlier in the context and now encounters "A" again, the induction head predicts that "B" will follow. This simple mechanism is believed to be a basic building block of how LLMs learn from in-context examples.
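The "A B ... A → B" rule above can be sketched as a few lines of Python. This is a behavioral illustration only, not how a real model computes it; the function name is hypothetical.

```python
def induction_predict(context, query):
    """Predict the token after `query` by finding its most recent
    earlier occurrence in `context` and copying what followed it
    (the "A B ... A -> B" rule)."""
    for i in range(len(context) - 1, 0, -1):
        if context[i - 1] == query:
            return context[i]
    return None  # query never appeared earlier with a successor

# Having seen "A B" earlier, seeing "A" again predicts "B".
print(induction_predict(["A", "B", "C", "D"], "A"))  # B
```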

Why It Matters

Induction heads are among the best-understood circuits in mechanistic interpretability: a concrete example of how a Transformer implements a useful algorithm in its learned weights. They help explain why few-shot prompting works: when you provide examples, induction heads detect the pattern and apply it. Understanding induction heads provides a foundation for understanding more complex learned behaviors.

Deep Dive

The circuit uses two heads across two layers. The first head (a "previous token head" in an earlier layer) copies information about which token preceded the current one. The second head (the actual "induction head" in a later layer) uses this information to complete patterns: if token B was preceded by A earlier in the context, and A appears again, the induction head boosts the prediction of B. This is a simple but powerful form of in-context learning.
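The two-head composition described above can be illustrated with a toy decomposition. This is a schematic sketch with hypothetical function names, not real attention arithmetic: the first function stands in for the previous token head (each position stores its predecessor), and the second for the induction head (attend to positions whose stored predecessor matches the current token, then copy).

```python
def previous_token_head(tokens):
    # "Previous token head": annotate each position with the token
    # that preceded it (None at position 0).
    return [(tok, tokens[i - 1] if i > 0 else None)
            for i, tok in enumerate(tokens)]

def induction_head(annotated, query):
    # "Induction head": attend to positions whose stored predecessor
    # equals the current token, and copy the token found there
    # (most recent match wins).
    matches = [tok for tok, prev in annotated if prev == query]
    return matches[-1] if matches else None

tokens = ["The", "cat", "sat", ".", "The", "cat"]
annotated = previous_token_head(tokens)
# Earlier in context, "cat" was followed by "sat"; seeing "cat"
# again, the circuit boosts "sat".
print(induction_head(annotated, "cat"))  # sat
```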

Discovery and Verification

Olsson et al. (2022, Anthropic) identified induction heads through careful analysis of attention patterns in Transformers of various sizes. They observed a phase change during training: induction heads form suddenly, and their formation coincides with a dramatic improvement in the model's ability to do in-context learning. This suggests that induction heads are not just one of many circuits but a foundational capability that enables higher-level in-context learning.

Beyond Simple Patterns

Real-world in-context learning is more complex than "A B ... A → B." Models learn to generalize patterns: "capital of France is Paris, capital of Germany is Berlin, capital of Japan is..." requires understanding the abstract pattern, not just copying. Research suggests that more complex induction-like circuits build on the basic induction head mechanism, composing it with other circuits to handle abstraction and generalization.
