
Transfer Learning

Using knowledge learned on one task or dataset to improve performance on a different but related task. Instead of training from scratch every time, you start from a model that already understands general patterns (language structure, visual features) and adapt it to your specific needs. Pre-training followed by fine-tuning is the dominant paradigm in modern AI.

Why It Matters

Transfer learning is why AI became practical. Training a language model from scratch costs millions of dollars. Fine-tuning a pre-trained model on your specific task costs tens of dollars and a few hours. That economics is what made AI applications explode: you don't need Google's budget to build something useful.

Deep Dive

The key insight: low-level features transfer across tasks. A vision model trained on ImageNet learns to detect edges, textures, and shapes in its early layers — features useful for almost any visual task. A language model trained on web text learns grammar, facts, and reasoning patterns useful for almost any language task. Transfer learning exploits this by reusing the general knowledge and only training the task-specific parts.
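The sketch below illustrates the idea in PyTorch (an assumption; the wiki does not prescribe a framework): load an ImageNet-pretrained ResNet-18 from torchvision, freeze the general-purpose backbone, and train only a new task-specific head. The class count and training data are placeholders.

```python
# A sketch of feature reuse: freeze the pretrained backbone, train a new head.
# Assumes torchvision >= 0.13 for the `weights=` argument; `num_classes` is a placeholder.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 10  # placeholder: number of classes in the target task

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze the general-purpose features (edges, textures, shapes) learned on ImageNet
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with a task-specific head; only its weights are trained
model.fc = nn.Linear(model.fc.in_features, num_classes)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    # images, labels come from the target-task dataloader
    logits = model(images)
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```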

The Pre-train + Fine-tune Paradigm

Almost every AI system today follows this pattern: (1) pre-train a large model on a massive, general dataset (expensive, done once), (2) fine-tune on a smaller, task-specific dataset (cheap, done many times). BERT pioneered this for NLP in 2018. GPT scaled it up. The entire LLM industry is built on this paradigm — foundation models are the pre-trained base, and fine-tuning (including RLHF/DPO) is how they become useful assistants.
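As a rough illustration of the fine-tune step (not BERT's exact recipe), assuming the Hugging Face transformers library: load pre-trained weights, attach a small classification head, and train at a low learning rate. The model name, label count, and two-example batch are purely illustrative.

```python
# A sketch of fine-tuning a pre-trained encoder for classification.
# "bert-base-uncased" and num_labels=2 are illustrative choices, not a prescription.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # pre-trained encoder + a new classification head
)

# Fine-tuning uses a much smaller learning rate than pre-training
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

texts = ["great movie", "terrible movie"]  # tiny placeholder dataset
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, return_tensors="pt")
optimizer.zero_grad()
outputs = model(**batch, labels=labels)  # forward pass returns the loss
outputs.loss.backward()
optimizer.step()
```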

When Transfer Fails

Transfer learning works best when the source and target domains are related. A model pre-trained on English text transfers well to French (similar structure) but poorly to protein sequences (completely different domain). When domains are too different, transfer can actually hurt performance ("negative transfer"). Domain-specific pre-training (like BioGPT for biomedical text or CodeLlama for code) addresses this by pre-training on domain-relevant data.
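A minimal sketch of that idea, continued (domain-adaptive) pre-training with a masked-LM objective on in-domain text before task fine-tuning, assuming the Hugging Face transformers and datasets libraries; the two-sentence corpus below is a placeholder.

```python
# A rough sketch of domain-adaptive pre-training: keep training the masked-LM
# objective on in-domain text so the model adapts before task fine-tuning.
from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Placeholder in-domain corpus (e.g. biomedical text)
domain_corpus = Dataset.from_dict({"text": [
    "The patient presented with acute myocardial infarction.",
    "Dosage was titrated to 20 mg twice daily.",
]})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

train_data = domain_corpus.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dapt-checkpoint", num_train_epochs=1),
    train_dataset=train_data,
    # Randomly mask 15% of tokens so the model keeps learning the domain's language
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()
```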
