
Transfer Learning

Using knowledge learned on one task or dataset to improve performance on a different but related task. Instead of training from scratch every time, you start from a model that already understands general patterns (language structure, visual features) and adapt it to your specific needs. Pre-training followed by fine-tuning is the dominant paradigm of modern AI.

Why It Matters

Transfer learning is why AI became practical. Training a language model from scratch costs millions of dollars; fine-tuning a pre-trained model on your specific task costs tens of dollars and a few hours. These economics are why AI applications exploded: you don't need Google's budget to build something useful.

Deep Dive

The key insight: low-level features transfer across tasks. A vision model trained on ImageNet learns to detect edges, textures, and shapes in its early layers — features useful for almost any visual task. A language model trained on web text learns grammar, facts, and reasoning patterns useful for almost any language task. Transfer learning exploits this by reusing the general knowledge and only training the task-specific parts.
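
A minimal sketch of this reuse, assuming PyTorch and torchvision are available: the ImageNet-pre-trained backbone is frozen so its general edge and texture features are kept, and only a new task-specific head is trained. The 10-class output size and the hyperparameters are placeholders, not a definitive recipe.

import torch
import torch.nn as nn
from torchvision import models

# Load a ResNet-18 whose early layers already detect edges, textures, and shapes
# thanks to ImageNet pre-training.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze the general-purpose backbone so the transferred features stay fixed.
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head with one sized for the new task.
num_classes = 10  # placeholder for your task
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only the new head's parameters are updated during fine-tuning.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)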

The Pre-train + Fine-tune Paradigm

Almost every AI system today follows this pattern: (1) pre-train a large model on a massive, general dataset (expensive, done once), (2) fine-tune on a smaller, task-specific dataset (cheap, done many times). BERT pioneered this for NLP in 2018. GPT scaled it up. The entire LLM industry is built on this paradigm — foundation models are the pre-trained base, and fine-tuning (including RLHF/DPO) is how they become useful assistants.
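
As an illustration of step (2), here is a hedged sketch of fine-tuning a pre-trained BERT checkpoint on a small sentiment dataset, assuming the Hugging Face transformers and datasets libraries; the dataset choice, subset size, and hyperparameters are illustrative placeholders.

from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Step 1 was already done for us: "bert-base-uncased" was pre-trained on a
# massive general text corpus. We only download the result.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Step 2: fine-tune on a small, task-specific dataset (IMDB sentiment here).
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

train_data = (load_dataset("imdb")["train"]
              .shuffle(seed=42)
              .select(range(2000))          # small subset is enough to adapt the model
              .map(tokenize, batched=True))

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-bert", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=train_data,
)
trainer.train()  # hours on one GPU, not the compute budget of pre-training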

When Transfer Fails

Transfer learning works best when the source and target domains are related. A model pre-trained on English text transfers well to French (similar structure) but poorly to protein sequences (completely different domain). When domains are too different, transfer can actually hurt performance ("negative transfer"). Domain-specific pre-training (like BioGPT for biomedical text or CodeLlama for code) addresses this by pre-training on domain-relevant data.
