
Transfer Learning

Using knowledge learned on one task or dataset to improve performance on a different but related task. Instead of training from scratch every time, you start from a model that already understands general patterns (language structure, visual features) and adapt it to your specific need. Pre-training followed by fine-tuning is the dominant paradigm in modern AI.

Why It Matters

Transfer learning is the reason AI became practical. Training a language model from scratch costs millions of dollars. Fine-tuning a pre-trained model on your specific task costs tens of dollars and a few hours. That economics enabled the explosion of AI applications: you don't need Google's budget to build something useful.

Deep Dive

The key insight: low-level features transfer across tasks. A vision model trained on ImageNet learns to detect edges, textures, and shapes in its early layers — features useful for almost any visual task. A language model trained on web text learns grammar, facts, and reasoning patterns useful for almost any language task. Transfer learning exploits this by reusing the general knowledge and only training the task-specific parts.
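
To make the feature-reuse idea concrete, here is a minimal sketch in PyTorch with torchvision (an illustrative framework choice, not one prescribed by this article): the pre-trained ImageNet backbone is frozen and only a new task-specific head is trained. The 10-class output is a hypothetical placeholder.

import torch
import torch.nn as nn
from torchvision import models

# Load a ResNet-18 pre-trained on ImageNet; its early layers already
# detect edges, textures, and shapes.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the general-purpose backbone so its transferred features
# are reused as-is.
for param in model.parameters():
    param.requires_grad = False

# Replace only the task-specific head (10 classes is a placeholder).
model.fc = nn.Linear(model.fc.in_features, 10)

# Train just the new head: far cheaper than training the whole network.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)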

The Pre-train + Fine-tune Paradigm

Almost every AI system today follows this pattern: (1) pre-train a large model on a massive, general dataset (expensive, done once), (2) fine-tune on a smaller, task-specific dataset (cheap, done many times). BERT pioneered this for NLP in 2018. GPT scaled it up. The entire LLM industry is built on this paradigm — foundation models are the pre-trained base, and fine-tuning (including RLHF/DPO) is how they become useful assistants.
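
Below is a hedged sketch of step (2) using the Hugging Face transformers Trainer API, assuming a BERT-style classifier; the two-example inline dataset and the hyperparameters are placeholders purely for illustration.

import torch
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Step (1), pre-training, was done once at great expense by someone else;
# we simply download the result.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # e.g. binary sentiment
)

class TinyDataset(torch.utils.data.Dataset):
    """A stand-in for a real task-specific dataset."""
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

train_data = TinyDataset(["great movie", "terrible movie"], [1, 0])

# Step (2), fine-tuning: a few cheap passes over the small dataset.
args = TrainingArguments(output_dir="out", num_train_epochs=1,
                         per_device_train_batch_size=2)
Trainer(model=model, args=args, train_dataset=train_data).train()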

When Transfer Fails

Transfer learning works best when the source and target domains are related. A model pre-trained on English text transfers well to French (similar structure) but poorly to protein sequences (completely different domain). When domains are too different, transfer can actually hurt performance ("negative transfer"). Domain-specific pre-training (like BioGPT for biomedical text or CodeLlama for code) addresses this by pre-training on domain-relevant data.
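
As an illustrative sketch, choosing a domain-specific starting point over a general one is often just a matter of which pre-trained weights you load. The model IDs below are real Hugging Face Hub checkpoints, but treating them as drop-in choices for any given task is an assumption.

from transformers import AutoModelForCausalLM

# General web-text model: transfers well to everyday language tasks.
general = AutoModelForCausalLM.from_pretrained("gpt2")

# BioGPT, pre-trained on biomedical literature: a better starting point
# for biomedical text, where general web-text knowledge transfers poorly.
biomedical = AutoModelForCausalLM.from_pretrained("microsoft/biogpt")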
