Training

Data Augmentation

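Techniques that artificially enlarge a training dataset by creating modified versions of existing examples. For images: flipping, rotation, cropping, color shifts. For text: paraphrasing, back-translation, synonym replacement. For audio: speed changes, noise injection. The goal is to teach the model invariance: a cat is still a cat whether the image is flipped, darkened, or cropped.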

Why It Matters

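Data augmentation is the cheapest way to improve model performance when data is limited. By showing the model multiple variants of each example, it reduces overfitting and teaches the model to focus on essential features rather than surface details. In computer vision, augmentation routinely delivers a 2–5% accuracy gain essentially for free.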

Deep Dive

The key principle: augmentations must preserve the label. Flipping a cat image horizontally still shows a cat (valid augmentation). Flipping a "turn left" sign makes it a "turn right" sign (invalid augmentation). Choosing appropriate augmentations requires understanding what invariances matter for your task.
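The label-preservation rule can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the image is a plain list of rows, and the names `FLIP_UNSAFE`, `hflip`, and `augment_with_flip` are hypothetical, chosen only for this example. The idea is to skip the flip for any class whose meaning would change under mirroring.

```python
import random

# Hypothetical set of classes whose meaning changes under a horizontal flip.
FLIP_UNSAFE = {"turn_left", "turn_right"}

def hflip(image):
    """Return a left-to-right mirrored copy of a 2D image (list of rows)."""
    return [row[::-1] for row in image]

def augment_with_flip(image, label, rng, p=0.5):
    """Flip with probability p, but never for classes where flipping breaks the label."""
    if label in FLIP_UNSAFE or rng.random() >= p:
        return image, label
    return hflip(image), label

rng = random.Random(0)
image = [[1, 2, 3],
         [4, 5, 6]]
cat_image, cat_label = augment_with_flip(image, "cat", rng, p=1.0)
sign_image, sign_label = augment_with_flip(image, "turn_left", rng, p=1.0)
```

Here the cat image is mirrored and keeps its label, while the "turn_left" image is returned unchanged. Which classes belong in the unsafe set depends entirely on the task.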

Modern Augmentation

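AutoAugment and its successors (RandAugment, TrivialAugment) replace hand-designed augmentation policies: AutoAugment searches for a policy, while RandAugment and TrivialAugment sample operations at random with few or no tunable hyperparameters. Cutout masks a random patch of an image; CutMix pastes a patch from a second image and mixes the two labels in proportion to the patch area. MixUp interpolates between pairs of examples and their labels, creating synthetic training points that smooth decision boundaries. These techniques are now standard in vision training pipelines.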
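As a rough sketch of two of these ideas, the following standard-library Python shows MixUp on flat feature vectors with one-hot labels, and a simplified Cutout on a 2D image. The function names and the default `alpha=0.2` are assumptions made for illustration. The Cutout variant here keeps the patch fully inside the image, which is a simplification; real pipelines typically use a tensor library rather than nested lists.

```python
import random

def mixup(x1, y1, x2, y2, rng, alpha=0.2):
    """Blend two examples and their one-hot labels with lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

def cutout(image, size, rng, fill=0):
    """Return a copy of a 2D image with one random size x size patch set to fill.

    Assumes size is no larger than the image height or width.
    """
    h, w = len(image), len(image[0])
    top = rng.randrange(h - size + 1)
    left = rng.randrange(w - size + 1)
    out = [row[:] for row in image]  # copy so the input is not modified
    for r in range(top, top + size):
        for c in range(left, left + size):
            out[r][c] = fill
    return out

rng = random.Random(42)
mixed_x, mixed_y, lam = mixup([1.0, 0.0], [1, 0], [0.0, 1.0], [0, 1], rng)
masked = cutout([[1] * 4 for _ in range(4)], size=2, rng=rng)
```

The mixed label is a soft target rather than a single class, so the loss function has to accept probability vectors (for example, cross-entropy against soft labels).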

AI-Powered Augmentation

With generative models, augmentation goes beyond geometric transforms. You can use LLMs to paraphrase text training data, use diffusion models to generate variant images, or use models to create entirely new training examples (synthetic data). The line between "augmentation" (modifying existing examples) and "synthetic data" (generating new examples) is blurring, and both are becoming essential parts of modern training pipelines.
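A model-based text augmentation loop might look like the sketch below. The paraphraser is passed in as a plain callable, so no particular LLM API is assumed. `toy_paraphrase` is a hypothetical placeholder that only changes letter case; in practice it would be replaced by a call to a real model. The function name `augment_text_dataset` and its parameters are likewise illustrative.

```python
def augment_text_dataset(examples, paraphrase, n_variants=2):
    """Expand (text, label) pairs with paraphrases that inherit the original label.

    paraphrase is any callable (text, n) -> list of strings; empty and
    duplicate variants are dropped, and every original example is kept.
    """
    out = []
    for text, label in examples:
        out.append((text, label))
        seen = {text}
        for variant in paraphrase(text, n_variants):
            variant = variant.strip()
            if variant and variant not in seen:
                seen.add(variant)
                out.append((variant, label))
    return out

def toy_paraphrase(text, n):
    """Placeholder for a real model call; returns trivial case variants."""
    return [text.lower(), text.upper(), text][:n]

augmented = augment_text_dataset([("Great movie", "pos")], toy_paraphrase, n_variants=3)
```

Copying the label onto each variant assumes the paraphrase preserves meaning. With a real generative model that assumption should be spot-checked, since a paraphrase that flips sentiment or intent would inject label noise.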
