Training

Supervised Learning

A training method in which a model learns from labeled examples: input-output pairs where the correct answer is already provided. "Here is a picture of a cat, labeled 'cat.' Here is a picture of a dog, labeled 'dog.'" The model adjusts its parameters to minimize the difference between its predictions and the known correct answers.
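The input-output pairing can be sketched concretely. This is a toy illustration with hypothetical features and a hand-written stand-in for a trained model, just to show the shape of supervised data:

```python
# Hypothetical toy dataset: supervised data is just (input, correct answer) pairs.
dataset = [
    ({"whiskers": 1.0, "barks": 0.0}, "cat"),
    ({"whiskers": 0.9, "barks": 0.1}, "cat"),
    ({"whiskers": 0.1, "barks": 1.0}, "dog"),
    ({"whiskers": 0.0, "barks": 0.9}, "dog"),
]

def predict(features):
    # A hand-written stand-in for a trained model: pick the stronger signal.
    return "cat" if features["whiskers"] > features["barks"] else "dog"

# Training would adjust the model until its predictions match the labels.
errors = sum(predict(x) != y for x, y in dataset)
print(f"misclassified: {errors} of {len(dataset)}")  # → misclassified: 0 of 4
```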

Why It Matters

Supervised learning is the most intuitive form of machine learning, and it remains the workhorse behind most real-world applications: spam filters, medical image analysis, fraud detection, and the fine-tuning stage of LLMs. When you have labeled data and a clear objective, supervised learning is usually your starting point.

Deep Dive

The core loop of supervised learning is: make a prediction, compare it to the label, compute a loss (how wrong you were), and adjust parameters to reduce that loss. This cycle repeats millions or billions of times during training. The math behind the adjustment is gradient descent — computing how much each parameter contributed to the error and nudging it in the direction that reduces the error.
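The predict → loss → gradient → update loop can be shown end to end with a single parameter and no libraries. A minimal sketch, fitting y = w·x to toy data with squared-error loss (the data and learning rate are illustrative choices, not from the original):

```python
# Toy data following the true relation y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = 0.0    # the single parameter we adjust
lr = 0.05  # learning rate: step size for each update

for step in range(200):
    # loss = mean (w*x - y)^2 ; its derivative w.r.t. w is mean 2*(w*x - y)*x
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    # Nudge w in the direction that reduces the loss.
    w -= lr * grad

print(round(w, 3))  # converges toward 2.0
```

Real training does exactly this, but with billions of parameters, gradients computed by backpropagation, and mini-batches of data per step.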

It's Everywhere in LLMs

Pre-training an LLM is technically a form of self-supervised learning (the labels are generated from the data itself — the "label" for each position is just the next token in the text). But fine-tuning and RLHF both use supervised signals: human-written example responses, or human preference rankings between model outputs. When you fine-tune a model on customer support conversations, you're doing supervised learning with the support agent's responses as labels.
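The "labels generated from the data itself" idea is easy to see in code. A minimal sketch: for next-token prediction, inputs and labels are the same token sequence shifted by one position (the example sentence is illustrative):

```python
tokens = ["the", "cat", "sat", "on", "the", "mat"]

# The label at each position is simply the next token in the text.
inputs = tokens[:-1]
labels = tokens[1:]

for x, y in zip(inputs, labels):
    print(f"context ends with {x!r} -> label {y!r}")
```

No human annotation is needed: any raw text yields one training pair per position, which is what makes pre-training scale.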

The Data Bottleneck

The catch with supervised learning is that you need labeled data, and labels are expensive. Every medical image needs a radiologist to annotate it. Every support conversation needs a quality rating. This is why techniques like self-supervised learning (letting the model generate its own labels from unlabeled data) and semi-supervised learning (using a small labeled set to bootstrap labels for a larger unlabeled set) are so important — they reduce the labeling bottleneck that limits pure supervised approaches.
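The semi-supervised bootstrap is often done via pseudo-labeling: train on the small labeled set, then keep only the model's confident predictions on unlabeled data as new labels. A hedged sketch, where a simple threshold rule stands in for a real trained classifier and the confidence measure is a hypothetical choice:

```python
# Small labeled set: (feature, label) pairs. One illustrative feature in [0, 1].
labeled = [(0.1, "dog"), (0.2, "dog"), (0.8, "cat"), (0.9, "cat")]
unlabeled = [0.05, 0.15, 0.85, 0.95, 0.5]

# "Model" fit to the labeled set: a midpoint threshold on the feature.
threshold = 0.5

def predict_with_confidence(x):
    label = "cat" if x > threshold else "dog"
    # Crude confidence: scaled distance from the decision boundary.
    confidence = abs(x - threshold) * 2
    return label, confidence

# Keep only confident pseudo-labels; ambiguous points near the boundary are skipped.
pseudo_labeled = [
    (x, label)
    for x in unlabeled
    for label, conf in [predict_with_confidence(x)]
    if conf > 0.5
]
print(pseudo_labeled)  # the x=0.5 point is dropped as too ambiguous
```

The pseudo-labeled examples are then merged with the original labeled set and the model is retrained, expanding the effective training set without extra human annotation.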
