Zubnet AI學習Wiki › Classification
Basics

Classification

Classifier, Categorization
The task of assigning an input to one of a set of predefined categories. "Is this email spam or not?" (binary classification). "Is this image a cat, a dog, or a bird?" (multi-class). "Which of these tags apply to this article?" (multi-label). Classification is the most common supervised learning task and the foundation of countless real-world AI applications.

Why It Matters

Classification is where most people first encounter machine learning in practice: spam filtering, content moderation, medical diagnosis, fraud detection, sentiment analysis. Understanding classification helps you understand the whole supervised learning pipeline: labeled data goes in, a model is trained, predictions come out.

Deep Dive

A classifier outputs a probability distribution over classes. For binary classification, a single number between 0 and 1 suffices (the probability of the positive class). For multi-class, the model outputs a probability for each class, typically using a softmax function to ensure they sum to 1. The predicted class is usually the one with the highest probability, but you can adjust the decision threshold based on your tolerance for false positives vs. false negatives.
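The mechanics above can be sketched in a few lines: softmax turns raw scores into a probability distribution, argmax picks the default prediction, and a binary decision can use a custom threshold. The logit values here are made-up illustrations, not output from any real model.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Raw model scores for three classes, e.g. cat / dog / bird.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)          # probabilities summing to 1
predicted = probs.index(max(probs))  # argmax: class with highest probability

def predict_binary(p_positive, threshold=0.5):
    # Raise the threshold to cut false positives;
    # lower it to cut false negatives.
    return p_positive >= threshold
```

Adjusting `threshold` is the simplest lever for trading precision against recall without retraining anything.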

LLMs as Classifiers

Modern LLMs are surprisingly good classifiers. Instead of training a dedicated model, you can prompt an LLM: "Classify this customer review as positive, negative, or neutral." For many classification tasks, this zero-shot approach matches or exceeds purpose-built classifiers, especially when the task requires understanding nuance or context. The trade-off is cost and latency — an LLM API call is much more expensive than running a small classifier locally.

Metrics That Matter

Accuracy (percent correct) is the most intuitive metric but can be misleading. If 99% of emails are not spam, a model that always predicts "not spam" gets 99% accuracy but catches zero spam. Precision (of predicted positives, how many are correct), recall (of actual positives, how many were found), and F1 (harmonic mean of precision and recall) give a more complete picture. The right metric depends on the cost of errors in your specific application.
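The spam example above can be checked numerically. This sketch computes accuracy, precision, recall, and F1 from scratch for a model that always predicts "not spam" on a dataset where 1 in 100 emails is spam:

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    # Guard against division by zero when there are no predicted/actual positives.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# 100 emails, exactly 1 spam (label 1); model always says "not spam" (0).
y_true = [1] + [0] * 99
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)  # 0.99
precision, recall, f1 = precision_recall_f1(y_true, y_pred)  # recall is 0.0
```

Accuracy comes out at 99% while recall is zero: the model catches no spam at all, which is exactly the failure mode accuracy hides on imbalanced data.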
