Fundamentals

Classification

Classifier, Categorization
The task of assigning an input to one of a set of predefined categories. "Is this email spam or not?" (binary classification). "Is this image a cat, a dog, or a bird?" (multi-class). "Which of these tags apply to this article?" (multi-label). Classification is the most common supervised learning task and the foundation of countless real-world AI applications.

Why It Matters

Classification is where most people first encounter machine learning in practice: spam filters, content moderation, medical diagnosis, fraud detection, sentiment analysis. Understanding classification helps you understand the entire supervised learning pipeline: labeled data in, trained model, predictions out.

Deep Dive

A classifier outputs a probability distribution over classes. For binary classification, a single number between 0 and 1 suffices (the probability of the positive class). For multi-class, the model outputs a probability for each class, typically using a softmax function to ensure they sum to 1. The predicted class is usually the one with the highest probability, but you can adjust the decision threshold based on your tolerance for false positives vs. false negatives.
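The mechanics above can be sketched in a few lines. This is a minimal illustration, not a library implementation: the logits and class labels are made up for the example.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution that sums to 1."""
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw model scores for three classes.
labels = ["cat", "dog", "bird"]
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

# Default decision rule: pick the highest-probability class.
predicted = labels[probs.index(max(probs))]

def predict_spam(p_spam, threshold=0.5):
    """Binary case: lowering the threshold catches more spam (higher recall)
    at the cost of more false positives (lower precision)."""
    return p_spam >= threshold
```

Moving `threshold` away from 0.5 is how you encode your tolerance for false positives versus false negatives without retraining the model.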

LLMs as Classifiers

Modern LLMs are surprisingly good classifiers. Instead of training a dedicated model, you can prompt an LLM: "Classify this customer review as positive, negative, or neutral." For many classification tasks, this zero-shot approach matches or exceeds purpose-built classifiers, especially when the task requires understanding nuance or context. The trade-off is cost and latency — an LLM API call is much more expensive than running a small classifier locally.
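In practice, using an LLM as a classifier mostly comes down to constraining the prompt and parsing the reply back into a fixed label set. A minimal sketch, where the actual API call is left as a placeholder since it depends on which provider you use:

```python
LABELS = ("positive", "negative", "neutral")

def build_sentiment_prompt(review):
    # Constrain the output format so the reply can be parsed as a label.
    return (
        "Classify this customer review as positive, negative, or neutral. "
        "Reply with exactly one word.\n\nReview: " + review
    )

def parse_label(reply, labels=LABELS):
    # LLM replies may include casing or trailing punctuation; normalize
    # before matching, and return None if the reply is not a valid label.
    word = reply.strip().lower().rstrip(".")
    return word if word in labels else None

# call_llm(prompt) would be your actual API client call (placeholder):
# label = parse_label(call_llm(build_sentiment_prompt("Great product!")))
```

Returning `None` on an unparseable reply lets you retry or fall back rather than silently miscounting, which matters when the LLM occasionally ignores the one-word instruction.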

Metrics That Matter

Accuracy (percent correct) is the most intuitive metric but can be misleading. If 99% of emails are not spam, a model that always predicts "not spam" gets 99% accuracy but catches zero spam. Precision (of predicted positives, how many are correct), recall (of actual positives, how many were found), and F1 (harmonic mean of precision and recall) give a more complete picture. The right metric depends on the cost of errors in your specific application.
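The accuracy trap described above is easy to demonstrate. A sketch of the metrics from their definitions, using the 99%-not-spam scenario as synthetic data:

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# 1 = spam. 99 legitimate emails, 1 spam email; the model always says "not spam".
y_true = [0] * 99 + [1]
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
p, r, f = precision_recall_f1(y_true, y_pred)
# accuracy is 0.99, yet recall is 0.0: the model catches zero spam.
```

The `if ... else 0.0` guards handle the degenerate cases (no predicted positives, no actual positives) where the ratios would otherwise divide by zero.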
