
AlexNet

The convolutional neural network that won the 2012 ImageNet competition by a massive margin, triggering the deep learning revolution. Created by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, AlexNet cut the top-5 image classification error rate from 26.2% to 15.3%, a gap so large that it convinced the computer vision community that deep learning was fundamentally superior to hand-crafted features.

Why It Matters

AlexNet is the "before and after" moment in AI history. Before 2012, most AI researchers worked on feature engineering and non-neural methods. After AlexNet, deep learning became the dominant paradigm. Every modern AI system, from GPT and Claude to Stable Diffusion, traces its lineage to the paradigm shift AlexNet triggered. It is the Big Bang of modern AI.

Deep Dive

AlexNet's architecture was relatively simple by modern standards: 5 convolutional layers, 3 fully connected layers, ReLU activation, max pooling, and dropout. The total parameter count was ~60 million. What made it special was training on GPUs (two GTX 580s with 3GB VRAM each — tiny by today's standards), using data augmentation, and being applied to ImageNet's 1.2 million training images — a scale that previous neural approaches hadn't attempted.
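The ~60 million figure can be checked with back-of-the-envelope arithmetic over the layer sizes reported in the 2012 paper. The sketch below (an illustration, not code from the paper) counts weights and biases per layer; conv2, conv4, and conv5 were split into two groups across the two GPUs, which halves their weight counts:

```python
# Parameter count for AlexNet's layer sizes (5 conv + 3 fully connected).
# Grouped convolutions reflect the two-GPU split in the original paper.
def conv_params(c_in, c_out, k, groups=1):
    # weights (c_in/groups * c_out * k * k) plus one bias per output channel
    return (c_in // groups) * c_out * k * k + c_out

def fc_params(n_in, n_out):
    # dense weight matrix plus biases
    return n_in * n_out + n_out

layers = [
    conv_params(3,   96,  11),           # conv1: 11x11, stride 4
    conv_params(96,  256, 5, groups=2),  # conv2: split across 2 GPUs
    conv_params(256, 384, 3),            # conv3: cross-GPU connections
    conv_params(384, 384, 3, groups=2),  # conv4
    conv_params(384, 256, 3, groups=2),  # conv5
    fc_params(256 * 6 * 6, 4096),        # fc6: the bulk of the parameters
    fc_params(4096, 4096),               # fc7
    fc_params(4096, 1000),               # fc8: 1000 ImageNet classes
]
total = sum(layers)
print(f"{total:,}")  # roughly 60 million
```

Note that the three fully connected layers account for the vast majority of the parameters; later architectures like GoogLeNet cut the total dramatically by shrinking or dropping these dense layers.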

The Three Key Ingredients

AlexNet's success came from three things that are now obvious but were revolutionary in 2012: (1) large dataset (ImageNet, 1.2M images), (2) GPU training (making the computation feasible), and (3) deep architecture with ReLU (avoiding the vanishing gradient problem that had limited earlier networks). These three ingredients — data, compute, and architectural innovation — remain the recipe for AI breakthroughs today, just at a much larger scale.
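The ReLU point can be made concrete with a toy calculation (an illustration, not from the paper): the sigmoid's derivative never exceeds 0.25, so gradients shrink geometrically when backpropagated through many layers, while ReLU's derivative is exactly 1 for positive inputs and passes the signal through undiminished:

```python
import math

def sigmoid_grad(x):
    # derivative of the logistic sigmoid: s(x) * (1 - s(x)), at most 0.25
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def relu_grad(x):
    # derivative of ReLU: 1 for positive inputs, 0 otherwise
    return 1.0 if x > 0 else 0.0

# Multiply the per-layer gradient factors through a 10-layer chain,
# assuming (for simplicity) the same positive pre-activation at each layer.
depth, x = 10, 1.0
sig_signal, relu_signal = 1.0, 1.0
for _ in range(depth):
    sig_signal *= sigmoid_grad(x)
    relu_signal *= relu_grad(x)

print(f"sigmoid: {sig_signal:.2e}, relu: {relu_signal:.2e}")
```

After just ten layers, the sigmoid-chained gradient has collapsed by several orders of magnitude while the ReLU-chained gradient is unchanged, which is why deep stacks of sigmoid layers were so hard to train before 2012.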

The Aftermath

AlexNet's impact was immediate and permanent. Within a year, every competitive ImageNet entry was a deep CNN. Within three years, VGGNet and GoogLeNet pushed deeper. ResNet (2015) reached 152 layers. The computer vision community pivoted almost entirely to deep learning, and the approach spread to NLP (word embeddings, then RNNs, then Transformers), speech, and eventually every AI domain. Co-author Ilya Sutskever went on to co-found OpenAI.

Related Concepts
