
AlexNet

The convolutional neural network that won the 2012 ImageNet competition by a huge margin and triggered the deep learning revolution. Created by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, AlexNet cut ImageNet's top-5 classification error from 26% to about 16%, a gap large enough to convince the computer vision community that deep learning was fundamentally superior to hand-engineered features.

Why It Matters

AlexNet was the "before and after" moment in AI history. Before 2012, most AI researchers worked on feature engineering and non-neural methods. After AlexNet, deep learning became the dominant paradigm. Every modern AI system, from GPT and Claude to Stable Diffusion, traces its lineage to the paradigm shift AlexNet triggered. It was the Big Bang of modern AI.

Deep Dive

AlexNet's architecture was relatively simple by modern standards: 5 convolutional layers, 3 fully connected layers, ReLU activation, max pooling, and dropout. The total parameter count was ~60 million. What made it special was training on GPUs (two GTX 580s with 3GB VRAM each — tiny by today's standards), using data augmentation, and being applied to ImageNet's 1.2 million training images — a scale that previous neural approaches hadn't attempted.
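The ~60 million figure can be checked by hand from the layer shapes in the paper. The sketch below (an illustrative back-of-envelope calculation, not a framework model; the helper names are ours) tallies weights and biases per layer, accounting for the two-GPU grouping that halves the input channels of conv2, conv4, and conv5:

```python
# Back-of-envelope parameter count for AlexNet, using the layer
# shapes reported in the 2012 paper. conv2, conv4, and conv5 were
# split into two groups (one per GPU), halving their input channels.

def conv_params(filters, kernel, in_channels, groups=1):
    """Weights + biases for a (possibly grouped) square conv layer."""
    return filters * (kernel * kernel * in_channels // groups) + filters

def fc_params(in_features, out_features):
    """Weights + biases for a fully connected layer."""
    return in_features * out_features + out_features

layers = {
    "conv1": conv_params(96, 11, 3),
    "conv2": conv_params(256, 5, 96, groups=2),
    "conv3": conv_params(384, 3, 256),
    "conv4": conv_params(384, 3, 384, groups=2),
    "conv5": conv_params(256, 3, 384, groups=2),
    "fc6":   fc_params(6 * 6 * 256, 4096),  # 6x6x256 feature map after conv5 + pooling
    "fc7":   fc_params(4096, 4096),
    "fc8":   fc_params(4096, 1000),         # 1000 ImageNet classes
}

total = sum(layers.values())
print(f"total parameters: {total:,}")  # ~61 million
```

Note that the two fully connected hidden layers alone hold roughly 90% of the parameters, which is why dropout was applied there.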

The Three Key Ingredients

AlexNet's success came from three things that are now obvious but were revolutionary in 2012: (1) large dataset (ImageNet, 1.2M images), (2) GPU training (making the computation feasible), and (3) deep architecture with ReLU (avoiding the vanishing gradient problem that had limited earlier networks). These three ingredients — data, compute, and architectural innovation — remain the recipe for AI breakthroughs today, just at a much larger scale.
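The ReLU point can be made concrete with a toy calculation (an illustrative sketch, not AlexNet's actual training code): backpropagating through a stack of sigmoid layers multiplies the gradient by sigmoid'(x) ≤ 0.25 at every layer, so the signal shrinks geometrically, while an active ReLU unit has derivative 1 and passes the gradient through undiminished.

```python
# Why ReLU helped with vanishing gradients: compare the gradient
# signal surviving an 8-layer stack of sigmoids vs. active ReLUs.
import math

def sigmoid_grad(x):
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)        # peaks at 0.25 when x = 0

def relu_grad(x):
    return 1.0 if x > 0 else 0.0

depth = 8                       # AlexNet's depth; deeper stacks are worse
sigmoid_signal = 1.0
relu_signal = 1.0
for _ in range(depth):
    sigmoid_signal *= sigmoid_grad(0.0)  # best possible case for sigmoid
    relu_signal *= relu_grad(1.0)        # an active ReLU unit

print(f"after {depth} sigmoid layers: {sigmoid_signal:.2e}")  # 0.25**8, about 1.5e-05
print(f"after {depth} relu layers:    {relu_signal}")         # 1.0
```

Even in the sigmoid's best case, the gradient shrinks by a factor of 65,536 over eight layers; ReLU's piecewise-linear derivative is what made networks of AlexNet's depth trainable in practice.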

The Aftermath

AlexNet's impact was immediate and permanent. Within a year, every competitive ImageNet entry was a deep CNN. Within two years, VGGNet and GoogLeNet (both 2014) pushed deeper. ResNet (2015) reached 152 layers. The computer vision community pivoted almost entirely to deep learning, and the approach spread to NLP (word embeddings, then RNNs, then Transformers), speech, and eventually every AI domain. Co-author Ilya Sutskever went on to co-found OpenAI.
