Basics

Machine Learning

Also known as: ML
A broad field of computer science in which systems learn patterns from data rather than following explicit rules. Instead of programming a computer to recognize cats by listing their features (four legs, pointy ears, whiskers), you show it thousands of cat photos and let it find the patterns itself. Machine learning covers everything from simple linear regression to the deep neural networks powering today's AI: supervised learning (labeled examples), unsupervised learning (discovering structure), and reinforcement learning (trial and error).

Why It Matters

Machine learning is the foundation beneath everything we call "AI" today. Every LLM, every image generator, every recommendation algorithm, every spam filter is machine learning. Understanding ML as the broader discipline helps you see where deep learning fits, where classical methods still win, and why "AI" is really just "ML that got very good".

Deep Dive

Machine learning splits into three paradigms, and knowing which one applies saves you from reaching for the wrong tool. Supervised learning is the workhorse: you give the model labeled examples (this email is spam, this one isn't) and it learns a mapping from input to output. Classification, regression, translation, image captioning — if you have labeled data, supervised learning is almost certainly where you start. Unsupervised learning works without labels: it finds structure on its own. Clustering customers by purchasing behavior, reducing a 10,000-feature dataset to its most informative dimensions, detecting anomalous network traffic that doesn't match any known pattern. You use it when you don't know what you're looking for, which is more often than people admit. Reinforcement learning is the odd one out — the model learns by trial and error, receiving rewards or penalties for its actions. It's how AlphaGo beat the world champion, how robots learn to walk, and how RLHF aligns LLMs with human preferences. It's also notoriously hard to get right, which is why most production ML is still supervised.
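As a toy illustration of the supervised paradigm described above, here is a minimal nearest-neighbour classifier in plain Python. The dataset and the feature meanings (link count, exclamation-mark count) are invented for the example; real spam filters use far richer features and models.

```python
import math

# Toy labeled dataset: (feature vector, label).
# Features are hypothetical: (num_links, num_exclamation_marks).
train = [
    ((8.0, 5.0), "spam"),
    ((7.0, 6.0), "spam"),
    ((1.0, 0.0), "ham"),
    ((0.0, 1.0), "ham"),
]

def predict(x):
    """1-nearest-neighbour: copy the label of the closest training example."""
    nearest = min(train, key=lambda pair: math.dist(pair[0], x))
    return nearest[1]

print(predict((6.0, 4.0)))  # near the spam cluster -> "spam"
print(predict((0.5, 0.5)))  # near the ham cluster  -> "ham"
```

The whole "learning" step here is just memorizing labeled examples, but it captures the supervised contract: inputs paired with labels in, a mapping from input to label out.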

Classical ML vs. Deep Learning

There's a persistent myth that deep learning has made classical ML obsolete. It hasn't. Logistic regression still beats a Transformer when you have 500 rows of tabular data, a clear set of features, and a need to explain your predictions to a regulator. Random forests and gradient-boosted trees (XGBoost, LightGBM) dominate Kaggle competitions on structured data for a reason — they're fast to train, hard to overfit, and their feature importances are interpretable. Deep learning shines when the data is unstructured (images, text, audio, video) and the features are too complex to engineer by hand. Nobody writes edge-detection filters anymore because convolutional nets learn better ones. Nobody writes grammar rules for translation because Transformers learn the mapping end-to-end. The skill is knowing which regime you're in. If your data fits in a spreadsheet, try XGBoost first. If it doesn't, that's when neural networks earn their complexity.
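To make the "classical ML on tabular data" point concrete, here is a sketch of a decision stump: the one-level decision tree that libraries like XGBoost and LightGBM stack by the thousands. The rows and feature meanings (age, income in thousands) are made up for illustration.

```python
# Hypothetical tabular data: (age, income_k) -> bought the product (1) or not (0).
rows = [
    ((25, 30), 0),
    ((32, 45), 0),
    ((41, 80), 1),
    ((55, 95), 1),
    ((38, 70), 1),
    ((29, 40), 0),
]

def fit_stump(rows):
    """Exhaustively try every (feature, threshold) pair; keep the split that
    classifies the most training rows correctly under the rule
    `x[feat] >= thr -> 1`."""
    best = (0, None, None)  # (correct count, feature index, threshold)
    for feat in range(len(rows[0][0])):
        for x, _ in rows:
            thr = x[feat]
            correct = sum((xi[feat] >= thr) == bool(y) for xi, y in rows)
            if correct > best[0]:
                best = (correct, feat, thr)
    return best

acc, feat, thr = fit_stump(rows)
print(f"split on feature {feat} at {thr}: {acc}/{len(rows)} correct")
```

Note how interpretable the result is: a single threshold on a single named column. That transparency, multiplied across an ensemble's feature importances, is a large part of why tree methods still dominate structured data.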

The Training Loop

Every ML project follows the same loop, whether you're training a spam filter or a 400-billion-parameter LLM. You start with data — collecting it, cleaning it, splitting it into training and test sets. Then you extract or learn features: in classical ML, this means engineering them by hand (word counts, pixel histograms, date features); in deep learning, the model learns its own features from raw input. You pick a model architecture, train it by minimizing a loss function on the training data, then evaluate it on held-out data to see if it actually generalizes. It almost never works the first time. So you iterate — more data, better features, different hyperparameters, a different architecture entirely. The gap between a textbook ML pipeline and a production system is mostly this loop, run hundreds of times with increasingly desperate experiments until something works well enough to ship.
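The loop above can be sketched end to end in a few lines of plain Python: a synthetic dataset (invented for the example), a train/test split, a linear model trained by gradient descent on mean squared error, and a held-out evaluation.

```python
import random

random.seed(0)

# 1. Data: synthetic points following y = 2x + 1 plus noise, then a split.
data = [(x, 2 * x + 1 + random.gauss(0, 0.1))
        for x in (i / 10 for i in range(50))]
random.shuffle(data)
train, test = data[:40], data[40:]

# 2. Model: a line y = w*x + b; w and b are the learnable parameters.
w, b = 0.0, 0.0
lr = 0.05  # learning rate -- a hyperparameter you'd tune in the iterate step

# 3. Train: minimize mean squared error by gradient descent.
for epoch in range(500):
    grad_w = sum(2 * ((w * x + b) - y) * x for x, y in train) / len(train)
    grad_b = sum(2 * ((w * x + b) - y) for x, y in train) / len(train)
    w -= lr * grad_w
    b -= lr * grad_b

# 4. Evaluate on held-out data the model never saw during training.
test_mse = sum(((w * x + b) - y) ** 2 for x, y in test) / len(test)
print(f"w={w:.2f} b={b:.2f} test_mse={test_mse:.4f}")
```

Everything in the section maps onto a line here: the split guards against memorization, the loss defines "better", and the test MSE is the generalization check that tells you whether to ship or iterate.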

Why Now

The ideas behind machine learning aren't new. Backpropagation was figured out in the 1980s. SVMs and random forests were mature by the early 2000s. What changed is that three things converged at the same time. First, data: the internet generated more labeled and unlabeled data than anyone knew what to do with. Second, compute: GPUs turned out to be accidentally perfect for the matrix multiplications that neural networks need, and cloud providers made those GPUs available by the hour. Third, algorithms: batch normalization, dropout, attention mechanisms, and better optimizers made it possible to train networks that were previously too deep and too unstable to converge. None of these three factors alone would have been enough. Plenty of data existed in the 1990s, but nobody had the compute to train on it. GPUs existed in the 2000s, but the algorithmic tricks to train hundred-layer networks hadn't been discovered yet. It took all three arriving together to trigger the current wave — and it's the reason ML went from academic curiosity to the most funded technology sector on the planet in under a decade.
