Fundamentals

Feature

Learned Representation, Activation
A pattern or concept that a neural network learns to detect in its input. In vision, early-layer features are edges and textures; later-layer features are object parts and whole objects. In language models, features range from simple ones (the letter "a", a specific syntactic pattern) to abstract ones (the concept of sarcasm, a particular reasoning strategy). Features are represented as activation patterns across neurons.
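As a minimal sketch of "feature as activation pattern", with made-up numbers (the activation values and the hypothetical fur-texture direction are purely illustrative):

```python
import numpy as np

# Hypothetical activations of one layer for a single input (8 neurons).
activations = np.array([0.1, 2.3, 0.0, 1.7, 0.4, 0.0, 3.1, 0.2])

# A feature is often modeled as a direction in activation space:
# a weighting over many neurons rather than a single neuron.
fur_texture_direction = np.array([0.0, 0.7, 0.0, 0.5, 0.0, 0.0, 0.5, 0.1])
fur_texture_direction /= np.linalg.norm(fur_texture_direction)

# How strongly this input expresses the feature: project the
# activation vector onto the feature direction.
feature_score = activations @ fur_texture_direction
print(f"fur-texture feature activation: {feature_score:.2f}")
```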

Why it matters

Features are what models actually learn: not individual facts but patterns that generalize. A model doesn't memorize "cats have fur"; it learns a feature detector for fur-like textures that activates for cats, dogs, and teddy bears. Understanding features helps explain model behavior: why it generalizes (features transfer), why it fails (the wrong feature activated), and how to improve it (expose it to more diverse features).

Deep Dive

The term "feature" has different meanings depending on context. In classical ML, features are hand-engineered input variables (height, weight, age). In deep learning, features are learned representations in hidden layers — the model discovers useful patterns on its own. This shift from hand-engineered to learned features is the core innovation of deep learning and why it outperforms classical ML on complex tasks like vision and language.
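A toy contrast between the two meanings, sketched in Python; the input variables and the random weights are illustrative stand-ins, not from any real model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Classical ML: features are hand-engineered input variables.
def hand_engineered_features(person):
    return np.array([person["height_cm"], person["weight_kg"], person["age"]])

# Deep learning: features are learned hidden-layer representations.
# W and b would be set by gradient descent; random values stand in here.
W = rng.normal(size=(4, 3))
b = rng.normal(size=4)

def learned_features(x):
    # One hidden layer: each row of W defines one learned feature detector.
    return np.maximum(0.0, W @ x + b)  # ReLU activation pattern

x = hand_engineered_features({"height_cm": 170.0, "weight_kg": 65.0, "age": 30})
print(learned_features(x))  # the activation pattern = learned features for x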

Hierarchical Features

Deep networks learn hierarchical features: each layer builds on the previous one. In a vision model: layer 1 detects edges, layer 2 combines edges into textures and corners, layer 3 combines textures into object parts (eyes, wheels), layer 4 combines parts into objects (faces, cars). This hierarchy emerges automatically from training — no one programs it. The same hierarchical feature learning happens in language models, from character patterns to syntax to semantics to reasoning.
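A sketch of such a stack in PyTorch (untrained; which layer ends up detecting what is an empirical finding, so the comments describe the typical outcome, not a guarantee):

```python
import torch
import torch.nn as nn

# Four stacked conv blocks: the hierarchy in the text maps onto depth.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # layer 1: edges
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # layer 2: textures, corners
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # layer 3: object parts
    nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # layer 4: objects
)

x = torch.randn(1, 3, 64, 64)  # one fake RGB image
features = x
for i, layer in enumerate(model):
    features = layer(features)
    if isinstance(layer, nn.Conv2d):
        print(f"after conv at index {i}: {tuple(features.shape)}")
```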

Feature Visualization

Researchers visualize features to understand what models learn. For vision models, you can generate images that maximally activate a specific neuron or direction, revealing what pattern it detects. For language models, you can find the text examples that most activate a specific feature direction. Anthropic's research has visualized features in Claude, finding interpretable concepts like "Golden Gate Bridge," "code bugs," "deception," and "French language" encoded as specific directions in the model's activation space.
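A minimal sketch of this activation-maximization idea in PyTorch; `model` here is a tiny untrained CNN standing in for a real trained network, and the channel index is an arbitrary choice:

```python
import torch
import torch.nn as nn

# Stand-in for a trained vision model; in practice you would load real weights.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
)
model.eval()

# Start from noise and ascend the gradient of one channel's mean activation.
image = torch.randn(1, 3, 64, 64, requires_grad=True)
optimizer = torch.optim.Adam([image], lr=0.05)
channel = 5  # which feature (conv channel) to visualize; arbitrary

for step in range(200):
    optimizer.zero_grad()
    activation = model(image)[0, channel].mean()
    (-activation).backward()  # maximize by minimizing the negative
    optimizer.step()

# `image` now approximates the pattern that most excites this channel.
```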
