
Transformer

The neural network architecture behind virtually all modern LLMs and many image/audio models. Introduced by Google in the 2017 paper "Attention Is All You Need," Transformers use self-attention to process all parts of an input simultaneously rather than sequentially, enabling massive parallelism during training.
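The core of that self-attention mechanism can be sketched as scaled dot-product attention: every position's query is compared against every position's key at once, and the resulting weights mix the value vectors. This is a minimal, single-head NumPy illustration (the function and weight names are illustrative, not from any particular library):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a sequence X.

    X: (seq_len, d_model); Wq, Wk, Wv: (d_model, d_k) projections.
    All positions are processed in one batch of matrix multiplies --
    the parallelism described above.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (seq_len, seq_len) similarities
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ V                       # weighted mix of value vectors

rng = np.random.default_rng(0)
d_model, d_k, seq_len = 8, 4, 5
X = rng.normal(size=(seq_len, d_model))
out = self_attention(X,
                     rng.normal(size=(d_model, d_k)),
                     rng.normal(size=(d_model, d_k)),
                     rng.normal(size=(d_model, d_k)))
print(out.shape)  # (5, 4): one d_k-dimensional output per input position
```

Real Transformers run many such heads in parallel and stack the results through feed-forward layers, but the attention step itself is just these three matrix products plus a softmax.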

Why it matters

Transformers are the architecture that made the current AI boom possible. GPT, Claude, Gemini, Llama, Mistral — they're all Transformers under the hood. Understanding this architecture helps you understand why models have the capabilities and limitations they do.
