Zubnet AILearnWiki › Decoder
Fundamentals

Decoder

Decoder Network, Generator
A neural network component that generates output from a representation. In Transformers, the decoder uses causal (left-to-right) attention to generate tokens one at a time. In image generation, the VAE decoder converts latent representations back into images. In autoencoders, the decoder reconstructs the original input from the compressed bottleneck. Decoders are the "generation" half of many architectures.

Why it matters

Every generative AI system has a decoder at its core. GPT, Claude, and Llama are decoder-only Transformers. Stable Diffusion uses a VAE decoder to produce images. Understanding decoders explains why generation is sequential (each token depends on previous tokens), why output is slower than input processing, and why the autoregressive paradigm dominates text generation.

Deep Dive

In a Transformer decoder, causal masking ensures each token can only attend to previous tokens (including itself). This is enforced by setting future positions to −∞ in the attention scores before softmax. The result: token 5's representation only depends on tokens 1–5. This constraint is what enables autoregressive generation — you can generate token 6 using only the representations from tokens 1–5, which are already computed.

Decoder-Only LLMs

Modern LLMs (GPT, Claude, Llama) are decoder-only: there's no separate encoder, and the entire model uses causal attention. The input prompt is processed through the same decoder layers as the generated output. This simplicity is why decoder-only won: one architecture, one attention pattern, clean scaling. The model treats everything as generation — even "understanding" the input is framed as predicting what comes next.

VAE Decoder in Image Generation

In Stable Diffusion, the diffusion process operates in a compressed latent space (64×64 instead of 512×512). The VAE decoder converts this latent representation back into a full-resolution image. It's a separate neural network that's trained to reconstruct images from latents. The quality of the VAE decoder directly affects the final image quality — a good decoder adds fine details and textures that the latent representation can't capture at its lower resolution.

Related Concepts

← All Terms
ESC