
Decoder

Also known as: Decoder Network, Generator
A neural network component that generates output from a representation. In a Transformer, the decoder uses causal (left-to-right) attention to generate one token at a time. In image generation, a VAE decoder converts a latent representation back into an image. In an autoencoder, the decoder reconstructs the original input from the compressed bottleneck. The decoder is the "generative" half of many architectures.

Why It Matters

Every generative AI system has a decoder at its core. GPT, Claude, and Llama are all decoder-only Transformers. Stable Diffusion uses a VAE decoder to produce images. Understanding decoders explains why generation is sequential (each token depends on the tokens before it), why producing output is slower than processing input, and why the autoregressive paradigm dominates text generation.

Deep Dive

In a Transformer decoder, causal masking ensures each token can attend only to itself and earlier tokens. This is enforced by setting future positions to −∞ in the attention scores before softmax. The result: token 5's representation depends only on tokens 1–5. This constraint is what enables autoregressive generation: you can generate token 6 using only the representations of tokens 1–5, which are already computed.
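A minimal PyTorch sketch of that mask (single head, no batching; function and variable names are illustrative, not from any particular library):

```python
import torch
import torch.nn.functional as F

def causal_attention(q, k, v):
    # q, k, v: (seq_len, d) tensors.
    d = q.size(-1)
    scores = (q @ k.T) / d**0.5                    # (seq_len, seq_len)
    n = scores.size(0)
    # True above the diagonal: position i must not see positions j > i.
    future = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(future, float("-inf"))
    weights = F.softmax(scores, dim=-1)            # zero weight on future tokens
    return weights @ v

q = k = v = torch.randn(6, 8)
out = causal_attention(q, k, v)  # row 4 (token 5) depends only on tokens 1-5
```

Because softmax of −∞ is exactly zero, masked positions contribute nothing to the weighted sum, so the causality constraint holds by construction rather than by convention.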

Decoder-Only LLMs

Modern LLMs (GPT, Claude, Llama) are decoder-only: there's no separate encoder, and the entire model uses causal attention. The input prompt is processed through the same decoder layers as the generated output. This simplicity is why decoder-only won: one architecture, one attention pattern, clean scaling. The model treats everything as generation — even "understanding" the input is framed as predicting what comes next.
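The generation loop this implies can be sketched in a few lines. Here `model` is a hypothetical stand-in for any GPT-style network mapping token ids to logits; greedy decoding is shown for simplicity, while real systems usually sample:

```python
import torch

def greedy_generate(model, ids, max_new_tokens):
    # `model`: any callable mapping (1, seq_len) token ids to
    # (1, seq_len, vocab) logits. Hypothetical interface for illustration.
    for _ in range(max_new_tokens):
        logits = model(ids)                     # causal pass over prompt + output so far
        next_id = logits[:, -1, :].argmax(-1)   # greedy pick from the last position
        ids = torch.cat([ids, next_id[:, None]], dim=-1)
    return ids

# Toy stand-in model: random logits over a 100-token vocabulary.
toy = lambda ids: torch.randn(1, ids.size(1), 100)
out = greedy_generate(toy, torch.zeros(1, 4, dtype=torch.long), max_new_tokens=3)
print(out.shape)  # torch.Size([1, 7])
```

Note that each step feeds the full sequence back through the same decoder layers. This is why output is slower than input processing, and why practical implementations cache per-token key/value tensors instead of recomputing them.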

VAE Decoder in Image Generation

In Stable Diffusion, the diffusion process operates in a compressed latent space (64×64 instead of 512×512). The VAE decoder converts this latent representation back into a full-resolution image. It's a separate neural network that's trained to reconstruct images from latents. The quality of the VAE decoder directly affects the final image quality — a good decoder adds fine details and textures that the latent representation can't capture at its lower resolution.
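As a sketch using Hugging Face's diffusers library (the repo id is one public SD VAE; the scaling step reflects the standard Stable Diffusion setup, so check your pipeline's config for the exact factor):

```python
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")

# A latent as produced by the diffusion loop: 4 channels at 64x64.
latents = torch.randn(1, 4, 64, 64)

with torch.no_grad():
    # SD latents are stored pre-scaled; undo the scaling factor
    # (0.18215 for SD 1.x) before decoding back to pixel space.
    image = vae.decode(latents / vae.config.scaling_factor).sample
print(image.shape)  # torch.Size([1, 3, 512, 512]): 8x spatial upsampling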
