AI21 Labs: Definition & Meaning — AI Wiki

Una compañía de IA israelí conocida por Jamba, la primera arquitectura híbrida de nivel producción que combina capas de atención Transformer con capas Mamba SSM. AI21 fue fundada por investigadores de IA (incluyendo Yoav Shoham) y ha estado construyendo modelos de lenguaje desde 2017, precediendo a ChatGPT. Sus modelos están disponibles vía API y a través de proveedores cloud.

Por qué importa

AI21 Labs importa porque Jamba probó que las arquitecturas híbridas Transformer-SSM funcionan en la práctica, no solo en papers de investigación. Entrelazando capas de atención y capas Mamba, Jamba logra una ventana de contexto de 256K con menor uso de memoria que modelos Transformer puros de calidad similar. Este enfoque híbrido podría ser el futuro de la arquitectura LLM.

Deep Dive

Jamba's architecture interleaves Transformer blocks (with standard attention) and Mamba blocks (with selective state spaces) in a ratio of roughly 1:7 — one attention layer for every seven Mamba layers. This captures the best of both: Mamba layers handle the bulk of sequence processing efficiently (linear in sequence length), while attention layers provide the global token interaction that pure SSMs sometimes lack. The result: a model that fits in a single 80GB GPU at 256K context while matching Transformer-only models on quality.

The MoE Component

Jamba also uses Mixture of Experts (MoE), with 52B total parameters but only ~12B active per token. This combination of SSM + Attention + MoE is the most complex hybrid architecture in production and demonstrates that these techniques compose well. The 3x reduction in KV cache memory compared to a pure Transformer of equivalent quality is practically significant for serving long-context workloads.

AI21 Labs

Por qué importa

Deep Dive

The MoE Component

Conceptos relacionados