AI21 Labs: Definition & Meaning — AI Wiki

Uma empresa de IA israelense conhecida pelo Jamba, a primeira arquitetura híbrida de nível produção que combina camadas de atenção Transformer com camadas Mamba SSM. AI21 foi fundada por pesquisadores de IA (incluindo Yoav Shoham) e vem construindo modelos de linguagem desde 2017, anterior ao ChatGPT. Seus modelos estão disponíveis via API e através de provedores cloud.

Por que importa

AI21 Labs importa porque o Jamba provou que arquiteturas híbridas Transformer-SSM funcionam na prática, não só em papers de pesquisa. Intercalando camadas de atenção e camadas Mamba, o Jamba alcança uma janela de contexto de 256K com menor uso de memória que modelos Transformer puros de qualidade similar. Essa abordagem híbrida pode ser o futuro da arquitetura LLM.

Deep Dive

Jamba's architecture interleaves Transformer blocks (with standard attention) and Mamba blocks (with selective state spaces) in a ratio of roughly 1:7 — one attention layer for every seven Mamba layers. This captures the best of both: Mamba layers handle the bulk of sequence processing efficiently (linear in sequence length), while attention layers provide the global token interaction that pure SSMs sometimes lack. The result: a model that fits in a single 80GB GPU at 256K context while matching Transformer-only models on quality.

The MoE Component

Jamba also uses Mixture of Experts (MoE), with 52B total parameters but only ~12B active per token. This combination of SSM + Attention + MoE is the most complex hybrid architecture in production and demonstrates that these techniques compose well. The 3x reduction in KV cache memory compared to a pure Transformer of equivalent quality is practically significant for serving long-context workloads.

AI21 Labs

Por que importa

Deep Dive

The MoE Component

Conceitos relacionados