
AI21 Labs

An Israeli AI company best known for Jamba, the first production-grade hybrid architecture to combine Transformer attention layers with Mamba SSM layers. AI21 was founded by AI researchers (including Yoav Shoham) and has been building language models since 2017, well before ChatGPT. Its models are available through an API and via cloud providers.

Why It Matters

AI21 Labs matters because Jamba proved that hybrid Transformer-SSM architectures work in practice, not just in research papers. By interleaving attention and Mamba layers, Jamba achieves a 256K context window with lower memory usage than pure Transformer models of similar quality. This hybrid approach may be the future of LLM architecture.

Deep Dive

Jamba's architecture interleaves Transformer blocks (with standard attention) and Mamba blocks (with selective state spaces) in a ratio of roughly 1:7 — one attention layer for every seven Mamba layers. This captures the best of both: Mamba layers handle the bulk of sequence processing efficiently (linear in sequence length), while attention layers provide the global token interaction that pure SSMs sometimes lack. The result is a model that fits on a single 80GB GPU at 256K context while matching Transformer-only models on quality.
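The interleaving pattern above can be sketched in a few lines. This is an illustrative toy, not AI21's actual implementation: the function name, block size, and attention position are assumptions chosen to match the roughly 1:7 ratio described above (one attention layer per block of eight layers).

```python
# Hypothetical sketch of a Jamba-style layer stack (illustrative only).
# One attention layer per block of eight layers gives the ~1:7
# attention-to-Mamba ratio described above.

def build_layer_stack(n_blocks: int, block_size: int = 8, attn_index: int = 0):
    """Return a flat list of layer-type labels for n_blocks * block_size layers."""
    layers = []
    for _ in range(n_blocks):
        for i in range(block_size):
            layers.append("attention" if i == attn_index else "mamba")
    return layers

stack = build_layer_stack(n_blocks=4)
print(stack[:8])  # first block: one attention layer, then seven Mamba layers
```

Because only one layer in eight is attention, most of the stack scales linearly with sequence length; the occasional attention layer restores global token mixing.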

The MoE Component

Jamba also uses Mixture of Experts (MoE), with 52B total parameters but only ~12B active per token. This combination of SSM + Attention + MoE is the most complex hybrid architecture in production and demonstrates that these techniques compose well. The 3x reduction in KV cache memory compared to a pure Transformer of equivalent quality is practically significant for serving long-context workloads.
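A back-of-envelope calculation shows why cutting the number of attention layers shrinks the KV cache. The dimensions below (head counts, head size, layer counts) are illustrative assumptions, not Jamba's published configuration: only attention layers store per-token K/V tensors, while Mamba layers keep a fixed-size state regardless of context length.

```python
# Illustrative KV-cache sizing (assumed dimensions, not Jamba's real config).
# Only attention layers accumulate K/V tensors that grow with context length.

def kv_cache_bytes(attn_layers: int, ctx_len: int,
                   n_kv_heads: int = 8, head_dim: int = 128,
                   dtype_bytes: int = 2) -> int:
    """Bytes of KV cache for one sequence: 2 tensors (K and V) per attention layer."""
    return 2 * attn_layers * ctx_len * n_kv_heads * head_dim * dtype_bytes

pure = kv_cache_bytes(attn_layers=32, ctx_len=256_000)   # all-attention baseline
hybrid = kv_cache_bytes(attn_layers=4, ctx_len=256_000)  # 1-in-8 attention layers
print(f"pure: {pure / 2**30:.1f} GiB, hybrid: {hybrid / 2**30:.1f} GiB")
```

With these assumed dimensions the hybrid's cache is an eighth of the all-attention baseline; the 3x figure quoted above is relative to a pure Transformer of equivalent quality, which would be configured differently, but the direction of the saving is the same.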
