Company

AI21 Labs

An Israeli AI company best known for Jamba, the first production-grade hybrid architecture to combine Transformer attention layers with Mamba SSM layers. AI21 was founded by AI researchers (including Yoav Shoham) and has been building language models since 2017, before ChatGPT existed. Its models are available through an API and through cloud providers.

Why It Matters

AI21 Labs matters because Jamba proved that Transformer-SSM hybrid architectures work in practice, not just in research papers. By interleaving attention and Mamba layers, Jamba achieves a 256K context window with lower memory usage while matching the quality of pure Transformer models of similar size. This hybrid approach may be the future of LLM architecture.

Deep Dive

Jamba's architecture interleaves Transformer blocks (standard attention) and Mamba blocks (selective state spaces) at a ratio of roughly 1:7, i.e. one attention layer for every seven Mamba layers. This captures the best of both: Mamba layers handle the bulk of sequence processing efficiently (linear in sequence length), while the attention layers provide the global token interaction that pure SSMs sometimes lack. The result is a model that supports a 256K context window and fits on a single 80GB GPU (AI21 cites roughly 140K tokens of context on one such GPU) while matching Transformer-only models on quality.
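
A minimal PyTorch sketch of this interleaving pattern, as an illustration rather than AI21's actual code: the HybridStack, AttentionBlock, MambaBlock, and period names are invented here, and MambaBlock is only a gated-projection placeholder so the sketch runs; a real Mamba layer would use a selective state-space scan.

```python
import torch
import torch.nn as nn

class AttentionBlock(nn.Module):
    """Pre-norm Transformer self-attention block (causal masking omitted for brevity)."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm(x)
        out, _ = self.attn(h, h, h, need_weights=False)
        return x + out  # residual connection

class MambaBlock(nn.Module):
    """Placeholder for a Mamba layer: a simple gated projection so the sketch runs.
    A real Mamba block would apply a selective state-space scan over the sequence."""
    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.in_proj = nn.Linear(d_model, 2 * d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, gate = self.in_proj(self.norm(x)).chunk(2, dim=-1)
        return x + self.out_proj(h * torch.sigmoid(gate))  # residual connection

class HybridStack(nn.Module):
    """One attention layer per `period` layers, the rest Mamba (Jamba: period=8, i.e. 1:7)."""
    def __init__(self, d_model: int = 512, n_heads: int = 8,
                 n_layers: int = 16, period: int = 8):
        super().__init__()
        self.layers = nn.ModuleList([
            AttentionBlock(d_model, n_heads) if i % period == 0 else MambaBlock(d_model)
            for i in range(n_layers)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
        return x

x = torch.randn(2, 128, 512)   # (batch, sequence, d_model)
print(HybridStack()(x).shape)  # torch.Size([2, 128, 512])
```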

The MoE Component

Jamba also uses Mixture of Experts (MoE), with 52B total parameters but only ~12B active per token. This combination of SSM + Attention + MoE is among the most complex hybrid architectures in production and demonstrates that these techniques compose well. The roughly 8x reduction in KV cache memory compared to a pure Transformer of equivalent quality (only one layer in eight stores keys and values) is practically significant for serving long-context workloads.
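
A back-of-the-envelope sketch of why the KV cache shrinks: only attention layers store keys and values, so keeping 1 attention layer in 8 cuts the cache by about 8x. The model dimensions below are hypothetical, chosen purely for illustration, not AI21's published configuration.

```python
def kv_cache_gib(attn_layers: int, n_kv_heads: int, head_dim: int,
                 seq_len: int, dtype_bytes: int = 2, batch: int = 1) -> float:
    """KV cache size: 2 tensors (K and V) per attention layer, one entry per token."""
    return 2 * attn_layers * n_kv_heads * head_dim * seq_len * dtype_bytes * batch / 2**30

# Hypothetical 32-layer model at 256K context, fp16, grouped-query KV heads.
pure   = kv_cache_gib(attn_layers=32, n_kv_heads=8, head_dim=128, seq_len=256_000)
hybrid = kv_cache_gib(attn_layers=4,  n_kv_heads=8, head_dim=128, seq_len=256_000)
print(f"pure Transformer: {pure:.1f} GiB")    # ~31.3 GiB
print(f"hybrid (1:7):     {hybrid:.1f} GiB")  # ~3.9 GiB, an 8x reduction
```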
