Company

Liquid AI

Also known as: Liquid Foundation Models, liquid neural networks
An MIT spinout exploring fundamentally different neural network architectures inspired by biological neural circuits. Their Liquid Foundation Models use continuous-time dynamics rather than fixed-weight transformers, promising better efficiency and adaptability.

Why It Matters

Liquid AI represents the best-funded serious challenge to the assumption that transformers are the only architecture that matters. By building production-grade foundation models on biologically inspired continuous-time dynamics, they are testing whether the AI industry went all-in on attention too early. Even if LFMs never dethrone transformers outright, their efficiency advantages in edge deployment and long-sequence processing could carve out critical niches in robotics, mobile AI, and embedded systems: markets where running a 70B transformer simply isn't an option.

Deep Dive

Liquid AI grew out of research at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), specifically from the work of Ramin Hasani, Mathias Lechner, and Daniela Rus. Hasani and Lechner had been studying the nervous systems of C. elegans — a tiny roundworm with exactly 302 neurons — and discovered that the mathematical equations governing these biological neural circuits could be adapted into a new kind of artificial neural network. Unlike standard networks where connection weights are fixed after training, these "liquid" networks use continuous-time differential equations that allow parameters to adapt dynamically based on input. The company was formally founded in 2023 and quickly raised over $250 million at a $2 billion-plus valuation, with backing from AMD Ventures and other investors who saw the potential for an architecture that fundamentally breaks from the transformer paradigm.
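The continuous-time idea can be made concrete. Below is a minimal sketch, not Liquid AI's actual implementation, of a liquid time-constant (LTC) update in the spirit of Hasani and Lechner's published formulation: the state decays toward rest, but the decay rate itself depends on the current input, so each cell's effective time constant adapts to the signal it is processing. All parameter names and sizes here are illustrative.

```python
import numpy as np

def ltc_step(x, u, W_in, W_rec, b, A, tau, dt=0.01):
    """One forward-Euler step of a liquid time-constant (LTC) layer.

    Implements dx/dt = -(1/tau + f(x, u)) * x + f(x, u) * A, where f is
    a nonlinear gate of input u and state x. Because f appears in the
    decay term, the effective time constant 1/(1/tau + f) varies with
    the input -- the "liquid" part of the dynamics.
    """
    f = np.tanh(W_in @ u + W_rec @ x + b)  # input-dependent gate
    dxdt = -(1.0 / tau + f) * x + f * A    # adaptive decay plus drive
    return x + dt * dxdt

# Tiny demo: 4 neurons driven by a 2-dimensional sinusoidal input.
rng = np.random.default_rng(0)
n, m = 4, 2
W_in = rng.normal(size=(n, m)) * 0.5
W_rec = rng.normal(size=(n, n)) * 0.1
b = np.zeros(n)
A = np.ones(n)           # resting target of each neuron
tau = np.full(n, 0.5)    # base time constants
x = np.zeros(n)
for t in range(200):
    u = np.array([np.sin(0.1 * t), np.cos(0.1 * t)])
    x = ltc_step(x, u, W_in, W_rec, b, A, tau)
print(x.shape)
```

Note that the decay coefficient -(1/tau + f) is always negative here (f is bounded in [-1, 1] and 1/tau = 2), so the explicit Euler integration stays stable at this step size; production implementations use implicit or fused solvers for the same reason.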

Liquid Foundation Models: A Different Bet

Liquid AI's core product line, the Liquid Foundation Models (LFMs), launched with three sizes: LFM-1B, LFM-3B, and LFM-40B. What makes these architecturally distinct is that they are not transformers, and they are not state-space models in the Mamba sense either. LFMs use a hybrid approach combining structured state-space layers with attention-like mechanisms, but the underlying mathematics is rooted in those continuous-time dynamics from the biological research. In practice, this means LFMs can handle very long sequences more efficiently than standard transformers: their compute cost doesn't grow quadratically with sequence length the way full attention does, and their inference state stays fixed-size rather than growing like a KV cache. The LFM-1B model, in particular, attracted attention for outperforming several transformer-based models of similar size on standard benchmarks, suggesting that the architectural differences translated into real capability gains rather than just theoretical elegance.
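A back-of-the-envelope sketch makes the scaling argument concrete. During generation, a transformer's KV cache grows linearly with sequence length (and scoring each new token against that cache is what makes full attention quadratic overall), while a recurrent or state-space-style model carries a fixed-size state. The layer counts and dimensions below are illustrative placeholders, not LFM's actual configuration.

```python
def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128, dtype_bytes=2):
    """Transformer KV cache: keys + values for every past token, every layer."""
    return 2 * seq_len * n_layers * n_heads * head_dim * dtype_bytes

def recurrent_state_bytes(n_layers=32, state_dim=4096, dtype_bytes=2):
    """Fixed-size recurrent state: independent of sequence length."""
    return n_layers * state_dim * dtype_bytes

for L in (1_000, 32_000, 1_000_000):
    print(f"{L:>9} tokens: KV cache {kv_cache_bytes(L) / 2**30:7.2f} GiB, "
          f"state {recurrent_state_bytes() / 2**20:.2f} MiB")
```

At these illustrative sizes the KV cache passes 15 GiB by 32k tokens while the recurrent state stays at a fraction of a megabyte, which is the whole edge-deployment argument in two functions.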

Edge AI and the Efficiency Argument

One of Liquid AI's most compelling claims is efficiency at the edge. Because liquid networks can represent complex dynamics with fewer parameters than transformers, they're naturally suited for deployment on devices with limited compute — phones, robots, IoT sensors, autonomous vehicles. The company has been explicit about targeting these use cases, positioning themselves not as another chatbot company but as the architecture provider for AI that runs everywhere. This is a fundamentally different market from the cloud-first approach of most AI labs. If your model can run meaningfully on a phone's neural processing unit without constant server calls, you unlock applications that are impossible with cloud-dependent AI: real-time robotics, offline processing, privacy-preserving inference on device. Liquid AI partnered with Qualcomm and other hardware vendors to optimize their models for specific chip architectures, a move that signals serious intent about the edge deployment story.

The Architecture Diversity Thesis

Liquid AI's existence is a bet on architectural diversity — the idea that transformers, despite their dominance, are not the final word in neural network design. This thesis has gained credibility as the limitations of transformers have become clearer: quadratic attention costs, difficulty with very long sequences, massive energy consumption during inference. The state-space model community (Mamba, RWKV, and others) has already proven that competitive alternatives exist; Liquid AI pushes further by arguing that biologically inspired dynamics offer advantages that even SSMs miss, particularly in temporal reasoning and adaptive behavior. Whether this is true at frontier scale remains unproven — the LFM-40B is competitive but not dominant against the best transformer models of comparable size — but the theoretical foundations are rigorous enough that the AI research community takes the work seriously.

Challenges and Skepticism

The obvious risk for Liquid AI is that the transformer ecosystem is enormously entrenched. The software stack (PyTorch, CUDA kernels, inference servers) is overwhelmingly optimized for transformer architectures. Every major cloud provider has spent billions building infrastructure tuned for attention-based models. Switching to a fundamentally different architecture means rebuilding tooling, retraining engineers, and convincing customers that the efficiency gains justify the transition costs. Liquid AI has addressed this partially by providing drop-in API compatibility — from the user's perspective, calling an LFM looks identical to calling any other model. But the deeper challenge is whether they can demonstrate a clear, sustained advantage at the scales that matter for enterprise adoption. With $250 million in funding and strong academic credentials, they have more runway than most architecture challengers. The next year will determine whether liquid neural networks become a real force in production AI or remain one of the most intellectually interesting footnotes in the field's history.
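"Drop-in compatibility" concretely means accepting the same request shape as existing chat APIs. The sketch below builds an OpenAI-style chat-completions request body; the endpoint URL and model id are hypothetical placeholders, since the point is only that swapping architectures need not change client code.

```python
import json

# Hypothetical endpoint -- a placeholder, not a real Liquid AI URL.
ENDPOINT = "https://api.example.com/v1/chat/completions"

def build_chat_request(model: str, user_message: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions request body.

    A drop-in-compatible server accepts exactly this shape, so moving
    from a transformer to an LFM is a one-line change of the model id.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }

# "lfm-3b" is an illustrative model id, not a confirmed API name.
body = build_chat_request("lfm-3b", "Explain liquid neural networks in one sentence.")
print(json.dumps(body, indent=2))
```

From the client's perspective, only the `model` field and the endpoint differ; the messages array, parameters, and response handling stay the same.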

Related Concepts

In The News

Liquid AI's 350M Model Outperforms 1B+ Models With Hybrid Architecture
Apr 01, 2026