Zhipu AI emerged in 2019 from the Knowledge Engineering Group (KEG) at Tsinghua University, one of China's most prestigious research institutions. The founders, led by CEO Zhang Peng and drawing on the work of Professor Tang Jie, had spent years building AMiner, an academic search and knowledge-graph system, along with other large-scale knowledge systems. They recognized early that the transformer revolution was about to make pure language models far more capable than traditional knowledge graphs, and spun out a company to commercialize that insight. This academic DNA sets Zhipu apart from China's other AI champions: while Baidu and Alibaba had vast engineering armies, Zhipu started with deep research credibility and a direct pipeline to Tsinghua's talent pool.
Zhipu's technical identity is built around the GLM (General Language Model) architecture, which differs from the standard GPT-style autoregressive approach. GLM is pre-trained with an autoregressive blank-infilling objective: spans of the input are masked out and then regenerated token by token, which combines the strengths of autoencoding (like BERT) and autoregressive (like GPT) pre-training in a single framework. ChatGLM, the company's conversational model, was one of the first Chinese LLMs to gain wide adoption among developers, partly because it was open-sourced early and ran well on consumer GPUs. ChatGLM-6B became something of a phenomenon in 2023, offering developers a bilingual Chinese-English model they could actually fine-tune on a single GPU. The GLM-4 generation, released in 2024, closed much of the gap with GPT-4 on Chinese-language tasks and introduced strong function-calling and long-context capabilities that made it viable for enterprise applications.
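To make the blank-infilling objective concrete, here is a minimal sketch of how a single training example could be assembled. It is a toy illustration in plain Python, not Zhipu's code: the function name, the special-token strings, and the 0-indexed positions are assumptions chosen for readability, and the random span sampling and span shuffling used in real pre-training are left out.

```python
from typing import Dict, List, Tuple

# Illustrative special tokens (assumed names, not taken from any tokenizer).
MASK, START, END = "[MASK]", "[S]", "[E]"


def build_blank_infilling_example(
    tokens: List[str], spans: List[Tuple[int, int]]
) -> Dict[str, list]:
    """Build one toy blank-infilling example.

    `spans` are half-open (start, end) index pairs, sorted and non-overlapping.
    """
    # Part A: the corrupted text, with each chosen span collapsed to one [MASK].
    part_a: List[str] = []
    mask_positions: List[int] = []
    cursor = 0
    for start, end in spans:
        part_a.extend(tokens[cursor:start])
        mask_positions.append(len(part_a))
        part_a.append(MASK)
        cursor = end
    part_a.extend(tokens[cursor:])

    # Part B: each span is fed as [S] + span and predicted as span + [E].
    part_b_inputs: List[str] = []
    targets: List[str] = []
    # Two position ids per token: where it sits in Part A, and its
    # offset inside the span being generated (0 for Part A tokens).
    pos_outer = list(range(len(part_a)))
    pos_inner = [0] * len(part_a)
    for (start, end), mask_pos in zip(spans, mask_positions):
        span = tokens[start:end]
        part_b_inputs += [START] + span
        targets += span + [END]
        pos_outer += [mask_pos] * (len(span) + 1)
        pos_inner += list(range(1, len(span) + 2))

    inputs = part_a + part_b_inputs
    n_a, n = len(part_a), len(inputs)
    # attention[i][j] is True when token i may attend to token j:
    # every token sees all of Part A; Part B tokens also see earlier Part B.
    attention = [[j < n_a or j <= i for j in range(n)] for i in range(n)]

    return {
        "inputs": inputs,
        "targets": targets,
        "pos_outer": pos_outer,
        "pos_inner": pos_inner,
        "attention": attention,
    }


example = build_blank_infilling_example(
    ["x1", "x2", "x3", "x4", "x5", "x6"], spans=[(2, 3), (4, 6)]
)
print(example["inputs"])
print(example["targets"])
```

In this example the spans `x3` and `x5 x6` are masked, so the model reads `x1 x2 [MASK] x4 [MASK]` with full bidirectional attention and then regenerates the missing spans left to right. The attention mask is what lets one model behave like an encoder over the context and like a decoder over the blanks.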
Where Zhipu really differentiates itself is in multimodal generation. CogView, its image generation model, was one of the earliest Chinese text-to-image systems to achieve competitive quality. CogVideo and its successor CogVideoX pushed into AI video generation, producing results that held up against commercial tools such as Runway and Pika at a fraction of the cost. By 2025, CogVideoX-5B had become one of the most capable open-source video generation models available, widely used by researchers and developers who needed video generation without paying per-clip API fees. This multimodal breadth, spanning text, image, video, and code generation under one roof, gives Zhipu an integrated platform story that few competitors can match.
Zhipu has attracted a who's-who of Chinese tech investment. A reported $341 million in funding during 2023 was followed by additional rounds that reportedly valued the company at over $3 billion by mid-2024. Investors include the local-services and food-delivery giant Meituan, the state-backed investment vehicle Zhongguancun Science City, and multiple other state-backed funds. This is not unusual in China's AI landscape, since the government's "AI+" strategy explicitly encourages state capital to flow into foundation model companies, but Zhipu's Tsinghua pedigree gives it a particular advantage in navigating Beijing's priorities. The company has been positioned as a national champion in the foundation model space, alongside Baidu with Ernie and Alibaba with Qwen, which brings both resources and expectations.
Zhipu's commercial strategy centers on its BigModel open platform, which offers API access to GLM models for enterprise customers, along with fine-tuning tools and an agent-building framework. The company has been particularly aggressive in the Chinese enterprise market, targeting sectors like finance, education, and government services where data sovereignty concerns make foreign AI providers a non-starter. It also operates a consumer-facing chatbot, Zhipu Qingyan, which competes with Baidu's Ernie Bot and Alibaba's Tongyi Qianwen. For the international AI community, Zhipu matters most as a source of high-quality open-source models; CogVideoX in particular has found a global audience that extends well beyond China's borders.