Company

Vidu

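Also known as: Vidu video generation, long-form coherent video
Shengshu Technology's video generation platform, which produces some of the most physically coherent AI-generated video available. It has drawn attention for strong motion quality and for multi-shot consistency that rivals Western competitors.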

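Why it matters

Vidu showed that Chinese AI labs could catch up to Western video generation quality within months of Sora's reveal, reshaping assumptions about where the frontier of AI video actually lies. Its focus on physical coherence and multi-shot consistency pushed the field forward, pressuring competitors to prioritize realism over visual flash. For the broader AI video market, Vidu's aggressive pricing and API availability have also helped lower costs and widen access for developers worldwide.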

Deep Dive

Vidu emerged from Shengshu Technology, a Beijing-based startup founded in 2023 by researchers with deep roots in Tsinghua University's AI labs. Co-founder Zhu Jun had spent years working on generative models at Tsinghua before moving to commercialization. From the start, Shengshu positioned Vidu not as a general-purpose AI play but as a focused video generation engine, a bet that the next frontier in generative AI was moving pictures rather than still images. Its first public demo, in April 2024, turned heads in the Chinese tech press: it came about two months after OpenAI's Sora reveal and suggested that Chinese labs were not far behind.

The technology

What set Vidu apart from day one was its emphasis on physical coherence. Many early video generation models produced dreamlike, fluid results that fell apart when objects interacted with each other. Vidu's outputs showed a notably better grasp of physics: objects had weight, shadows moved correctly, and camera motion felt intentional rather than random. The underlying architecture uses a diffusion transformer approach, trained on large-scale video datasets that Shengshu assembled partly through partnerships with Chinese content platforms. The models support multi-shot generation with consistent characters, a feature that moved Vidu from a novelty tool toward something creators could actually use for short-form storytelling.
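To give a rough sense of scale for a diffusion transformer applied to video, the sketch below counts how many spacetime patches (tokens) a clip would be split into and how the cost of full self-attention grows with that count. The clip dimensions, patch sizes, and function names are all hypothetical, chosen only for illustration; none of them describe Vidu's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipShape:
    """Dimensions of a video clip (or of its latent representation)."""
    frames: int
    height: int
    width: int


def spacetime_token_count(clip: ClipShape, patch_t: int, patch_h: int, patch_w: int) -> int:
    """Number of non-overlapping spacetime patches (tokens) in the clip.

    Assumes each dimension is an exact multiple of its patch size.
    """
    if clip.frames % patch_t or clip.height % patch_h or clip.width % patch_w:
        raise ValueError("clip dimensions must be divisible by the patch sizes")
    return (clip.frames // patch_t) * (clip.height // patch_h) * (clip.width // patch_w)


def attention_pairs(tokens: int) -> int:
    """Token pairs compared by full self-attention, which grows quadratically."""
    return tokens * tokens


# Hypothetical numbers, not taken from any real model.
short_clip = ClipShape(frames=16, height=32, width=32)
long_clip = ClipShape(frames=32, height=32, width=32)  # twice as many frames

short_tokens = spacetime_token_count(short_clip, patch_t=1, patch_h=2, patch_w=2)
long_tokens = spacetime_token_count(long_clip, patch_t=1, patch_h=2, patch_w=2)

print(short_tokens, long_tokens)
print(attention_pairs(long_tokens) // attention_pairs(short_tokens))
```

Under these assumptions, doubling the clip length doubles the token count but quadruples the number of attention pairs, which is one reason longer and higher-resolution generation is costly for transformer-based video models.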

Market positioning and competition

Vidu competes on two fronts in the AI video landscape. In China, it faces Kling (from Kuaishou), Wan (from Alibaba), and several other well-funded efforts. Internationally, it goes up against Runway, Luma, and Pika. Shengshu has pursued an API-first strategy alongside its consumer-facing product, making Vidu available to developers who build on top of video generation. Pricing has been aggressive, undercutting Western competitors while offering comparable or better quality on many benchmarks. The company raised significant funding in 2024, reportedly at a valuation exceeding $300 million, with backing from Zhipu AI and other notable Chinese investors.

What comes next

Shengshu has been pushing Vidu toward longer-form generation, higher resolutions, and better controllability, the three axes that matter most for professional use. It has also invested in image-to-video and video-to-video capabilities, recognizing that most real workflows start with reference material rather than text prompts alone. The broader question for Vidu is whether it can break through internationally despite the geopolitical headwinds facing Chinese AI companies, or whether it will remain primarily a domestic powerhouse. Either way, the technical quality of its output has earned it a seat at the table in the global AI video conversation.
