Companies

Vidu

Also known as: Vidu video generation, long-form coherent video
Video generation platform from Shengshu Technology, producing some of the most physically coherent AI-generated video available. It gained attention for strong motion quality and multi-shot consistency that rival Western competitors.

Why it matters

Vidu demonstrated that Chinese AI labs could match Western video generation quality within months of Sora's reveal, reshaping assumptions about where the frontier of AI video actually lies. Its focus on physical coherence and multi-shot consistency pushed the whole field forward, pressuring competitors to prioritize realism over visual flair. For the broader AI video market, Vidu's aggressive pricing and API availability also helped lower costs and widen access for developers worldwide.

Deep Dive

Vidu emerged from Shengshu Technology, a Beijing-based startup founded in 2023 by a team of researchers with deep roots in Tsinghua University's AI labs. The company's co-founder, Zhu Jun, had spent years working on generative models at Tsinghua before moving into commercialization. From the start, Shengshu positioned Vidu not as a general-purpose AI play but as a focused video generation engine, a bet that the next frontier in generative AI was moving pictures rather than still images. Its first public demo, in April 2024, turned heads in the Chinese tech press: it came roughly two months after OpenAI's Sora reveal and showed that Chinese labs were not far behind.

The technology

What set Vidu apart from day one was its emphasis on physical coherence. Many early video generation models produced dreamlike, fluid results that fell apart when objects interacted with each other. Vidu's outputs showed a notably better grasp of physics: objects had weight, shadows moved correctly, and camera motion felt intentional rather than random. The underlying architecture uses a diffusion transformer approach, trained on large-scale video datasets that Shengshu assembled partly through partnerships with Chinese content platforms. The models support multi-shot generation with consistent characters, a feature that moved Vidu from a novelty tool toward something creators could actually use for short-form storytelling.

Market positioning and competition

Vidu competes on two fronts. In China, its rivals include Kling (from Kuaishou), Wan (from Alibaba), and a handful of other well-funded efforts. Internationally, it goes up against Runway, Luma, and Pika. Shengshu has pursued an API-first strategy alongside its consumer-facing product, making Vidu available to developers who build on top of video generation. Pricing has been aggressive, undercutting Western competitors while offering comparable or better quality on many benchmarks. The company raised significant funding in 2024, reportedly at a valuation exceeding $300 million, with backing from Zhipu AI and other notable Chinese investors.

What comes next

Shengshu has been pushing Vidu toward longer-form generation, higher resolutions, and better controllability, the three axes that matter most for professional use. It has also invested in image-to-video and video-to-video capabilities, recognizing that most real workflows start with reference material rather than text prompts alone. The broader question for Vidu is whether it can break through internationally despite the geopolitical headwinds facing Chinese AI companies, or whether it will remain primarily a domestic powerhouse. Either way, the technical quality of its output has earned it a seat at the table in the global AI video conversation.
