Vidu emerged from Shengshu Technology, a Beijing-based startup founded in 2023 by a team of researchers with deep roots in Tsinghua University's AI labs. The company's co-founder, Zhu Jun, had spent years working on generative models at Tsinghua before making the leap to commercialization. From the start, Shengshu positioned Vidu not as a general-purpose AI play but as a focused video generation engine, a bet that the next frontier in generative AI was moving pictures, not still images. Their first public demo in April 2024 turned heads in the Chinese tech press, coming about two months after OpenAI's Sora reveal and demonstrating that Chinese labs were not far behind.
What set Vidu apart from day one was its emphasis on physical coherence. While many early video generation models produced dreamlike, fluid results that fell apart when objects interacted with each other, Vidu's outputs showed a notably better grasp of physics: objects had weight, shadows moved correctly, and camera motion felt intentional rather than random. The underlying architecture, which Shengshu calls U-ViT, combines diffusion with a transformer backbone and is trained on large-scale video datasets that Shengshu assembled partly through partnerships with Chinese content platforms. Their models support multi-shot generation with consistent characters, a feature that moved Vidu from a novelty tool toward something creators could actually use for short-form storytelling.
Vidu occupies an interesting position in the AI video landscape. In China, it competes with Kling (from Kuaishou), Wan (from Alibaba), and a handful of other well-funded efforts. Internationally, it goes up against Runway, Luma, and Pika. Shengshu has pursued an API-first strategy alongside its consumer-facing product, making Vidu available to developers building on top of video generation. Pricing has been aggressive, undercutting Western competitors while offering comparable or better quality on many benchmarks. The company raised significant funding in 2024, reportedly at a valuation exceeding $300 million, with backing from Zhipu AI and other notable Chinese investors.
Shengshu has been pushing Vidu toward longer-form generation, higher resolutions, and better controllability — the three axes that matter most for professional use. They have also invested in image-to-video and video-to-video capabilities, recognizing that most real workflows start with reference material rather than text prompts alone. The broader question for Vidu is whether it can break through internationally despite the geopolitical headwinds facing Chinese AI companies, or whether it will remain primarily a domestic powerhouse. Either way, the technical quality of their output has earned them a seat at the table in the global AI video conversation.