Wan-AI is not an independent startup. It is Alibaba's dedicated push into video generation, operating under the Tongyi research umbrella (an outgrowth of DAMO Academy) in Hangzhou. The initiative launched in 2024 as Alibaba recognized that open-weights video models could do for video generation what Qwen had done for large language models: establish Alibaba as the go-to provider for developers who want state-of-the-art capabilities without vendor lock-in. The Wan models were released on Hugging Face and ModelScope under permissive licenses, making them among the most accessible high-quality video generation models available.
Alibaba's decision to release Wan as open-weights was strategic, not charitable. By making powerful video models freely available, it created an ecosystem of developers, researchers, and businesses building on Alibaba's technology stack. This drives traffic to Alibaba Cloud, increases mindshare in the developer community, and positions Alibaba as the default infrastructure provider for video AI workloads across Asia and beyond. The Wan models come in multiple sizes, from lightweight versions that can run on consumer GPUs to larger variants that aim to rival the best closed-source offerings, so developers can choose based on their compute budget and quality requirements.
The Wan model family uses a diffusion transformer architecture. The publicly released Wan 2.1 checkpoints condition generation on a multilingual T5 text encoder rather than a Qwen-derived one; Alibaba's Qwen models enter the official tooling as an optional prompt-expansion step that rewrites short prompts before generation. The results are particularly strong in prompt adherence and scene composition, areas where many video models struggle. Wan supports text-to-video, image-to-video, and video-to-video generation, and because the weights are open, the community has rapidly built LoRA fine-tunes, custom workflows in ComfyUI, and specialized adaptations for everything from anime to architectural visualization. This ecosystem effect is arguably more valuable than the base model itself.
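For readers who want to try the text-to-video mode described above, here is a minimal sketch using the Hugging Face diffusers library. It assumes a recent diffusers release that ships `WanPipeline` and `AutoencoderKLWan`, a CUDA GPU, and the `Wan-AI/Wan2.1-T2V-1.3B-Diffusers` checkpoint. The helper names `build_t2v_args` and `generate`, the default resolution, the guidance scale, and the frame rate are illustrative choices for this article, not settings prescribed by Alibaba.

```python
def build_t2v_args(prompt, height=480, width=832, num_frames=81):
    """Collect keyword arguments for a text-to-video call (hypothetical helper)."""
    # Assumption: frame counts of the form 4n + 1 and spatial sizes divisible
    # by 16 fit the model's compressed latent space; adjust if your release differs.
    if (num_frames - 1) % 4 != 0:
        raise ValueError("num_frames should be of the form 4n + 1")
    if height % 16 != 0 or width % 16 != 0:
        raise ValueError("height and width should be multiples of 16")
    return {
        "prompt": prompt,
        "height": height,
        "width": width,
        "num_frames": num_frames,
        "guidance_scale": 5.0,  # illustrative value, tune per prompt
    }


def generate(prompt, out_path="wan_t2v.mp4",
             model_id="Wan-AI/Wan2.1-T2V-1.3B-Diffusers"):
    """Download the checkpoint, run text-to-video once, and save an MP4."""
    # Heavy imports are deferred so build_t2v_args works without a GPU stack.
    import torch
    from diffusers import AutoencoderKLWan, WanPipeline
    from diffusers.utils import export_to_video

    # The VAE is kept in float32 while the transformer runs in bfloat16.
    vae = AutoencoderKLWan.from_pretrained(
        model_id, subfolder="vae", torch_dtype=torch.float32
    )
    pipe = WanPipeline.from_pretrained(
        model_id, vae=vae, torch_dtype=torch.bfloat16
    )
    pipe.to("cuda")

    frames = pipe(**build_t2v_args(prompt)).frames[0]
    export_to_video(frames, out_path, fps=16)
    return out_path
```

Calling `generate("a paper boat drifting down a rainy street")` would download the weights on first use and write the clip to `wan_t2v.mp4`. The smallest checkpoint is used here because it is the variant most likely to fit on a consumer GPU; swapping `model_id` for a larger variant trades memory for quality.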
Wan sits at the intersection of two competitive battles. In the open-weights video space, it competes with Stability AI's video models and various community efforts. In the broader Chinese AI video market, it competes with Kling, Vidu, and others, though Alibaba's approach is fundamentally different because the model is the marketing, not the product. The real product is Alibaba Cloud compute. This positioning means Wan can afford to be more generous with model releases than standalone startups that need to monetize the model directly, giving it a structural advantage in the open-weights race that is difficult for smaller players to match.