Alibaba Releases Open-Source Wan 2.1 Suite of AI Video Generation Models, Claimed to Outperform OpenAI’s Sora

Alibaba’s Wan 2.1 T2V-1.3B video model can generate a 5-second 480p video using the Nvidia RTX 4090 in four minutes.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 27 February 2025 16:18 IST
Highlights
  • Alibaba’s Wan 2.1 supports Chinese and English text prompts
  • It can generate videos using both text and image inputs
  • The team used a new 3D causal VAE architecture for the models

The open-source Wan 2.1 video models are available with the Apache 2.0 license

Alibaba released a suite of artificial intelligence (AI) video generation models on Wednesday. Dubbed Wan 2.1, these are open-source models that can be used for both academic and commercial purposes. The Chinese e-commerce giant released the models in several variants that differ in parameter count. Developed by the company's Wan team, the models were first introduced in January, when the company claimed that Wan 2.1 can generate highly realistic videos. The models are currently hosted on the AI and machine learning (ML) hub Hugging Face.

Alibaba Introduces Wan 2.1 Video Generation Models

The new Alibaba video AI models are hosted on the Hugging Face page of Alibaba's Wan team. The model pages also detail the Wan 2.1 suite of video generation models. There are four models in total: T2V-1.3B, T2V-14B, I2V-14B-720P, and I2V-14B-480P. T2V is short for text-to-video, while I2V stands for image-to-video.

The researchers claim that the smallest variant, Wan 2.1 T2V-1.3B, can run on a consumer-grade GPU with as little as 8.19GB of VRAM. As per the post, the AI model can generate a five-second 480p video on an Nvidia RTX 4090 in about four minutes.
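
To put the quoted figures in perspective, the short Python sketch below works out how much slower than real time that generation speed is. It uses only the two numbers reported above (a five-second clip and roughly four minutes of compute); the function name is hypothetical and chosen for illustration.

```python
def slowdown_factor(clip_seconds, generation_seconds):
    """Return how many seconds of compute are needed per second of video."""
    return generation_seconds / clip_seconds


# Figures as reported in the article: a 5-second clip in about 4 minutes.
clip_seconds = 5
generation_seconds = 4 * 60

factor = slowdown_factor(clip_seconds, generation_seconds)
print(f"About {factor:.0f} seconds of compute per second of video")
```

By this rough measure, the smallest model needs about 48 seconds of RTX 4090 time for every second of 480p footage.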

While the Wan 2.1 suite is aimed at video generation, it is also said to be capable of other functions such as image generation, video-to-audio generation, and video editing. However, the currently open-sourced models are not capable of these advanced tasks. For video generation, the models accept text prompts in Chinese and English as well as image inputs.

Coming to the architecture, the researchers revealed that the Wan 2.1 models are built on a diffusion transformer architecture. The company has, however, extended the base architecture with a new variational autoencoder (VAE), new training strategies, and more.

Most notably, the AI models use a new 3D causal VAE architecture dubbed Wan-VAE. According to the researchers, it improves spatiotemporal compression and reduces memory usage. The autoencoder is said to encode and decode 1080p videos of unlimited length without losing historical temporal information, which enables consistent video generation.
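
For readers unfamiliar with the term, "causal" here means that the representation of a frame depends only on that frame and the ones before it, never on later frames. The Python sketch below illustrates that property with a toy one-dimensional convolution along the time axis. It is an illustration of the general idea only, not Wan-VAE's actual implementation; the function name, the per-frame values, and the kernel are all made up for the example.

```python
def causal_conv1d(frames, kernel):
    """Toy causal convolution over time.

    output[t] is computed from frames[t - k + 1 .. t] only, where k is the
    kernel length. The sequence is padded on the left with zeros, so no
    output ever looks at a future frame.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(frames)  # pad the past, never the future
    return [
        sum(kernel[j] * padded[t + j] for j in range(k))
        for t in range(len(frames))
    ]


# One hypothetical scalar value per frame, and a 2-tap averaging kernel.
kernel = [0.5, 0.5]
original = causal_conv1d([1.0, 2.0, 3.0, 4.0], kernel)
changed_last = causal_conv1d([1.0, 2.0, 3.0, 100.0], kernel)

# Changing the final frame leaves every earlier output untouched.
print(original[:3] == changed_last[:3])
```

Because earlier outputs never depend on later frames, a model built this way can in principle process a long video piece by piece while carrying forward information from the past, which is consistent with the unlimited-length claim above.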

Based on internal testing, the company claimed that the Wan 2.1 models outperform OpenAI's Sora AI model in consistency, scene generation quality, single object accuracy, and spatial positioning.

These models are available under the Apache 2.0 licence, which permits both academic and commercial use, subject to the licence's conditions such as retaining copyright and licence notices.

 
