Microsoft Unveils VASA-1, an Image-to-Video AI Model That Generates Eerily Realistic Results

Microsoft’s VASA-1 AI video model can create videos with just one photo and an audio file.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 18 April 2024 14:06 IST
Highlights
  • VASA-1 generates videos of talking faces with realistic lip-syncing
  • Microsoft said it intends to create virtual characters using the AI model
  • The company does not plan to release a product or API with VASA-1

Microsoft’s VASA-1 model can generate videos at 512 x 512 resolution at up to 40 FPS

Photo Credit: Microsoft

Microsoft has introduced a new artificial intelligence (AI) model that can generate hyper-realistic videos of talking human faces. Dubbed VASA-1, the image-to-video model can generate videos from just one photo and a speech audio clip. The company says the generated videos have lip movements synchronised to the audio, along with facial expressions and head movements that make them appear natural. Notably, the tech giant does not intend to release a product or API based on VASA-1, and says the model will be used to create realistic virtual characters.

In a post on its Research announcement page, Microsoft detailed the workings of its under-development AI model and highlighted its capabilities. The company claims that the VASA-1 model can generate videos at 512 x 512 resolution at up to 40 FPS. The AI model is also said to support online video generation with negligible starting latency. X (formerly known as Twitter) user Kaio Ken shared a video of the AI model in action.

While the biggest achievement of VASA-1 is rendering high-quality videos up to one minute long (as per the demos) from a single static image, the company also highlighted its ability to generate lip movements that match the audio file, with facial expressions to go along with them. The model also gives the user granular control over different aspects of the video, such as main eye gaze direction, head distance, emotion offsets, and more. These attribute controls over disentangled appearance, 3D head pose, and facial dynamics can help tailor the output closely to the user's directions.

Further, the AI model was also able to generate videos using artistic photos, singing audio, and non-English speech. Microsoft researchers point out that these types of input were not present in its training data, suggesting that the model can generalise beyond what it was trained on.

The AI model's ability to generate hyper-realistic videos of real people from any audio is impressive, but it also raises concerns about misuse, especially for creating deepfakes. The company highlighted that it does not intend to release the AI model to the public and wants to use it to create interactive virtual characters.

Microsoft also said that this technique can be used for advancing forgery detection. “While acknowledging the possibility of misuse, it's imperative to recognize the substantial positive potential of our technique. The benefits – ranging from enhancing educational equity, improving accessibility for individuals with communication challenges, and offering companionship or therapeutic support to those in need – underscore the importance of our research and other related explorations. We are dedicated to developing AI responsibly, with the goal of advancing human well-being,” the company added.

