Microsoft Unveils VASA-1, an Image-to-Video AI Model That Generates Eerily Realistic Results

Microsoft’s VASA-1 AI video model can create videos from just one photo and an audio file.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 18 April 2024 14:06 IST
Highlights
  • VASA-1 generates videos of talking faces with realistic lip-syncing
  • Microsoft said it intends to create virtual characters using the AI model
  • The company does not plan to release a product or API with VASA-1

Microsoft’s VASA-1 model can generate videos at 512 x 512 resolution at up to 40 FPS

Photo Credit: Microsoft

Microsoft has introduced a new artificial intelligence (AI) model that can generate hyper-realistic videos of talking human faces. Dubbed VASA-1, the image-to-video model can generate videos from just one photo and a speech audio clip. The company says the generated videos feature lip movements synchronised with the audio, along with facial expressions and head movements that make them appear natural. Notably, the tech giant does not intend to release a product or API with the VASA-1 model, and says it will be used to create realistic virtual characters.

In a post on its Research announcement page, Microsoft detailed the workings of its under-development AI model and highlighted its capabilities. The company claims that the VASA-1 model can generate videos at 512 x 512 resolution at up to 40 FPS. The AI model is also said to support online video generation with negligible starting latency. X (formerly known as Twitter) user Kaio Ken shared a video of the AI model in action.

While the biggest achievement of VASA-1 is rendering high-quality videos of up to one minute (as per the demos) from a single static image, the company also highlighted its ability to generate lip movements that match the audio file and facial expressions to go along with them. The model also gives users granular control over different aspects of the video, such as main eye gaze direction, head distance, emotion offsets, and more. These attribute controls over disentangled appearance, 3D head pose, and facial dynamics let users shape the output closely to their directions.


Further, the AI model was also able to generate videos using artistic photos, singing audio, and non-English speech. Microsoft researchers point out that none of these input types were present in its training data, hinting at its ability to generalise beyond that data.


The AI model's hyper-realistic video generation of real people with any audio is impressive, but it also raises concerns about misuse, especially the creation of deepfakes. The company said that it does not intend to release the AI model to the public and wants to use it to create interactive virtual characters.

Microsoft also said that this technique can be used for advancing forgery detection. “While acknowledging the possibility of misuse, it's imperative to recognize the substantial positive potential of our technique. The benefits – ranging from enhancing educational equity, improving accessibility for individuals with communication challenges, and offering companionship or therapeutic support to those in need – underscore the importance of our research and other related explorations. We are dedicated to developing AI responsibly, with the goal of advancing human well-being,” the company added.

