Microsoft Unveils VASA-1, an Image-to-Video AI Model That Generates Eerily Realistic Results

Microsoft’s VASA-1 AI video model can create videos with just one photo and an audio file.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 18 April 2024 14:06 IST
Highlights
  • VASA-1 generates videos of talking faces with realistic lip-syncing
  • Microsoft said it intends to create virtual characters using the AI model
  • The company does not plan to release a product or API with VASA-1

Microsoft’s VASA-1 model can generate videos at a resolution of 512 x 512 pixels at up to 40 FPS

Photo Credit: Microsoft

Microsoft has introduced a new artificial intelligence (AI) model that can generate hyper-realistic videos of talking human faces. Dubbed VASA-1, the image-to-video model can generate a video from just one photo and a speech audio clip. The company says the generated videos have lip movements synchronised to the audio, along with facial expressions and head movements that make them appear natural. Notably, the tech giant does not intend to release a product or API with the VASA-1 model and says it will be used to create realistic virtual characters.

In a post on its Research announcement page, Microsoft detailed the workings of its under-development AI model and highlighted its capabilities. The company claims that the VASA-1 model can generate videos at a resolution of 512 x 512 pixels at up to 40 FPS. The AI model is also said to support online video generation with negligible starting latency. X (formerly known as Twitter) user Kaio Ken shared a video of the AI model in action.

While the biggest achievement of VASA-1 is rendering high-quality videos of up to one minute in length (as per the demos) from a single static image, the company also highlighted its ability to generate lip movements that match the audio file and facial expressions to go along with them. The AI video generation model also gives the user granular control over different aspects of the video, such as main eye gaze direction, head distance, emotion offsets, and more. These attribute controls over disentangled appearance, 3D head pose, and facial dynamics can help tailor the output closely to the user's directions.
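To put the reported figures in perspective, the following is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (512 x 512 pixels, up to 40 FPS, demos of up to one minute). The function names are hypothetical and chosen for illustration, and nothing here reflects Microsoft's actual code or any VASA-1 interface.

```python
# Rough arithmetic on the reported VASA-1 specs.
# Assumption: these helpers are illustrative only, not part of any Microsoft API.

WIDTH, HEIGHT = 512, 512  # reported output resolution in pixels
FPS = 40                  # reported upper bound on frame rate
CLIP_SECONDS = 60         # longest clip length seen in the demos


def frame_budget_ms(fps: int) -> float:
    """Time available per frame if frames are produced in real time."""
    return 1000 / fps


def total_frames(fps: int, seconds: int) -> int:
    """Number of frames in a clip of the given length."""
    return fps * seconds


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Pixels that must be produced each second at the given frame rate."""
    return width * height * fps


if __name__ == "__main__":
    print(f"Per-frame budget: {frame_budget_ms(FPS):.1f} ms")
    print(f"Frames in a one-minute clip: {total_frames(FPS, CLIP_SECONDS)}")
    print(f"Pixels per second: {pixels_per_second(WIDTH, HEIGHT, FPS):,}")
```

In other words, sustaining 40 FPS in real time would leave about 25 milliseconds to produce each frame, and a one-minute clip at that rate would contain 2,400 frames.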


Further, the AI model was also able to generate videos using artistic photos, singing audio, and non-English speech. Microsoft researchers point out that such inputs were not present in its training data, suggesting that the model can generalise beyond what it was trained on.


The AI model's ability to generate hyper-realistic videos of real people from any audio is impressive, but it also raises concerns about misuse, especially for creating deepfakes. The company highlighted that it does not intend to release the AI model to the public and wants to use it to create interactive virtual characters.

Microsoft also said that this technique can be used for advancing forgery detection. “While acknowledging the possibility of misuse, it's imperative to recognize the substantial positive potential of our technique. The benefits – ranging from enhancing educational equity, improving accessibility for individuals with communication challenges, and offering companionship or therapeutic support to those in need – underscore the importance of our research and other related explorations. We are dedicated to developing AI responsibly, with the goal of advancing human well-being,” the company added.


© Copyright Red Pixels Ventures Limited 2026. All rights reserved.