Microsoft Unveils VASA-1, an Image-to-Video AI Model That Generates Eerily Realistic Results

Microsoft’s VASA-1 AI video model can create videos with just one photo and an audio file.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 18 April 2024 14:06 IST
Highlights
  • VASA-1 generates videos of talking faces with realistic lip-syncing
  • Microsoft said it intends to create virtual characters using the AI model
  • The company does not plan to release a product or API with VASA-1

Microsoft’s VASA-1 model can generate videos at 512 x 512 resolution and up to 40 FPS

Photo Credit: Microsoft

Microsoft has introduced a new artificial intelligence (AI) model that can generate hyper-realistic videos of talking human faces. Dubbed VASA-1, the AI image-to-video model can generate videos from just one photo and a speech audio clip. The company says the generated videos will have lip movements synchronised to the audio, as well as facial expressions and head movements that make them appear natural. Notably, the tech giant does not intend to release a product or API with the VASA-1 model, and says the research is aimed at creating realistic virtual characters.

In a post on its Research announcement page, Microsoft detailed the workings of its under-development AI model and highlighted its capabilities. The company claims that the VASA-1 model can generate videos at 512 x 512 resolution and up to 40 FPS. The AI model is also said to support online video generation with negligible starting latency. X (formerly known as Twitter) user Kaio Ken shared a video of the AI model in action.

While the biggest achievement of VASA-1 is rendering high-quality videos of up to one minute (as per the demos) from a single static image, the company also highlighted its ability to generate lip movements that match the audio file, along with facial expressions to go with them. The AI video generation model also gives the user granular control over different aspects of the video, such as main eye gaze direction, head distance, emotion offsets, and more. These attribute controls over disentangled appearance, 3D head pose, and facial dynamics can help tailor the output closely to the user's directions.

Further, the AI model was also able to generate videos using artistic photos, singing audio, and non-English speech. Microsoft researchers point out that these types of inputs were not present in its training data, hinting at its ability to generalise beyond what it was trained on.

The AI model's ability to generate hyper-realistic videos of real people with any audio is impressive, but it also raises concerns about misuse, especially the creation of deepfakes. The company highlighted that it does not intend to release the AI model to the public and wants to use it to create interactive virtual characters.

Microsoft also said that this technique can be used for advancing forgery detection. “While acknowledging the possibility of misuse, it's imperative to recognize the substantial positive potential of our technique. The benefits – ranging from enhancing educational equity, improving accessibility for individuals with communication challenges, and offering companionship or therapeutic support to those in need – underscore the importance of our research and other related explorations. We are dedicated to developing AI responsibly, with the goal of advancing human well-being,” the company added.


© Copyright Red Pixels Ventures Limited 2026. All rights reserved.