Microsoft Releases New AI Models That Can Generate Images and Audio, and Transcribe Speech

Microsoft has released MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 AI models.

Written by Akash Dutta, Edited by Ketan Pratap | Updated: 3 April 2026 18:44 IST
Highlights
  • These models are available via Microsoft Foundry and MAI Playground
  • MAI-Transcribe-1 is said to outperform Google and OpenAI’s models
  • Voice-1 can generate realistic speech with an emotional range

The Image-2 model is being rolled out to Copilot, Bing, and PowerPoint

Photo Credit: Microsoft

Microsoft released three specialised artificial intelligence (AI) models on Thursday, focusing on image generation, voice generation, and speech-to-text transcription. The Redmond-based tech giant claims that these models outperform specialised models from rival companies, including Google and OpenAI. The models, MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2, are also said to focus on fast generation and competitive pricing. They are currently available via Microsoft Foundry and are also being rolled out to various consumer products.

Microsoft Brings Three New AI Models

In a newsroom post, the tech giant introduced the three new AI models. All of them are currently available via Microsoft Foundry and the MAI Playground. The biggest highlight is MAI-Transcribe-1, which the company claims delivers state-of-the-art (SOTA) speech-to-text transcription across the 25 most-used languages.


The claims are based on Microsoft's internal testing on the FLEURS benchmark, where MAI-Transcribe-1 is said to record a lower error rate than Gemini 3.1 Flash and GPT-Transcribe. Additionally, the company says the model offers Foundry users the “best price-performance of any large cloud provider.”
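Microsoft's post, as reported here, does not specify which error metric was used. Speech-to-text benchmarks commonly report word error rate (WER), so the sketch below assumes that metric purely for illustration: it counts the word-level substitutions, insertions, and deletions needed to turn a model's transcript into the reference transcript, then divides by the number of reference words. The function name and sample sentences are hypothetical and are not taken from Microsoft or from FLEURS.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences only: one of six reference words is dropped.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

A lower value means fewer transcription mistakes, which is the sense in which one model would outperform another on such a benchmark.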

Coming to MAI-Voice-1, the model is said to generate “natural, realistic speech, rich with nuance, emotional range, and expression.” It is also said to maintain consistent speech and voice identity during long-form content generation. Inside Foundry, the model will also allow users to create a custom voice from a few seconds of audio.


Microsoft claims that the custom voice creation process is safe and secure. MAI-Voice-1 is said to generate 60 seconds of audio in a single second. Notably, the AI model will also power Copilot Audio Expressions and Copilot Podcasts.

Finally, the MAI-Image-2 model builds on the capabilities of its predecessor and is said to deliver improved output quality at a faster speed. Microsoft revealed that the model was created in collaboration with photographers, designers, and visual storytellers, and it focuses on natural lighting, accurate textures, and clear in-image text. Notably, WPP is among the first enterprise partners to have adopted the AI model.


Like the other two models, MAI-Image-2 will be available via Microsoft Foundry and the MAI Playground. It is also rolling out to Copilot, Bing, and PowerPoint.

 
