Microsoft Introduces Native Audio Generation in Copilot With Multiple Expressive Voices

Microsoft’s latest audio generation in Copilot is powered by the homegrown MAI-Voice-1 AI model.

Written by Akash Dutta, Edited by Ketan Pratap | Updated: 12 September 2025 16:28 IST

Photo Credit: Microsoft

The new Copilot feature is available to all users of the platform via a personal account

Highlights

Native audio generation in Copilot is available in three different modes
It is currently part of the Copilot Labs experience
The three modes are Scripted, Emotive, and Story

Microsoft is adding another artificial intelligence (AI) feature to Copilot, giving it the ability to natively generate audio. On Wednesday, the Redmond-based tech giant announced a new audio generation feature that lets users hand Copilot a script and have it converted into an AI voiceover in different styles. Since the voice generation is native, none of the modes should sound like typical text-to-speech output. Notably, the company is powering this capability with its homegrown MAI-Voice-1 AI model.

Microsoft's Copilot Can Now Read Your Scripts Aloud

In a post on X (formerly known as Twitter), Mustafa Suleyman, CEO of Microsoft AI, announced the release of Copilot's new audio generation modes. He highlighted that these are powered by the MAI-Voice-1 AI model, which was released at the end of August. Currently, this experience is only available via Copilot Labs to users signed in with a personal account.

You asked, we shipped! Scripted mode just dropped for audio generation in Copilot Labs (c/o our new MAI-Voice-1 model).
Scripted mode: reads your input verbatim
Emotive: riffs a bit for max drama
Story: performs multiple voices/characters
Try out all 3 ➡️ https://t.co/9hL81LTFwF pic.twitter.com/rOVZKGbDjX
— Mustafa Suleyman (@mustafasuleyman) September 10, 2025

There are three modes to try out. The first is Scripted mode, where the AI chatbot reads out the input verbatim, without adding any flair or style. This mode is best suited to tasks such as formal announcements, document narration, and information presentation.

The second mode is dubbed Emotive. Suleyman says it is more focused on making the input sound dramatic and flashy. The voice here will include a wide range of intonation, pitch, and tone to deliver a performative piece. This is ideal for advertising, marketing, or informal narration.

Copilot's final audio generation mode is Story. This is the most versatile format, performing a script with multiple voices and characters. The company says this mode is ideal for storytelling, podcast-like presentations, and analysis-related tasks. The feature is currently free to use, and Microsoft has not mentioned any rate limits. It is unclear when the feature will be rolled out to the Copilot mobile and desktop apps.

Notably, at the time of its release, Microsoft said MAI-Voice-1 is a speech generation model that natively generates expressive, natural-sounding speech. It can generate a full minute of audio in under a second on a single GPU. In the same announcement, the tech giant said its separate MAI-1-preview text model was trained on around 15,000 Nvidia H100 GPUs.
