Google Opens Access to Gemini 2.5 Native Audio Dialog and Controllable Speech Generation in Preview

Google says native audio dialog with Gemini 2.5 will support more than 24 languages, and allow language mixing.

Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 4 June 2025 17:51 IST
Highlights
  • Both new features are powered by the Gemini 2.5 Flash model
  • The features are available to try out in Google AI Studio's stream and generate media tabs
  • TTS in Gemini 2.5 Flash allows multi-speaker dialogue generation

Google says all audio outputs from its models are embedded with SynthID

Photo Credit: Google

Google introduced new audio generation capabilities with the Gemini 2.5 models at Google I/O 2025. The Mountain View-based tech giant is now letting developers and individuals test these features on its platform. The two new capabilities are native audio dialog and controllable text-to-speech (TTS), both in preview with Gemini 2.5 Flash. The former natively generates human-like audio when responding to user prompts, while the latter can convert any script into conversational speech. These features are currently not available to developers via application programming interfaces (APIs).

Google Showcases Gemini 2.5 Flash's Audio Output Capabilities

In a blog post, the tech giant detailed the features of these two audio generation modes, highlighting how developers can use them to build new experiences for people. Currently, native audio dialog can be tried out in Google AI Studio's stream tab, whereas the TTS feature can be tested in the generate media tab within AI Studio.

Native audio dialog with Gemini 2.5 Flash preview is designed for real-time conversations between a human user and the AI. The user can either type a prompt or speak it, and the AI responds verbally. This process directly generates audio, instead of first generating text and then converting it into speech.

This approach has several advantages. It supports affective dialog, which means Gemini 2.5 Flash can recognise the emotion in the user's tone of voice. It can tell when the user sounds scared, angry, or surprised and respond accordingly.

Apart from this, the audio generation feature can express emotions when speaking, adopt different accents and linguistic styles, and access tools such as Google Search. It also supports more than 24 languages.

The controllable TTS feature offers multi-speaker dialogue generation, can convey emotions and accents while narrating a script, and lets users control delivery speed and emphasise pronunciation. It supports the same set of more than 24 languages, along with language mixing.

Google says these capabilities were assessed for potential risks throughout the development process. The company used both internal mechanisms and red teaming to find and fix vulnerabilities. It also highlighted that all audio outputs from these models are embedded with SynthID, its watermarking technology.

 
