Google Opens Access to Gemini 2.5 Native Audio Dialog and Controllable Speech Generation in Preview

Google says native audio dialog with Gemini 2.5 will support more than 24 languages, and allow language mixing.

Advertisement
Written by Akash Dutta, Edited by Siddharth Suvarna | Updated: 4 June 2025 17:51 IST
Highlights
  • Both new features are powered by the Gemini 2.5 Flash model
  • The features are available to try out in AI Studio and Streams platforms
  • TTS in Gemini 2.5 Flash allows multi-speaker dialogue generation

Google says all audio outputs from its models are embedded with SynthID

Photo Credit: Google

Google introduced new audio generation capabilities with the Gemini 2.5 models at the Google I/O 2025. The Mountain View-based tech giant is now letting developers and individuals test these features on its platform. The two new capabilities include native audio dialog and controllable text-to-speech (TTS) with Gemini 2.5 Flash preview. While the former can natively generate human-like audio while responding to user prompts, the latter can convert any script into conversational speech. These features are currently not available to developers via application programming interfaces (APIs).

Google Showcases Gemini 2.5 Flash's Audio Output Capabilities

In a blog post, the tech giant detailed the features of these two audio generation modes, highlighting how developers can use them to build new experiences for people. Currently, native audio dialog can be tried out in Google AI Studio's stream tab, whereas the TTS feature can be tested in the generate media tab within AI Studio.

Advertisement

Native audio dialog with Gemini 2.5 Flash preview is designed for real-time conversations between a human user and the AI. The user can either type a prompt or speak it, and the AI responds verbally. This process directly generates audio, instead of first generating text and then converting it into speech.

There are several advantages to that as well. It supports affective dialog, which means when Gemini 2.5 Flash responds to the user's tone of voice, it can recognise the emotion behind the said words. It can understand when the user sounds scared, angry, or surprised and respond accordingly.

Advertisement

Apart from this, the audio generation feature can express emotions when speaking, adopt different accents and linguistic styles, can access tools such as Google Search, and supports more than 24 languages.

Coming to the controllable TTS feature, it offers multi-speaker dialogue generation, can produce emotions and accents while narrating the script, control delivery speed and emphasise pronunciation, and supports the same 24 languages and language mixing.

Advertisement

Google says these capabilities were assessed for potential risks across the development process. The company used both internal mechanisms as well as red teaming to find and fix any vulnerabilities. The company also highlighted that all audio outputs from these models are embedded with SynthID, its watermarking technology.

 

Get your daily dose of tech news, reviews, and insights, in under 80 characters on Gadgets 360 Turbo. Connect with fellow tech lovers on our Forum. Follow us on X, Facebook, WhatsApp, Threads and Google News for instant updates. Catch all the action on our YouTube channel.

Further reading: Google, Gemini, AI, Artificial Intelligence
Advertisement

Related Stories

Popular Mobile Brands
  1. Shift Up Comments on Design of Stellar Blade: Blood Rain's New Protagonist
  2. Redmi Note 17 Appears on Certification Website Ahead of Anticipated Debut
  3. Here's Everything That Was Announced at Nintendo Direct
  4. Realme P4R 5G Launched in India With an 8,000mAh Battery
  5. Resurrection on OTT: Streaming Details, Plot, Cast, and Reception of the Sci-Fi Epic
  6. Lava Bold N2 5G Goes on Sale in India With 6,000mAh Battery: Price, Offers
  7. Bitcoin Drops Below $61,300 as Crypto Investors Remain Cautious
  1. Samsung Galaxy S27 Listed on GSMA Database With Model Number Several Months Ahead of Anticipated Release: Report
  2. Bitcoin Drops Below $61,300 as Investors Remain Cautious Ahead of US Inflation Data
  3. Google Rolls Out Gemini 3.5 Live Translate for Real-Time Multilingual Conversations
  4. Resurrection OTT Release: Where to Watch Bi Gan’s Sci-Fi Fantasy Film Online
  5. Bhooth Bangla OTT Release Date Confirmed: When and Where to Watch Akshay Kumar’s Horror-Comedy Online?
  6. Realme P4R 5G Launched in India With an 8,000mAh Battery, MediaTek Dimensity 6300 SoC: Price, Features
  7. Meta Reportedly Directed to Offer Free WhatsApp Access to Rival AI Chatbots in the EU
  8. Opera for Android Gets New Home Screen With Google AI Mode Shortcut, Football Hub Ahead of FIFA World Cup 2026
  9. Vivo X Fold 6 Leak Reveals 'Atomic Workbench' and Gesture-Based Multitasking Controls
  10. No Tech Rule Exemption for Apple, EU Regulators Say Amid Spat Over Siri AI Delay
Download Our Apps
Available in Hindi
© Copyright Red Pixels Ventures Limited 2026. All rights reserved.