Google Opens Access to Gemini 2.5 Native Audio Dialog and Controllable Speech Generation in Preview

Google says native audio dialog with Gemini 2.5 will support more than 24 languages, and allow language mixing.

Highlights
  • Both new features are powered by the Gemini 2.5 Flash model
  • The features are available to try out in Google AI Studio's stream and generate media tabs
  • TTS in Gemini 2.5 Flash allows multi-speaker dialogue generation
Google says all audio outputs from its models are embedded with SynthID

Photo Credit: Google

Google introduced new audio generation capabilities for the Gemini 2.5 models at Google I/O 2025. The Mountain View-based tech giant is now letting developers and individuals test these features on its platform. The two new capabilities are native audio dialog and controllable text-to-speech (TTS), both in preview with Gemini 2.5 Flash. The former natively generates human-like audio when responding to user prompts, while the latter can convert any script into conversational speech. These features are currently not available to developers via application programming interfaces (APIs).

Google Showcases Gemini 2.5 Flash's Audio Output Capabilities

In a blog post, the tech giant detailed the features of these two audio generation modes, highlighting how developers can use them to build new experiences for people. Currently, native audio dialog can be tried out in Google AI Studio's stream tab, whereas the TTS feature can be tested in the generate media tab within AI Studio.

Native audio dialog with Gemini 2.5 Flash preview is designed for real-time conversations between a human user and the AI. The user can either type a prompt or speak it, and the AI responds verbally. This process directly generates audio, instead of first generating text and then converting it into speech.

This approach has several advantages. It supports affective dialog, which means Gemini 2.5 Flash can recognise the emotion behind the user's words from their tone of voice. It can tell when the user sounds scared, angry, or surprised, and respond accordingly.

Apart from this, the audio generation feature can express emotions when speaking, adopt different accents and linguistic styles, and access tools such as Google Search. It also supports more than 24 languages.

The controllable TTS feature, meanwhile, offers multi-speaker dialogue generation, can express emotions and accents while narrating a script, and lets users control delivery speed and emphasise pronunciation. It supports the same set of more than 24 languages, as well as language mixing.

Google says these capabilities were assessed for potential risks throughout the development process. The company used both internal mechanisms and red teaming to find and fix vulnerabilities. It also highlighted that all audio outputs from these models are embedded with SynthID, its watermarking technology.
