Google Makes DeepMind's AI-Powered Cloud Text-to-Speech Service Available to Developers

By Indo-Asian News Service | Updated: 28 March 2018 19:01 IST

Google on Wednesday launched a voice synthesiser called "Cloud Text-to-Speech" which is powered by its Britain-based Artificial Intelligence (AI) subsidiary DeepMind.

The service is now available for developers to add to their own applications.

A text-to-speech service is a form of speech synthesis that converts text into spoken voice output. Google's text-to-speech powers the voices in services such as Google Assistant, Search and Maps.

"'Cloud Text-to-Speech' lets developers choose from 32 different voices from 12 languages and variants," Dan Aharon, Product Manager, Cloud AI, said in a blog post.

"Cloud Text-to-Speech" correctly pronounces complex text such as names, dates, times and addresses for authentic-sounding speech, the company claimed.

It also allows developers to customise pitch, speaking rate and volume gain, and supports a variety of audio formats, including MP3 and WAV.
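As a rough illustration of those options, the sketch below assembles a request body of the kind the service's REST interface is understood to accept. The field names (`audioConfig`, `speakingRate`, `pitch`, `volumeGainDb`, `audioEncoding`) and the `LINEAR16` value for WAV-style output are assumptions drawn from Google's public API documentation rather than from this article, and the helper `build_synthesis_request` is hypothetical. No network call is made.

```python
import json


def build_synthesis_request(text, language_code="en-US", voice_name=None,
                            audio_encoding="MP3", pitch=0.0,
                            speaking_rate=1.0, volume_gain_db=0.0):
    """Assemble a JSON-serialisable body for a speech synthesis request.

    Field names are assumed from public API documentation, not this article.
    """
    voice = {"languageCode": language_code}
    if voice_name is not None:
        # A specific voice is optional; omit it to let the service choose.
        voice["name"] = voice_name
    return {
        "input": {"text": text},
        "voice": voice,
        "audioConfig": {
            "audioEncoding": audio_encoding,  # e.g. "MP3", or "LINEAR16" for WAV-style PCM
            "pitch": pitch,                   # semitone offset from the default
            "speakingRate": speaking_rate,    # 1.0 is normal speed
            "volumeGainDb": volume_gain_db,   # 0.0 leaves the volume unchanged
        },
    }


request_body = build_synthesis_request(
    "Your order arrives on 28 March at 7 pm.",
    speaking_rate=1.1,
    pitch=-2.0,
)
print(json.dumps(request_body, indent=2))
```

Sending the body would additionally require credentials and an HTTP client, which are left out here.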

According to Google, "Cloud Text-to-Speech" can be used in several ways: to power voice response systems for call centres (IVRs) and enable real-time natural language conversations, to let Internet of Things (IoT) devices talk back, and to convert text-based media into spoken format.

Google said that "Cloud Text-to-Speech" includes a selection of high-fidelity voices built using WaveNet, a neural network trained on a large volume of speech samples that can create raw audio waveforms from scratch.

DeepMind introduced the first version of WaveNet in late 2016.

WaveNet synthesises more natural-sounding speech and, on average, produces speech audio that people prefer over other text-to-speech technologies.

During training, the network extracts the underlying structure of the speech, including its tones and the shape a realistic speech waveform should have.

When given text input, the trained WaveNet model generates the corresponding speech waveforms, one sample at a time, achieving higher accuracy than alternative approaches.
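The "one sample at a time" idea can be sketched with a toy generator in which every new sample is computed from samples already produced. This is not WaveNet: the function `toy_next_sample` is a hypothetical stand-in that uses a fixed two-term recurrence to produce a pure tone, where the real model uses a trained neural network conditioned on the text. The 440 Hz tone is an arbitrary choice for illustration.

```python
import math

SAMPLE_RATE = 24_000  # samples per second, the rate reported for the improved model
TONE_HZ = 440.0       # arbitrary test tone, chosen only for illustration
OMEGA = 2 * math.pi * TONE_HZ / SAMPLE_RATE


def toy_next_sample(history):
    """Stand-in for the trained network: predict the next sample from the last two.

    The recurrence x[n] = 2*cos(w)*x[n-1] - x[n-2] reproduces a sine wave.
    """
    return 2 * math.cos(OMEGA) * history[-1] - history[-2]


def generate(num_samples):
    """Build a waveform sequentially, feeding each output back in as input."""
    samples = [0.0, math.sin(OMEGA)]  # two seed samples start the loop
    while len(samples) < num_samples:
        samples.append(toy_next_sample(samples))
    return samples[:num_samples]


waveform = generate(240)  # 10 milliseconds of audio at 24,000 samples per second
print(len(waveform), "samples generated")
```

The sequential dependency is why speed matters: sample n cannot be computed until sample n-1 exists.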

Today's improved WaveNet model generates raw waveforms 1,000 times faster than the original model and can generate one second of speech in just 50 milliseconds.
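The arithmetic behind those figures can be checked in a few lines. The implied time for the original model is an inference from the two reported numbers, assuming the 1,000-fold speed-up applies to the same one-second workload; it is not a figure stated by Google here.

```python
AUDIO_MS = 1_000          # one second of speech, in milliseconds
IMPROVED_COMPUTE_MS = 50  # reported generation time for the improved model
SPEEDUP = 1_000           # reported speed-up over the original model

# How many seconds of audio are produced per second of computation.
real_time_factor = AUDIO_MS / IMPROVED_COMPUTE_MS

# Inferred, not reported: time the original model would need for the same second.
implied_original_seconds = IMPROVED_COMPUTE_MS * SPEEDUP / 1_000

print(f"Improved model: {real_time_factor:.0f}x faster than real time")
print(f"Implied original model: about {implied_original_seconds:.0f} s per second of speech")
```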

The model also has higher fidelity and is capable of creating waveforms with 24,000 samples a second.

"We have also increased the resolution of each sample from 8 bits to 16 bits, producing higher quality audio for a more human sound," Aharon added.
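To put the sample rate and bit depth in concrete terms, the short calculation below counts the distinct amplitude levels at each resolution and the raw, uncompressed data rate. It assumes a single (mono) channel and plain linear encoding, neither of which is stated in the article.

```python
SAMPLE_RATE = 24_000  # samples per second, as reported


def quantisation_levels(bits):
    """Number of distinct amplitude values a sample of this width can take."""
    return 2 ** bits


def raw_bytes_per_second(sample_rate, bits, channels=1):
    """Uncompressed data rate; mono is an assumption for this sketch."""
    return sample_rate * bits * channels // 8


print(quantisation_levels(8))                 # 256
print(quantisation_levels(16))                # 65536
print(raw_bytes_per_second(SAMPLE_RATE, 16))  # 48000
```

Doubling the bit width therefore multiplies the number of representable amplitude levels by 256, which is where the finer-grained audio comes from.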

With these adjustments, the latest WaveNet model produces more natural-sounding speech, and people have given the new US English WaveNet voices a mean opinion score (MOS) of 4.1 on a scale of one to five.
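A mean opinion score is simply the average of listener ratings on that one-to-five scale. The sketch below shows the calculation; the ratings are made up for illustration and are not Google's test data, and the helper `mean_opinion_score` is hypothetical.

```python
from statistics import mean


def mean_opinion_score(ratings):
    """Average a list of listener ratings, each on a scale of 1 to 5."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(ratings)


# Invented ratings from six imaginary listeners, for illustration only.
example_ratings = [5, 4, 4, 3, 5, 4]
mos = mean_opinion_score(example_ratings)
print(f"MOS: {mos:.2f}")
```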

 
