Google Unveils AI System That Can Isolate an Individual Voice in a Crowd

By Indo-Asian News Service | Updated: 13 April 2018 16:00 IST
Highlights
  • Method works on ordinary videos with a single audio track
  • This capability can have a wide range of applications
  • Applications include speech enhancement and improved hearing aids

Just as most smartphone cameras now allow users to focus on a single object among many, it may soon be possible to pick out individual voices in a crowd by suppressing all other sounds, thanks to a new Artificial Intelligence (AI) system developed by Google researchers.

This is an important development as computers are not as good as humans at focusing their attention on a particular person in a noisy environment.

Known as the cocktail party effect, the capability to mentally "mute" all other voices and sounds comes naturally to us humans.

However, automatic speech separation (separating an audio signal into its individual speech sources) remains a significant challenge for computers, Inbar Mosseri and Oran Lang, software engineers at Google Research, wrote in a blog post this week.

In a new paper, the researchers presented a deep learning audio-visual model for isolating a single speech signal from a mixture of sounds such as other voices and background noise.

"In this work, we are able to computationally produce videos in which speech of specific people is enhanced while all other sounds are suppressed," Mosseri and Lang said.

The method works on ordinary videos with a single audio track. All that is required from the user is to select the face of the person in the video they want to hear, or to let an algorithm select that person based on context.

The researchers believe this capability can have a wide range of applications, from speech enhancement and recognition in videos, through video conferencing, to improved hearing aids, especially in situations where there are multiple people speaking.

"A unique aspect of our technique is in combining both the auditory and visual signals of an input video to separate the speech," the researchers said.

"Intuitively, movements of a person's mouth, for example, should correlate with the sounds produced as that person is speaking, which in turn can help identify which parts of the audio correspond to that person," they explained.

The visual signal not only improves the speech separation quality significantly in cases of mixed speech, but, importantly, it also associates the separated, clean speech tracks with the visible speakers in the video, the researchers said.
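
The intuition the researchers describe, that mouth movements should correlate with the sounds a person produces, can be illustrated with a toy sketch. This is not the researchers' deep learning model: it assumes hand-made numbers for how open each speaker's mouth is in every video frame and for the audio energy in two frequency bands, and it simply assigns each band to the speaker whose mouth motion correlates with it best. The names `pearson`, `assign_bands`, `mouth_openness` and `band_energy` are all hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def assign_bands(band_energy, mouth_openness):
    """Map each frequency band to the speaker whose mouth motion tracks it best.

    band_energy: dict of band name -> audio energy per video frame
    mouth_openness: dict of speaker name -> mouth opening per video frame
    """
    assignment = {}
    for band, energy in band_energy.items():
        scores = {
            speaker: pearson(energy, motion)
            for speaker, motion in mouth_openness.items()
        }
        assignment[band] = max(scores, key=scores.get)
    return assignment


# Hypothetical toy data: six video frames, two visible speakers.
mouth_openness = {
    "speaker_a": [0.0, 0.8, 0.9, 0.1, 0.0, 0.7],
    "speaker_b": [0.9, 0.1, 0.0, 0.8, 0.9, 0.0],
}
band_energy = {
    "low": [0.1, 0.9, 1.0, 0.2, 0.1, 0.8],   # rises when speaker_a's mouth opens
    "high": [1.0, 0.2, 0.1, 0.9, 1.0, 0.1],  # rises when speaker_b's mouth opens
}

assignment = assign_bands(band_energy, mouth_openness)
print(assignment)
```

In this made-up example the "low" band follows speaker_a's mouth and the "high" band follows speaker_b's, so each band ends up linked to a visible face. Real speech overlaps heavily in time and frequency, which is why a simple correlation like this would not be enough in practice.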

 
