Meta’s New Open-Source SAM Audio AI Model Can Isolate Sounds From Audio Mixtures

Meta’s new SAM Audio AI model lets users isolate and edit sounds from mixed audio using text, visual or time prompts.

Photo Credit: Meta

Meta’s release of SAM Audio comes a month after it released SAM 3 and SAM 3D

Highlights
  • SAM Audio is currently available in the Segment Anything Playground
  • The open-source model can also be downloaded from GitHub
  • Meta says the model can be used for noise filtering and isolating sounds

Meta has released another artificial intelligence (AI) model in the Segment Anything Model (SAM) family. On Tuesday, the Menlo Park-based tech giant released SAM Audio, a generative audio model that can identify, separate, and isolate particular sounds in an audio mixture. The model can handle audio editing based on text prompts, visual signals, or time stamps, automating much of the workflow. Like the other models in the SAM series, it is an open-source model that comes with a permissive licence.

Meta Introduces SAM Audio AI Model

In a newsroom post, the tech giant announced and detailed its new audio-focused AI model. SAM Audio is currently available to download via Meta's website, its GitHub listing, or Hugging Face. Users who would prefer to try the model's capabilities without running it locally can visit the Segment Anything Playground, which also offers access to all the other SAM models. Notably, the model is available under the SAM Licence, a custom, Meta-owned licence that allows both research-related and commercial usage.

Meta describes SAM Audio as a unified AI audio model that uses text-based commands, visual cues, and time-based instructions to identify and separate sounds from a complex mixture. Traditionally, audio editing, especially isolating individual sound elements, has required specialised tools and manual work, often with limited precision. Meta's latest entry in the SAM series addresses this gap.

The model supports three types of prompting. With text prompts, users can type descriptions, such as “drum beat” or “background noise.” Visual prompting allows users to click on an object or a human in a video, and if a sound is being produced from there, it will be isolated. Finally, time span prompting lets anyone mark a segment of the timeline to target a sound.
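As a rough illustration of how the three prompt types differ, here is a minimal Python sketch. Everything in it is hypothetical: the class names, fields, and the `describe` helper are invented for this example and are not part of Meta's SAM Audio code or API.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class TextPrompt:
    """Hypothetical: a typed description of the sound to isolate."""
    description: str  # e.g. "drum beat"


@dataclass(frozen=True)
class VisualPrompt:
    """Hypothetical: a click on an object or person in a video frame."""
    frame_index: int  # video frame that was clicked
    x: float          # click position, normalised to [0, 1]
    y: float


@dataclass(frozen=True)
class TimeSpanPrompt:
    """Hypothetical: a marked segment of the timeline."""
    start_s: float  # segment start, in seconds
    end_s: float    # segment end, in seconds

    def __post_init__(self) -> None:
        if not 0 <= self.start_s < self.end_s:
            raise ValueError("expected 0 <= start_s < end_s")


Prompt = Union[TextPrompt, VisualPrompt, TimeSpanPrompt]


def describe(prompt: Prompt) -> str:
    """Turn any of the three prompt types into a readable request."""
    if isinstance(prompt, TextPrompt):
        return f"isolate the sound described as '{prompt.description}'"
    if isinstance(prompt, VisualPrompt):
        return (
            f"isolate the sound made by the object clicked at "
            f"({prompt.x:.2f}, {prompt.y:.2f}) in frame {prompt.frame_index}"
        )
    return (
        f"isolate the sound occurring between "
        f"{prompt.start_s:.1f}s and {prompt.end_s:.1f}s"
    )


print(describe(TextPrompt("drum beat")))
print(describe(VisualPrompt(frame_index=120, x=0.4, y=0.65)))
print(describe(TimeSpanPrompt(start_s=3.0, end_s=7.5)))
```

The point of the sketch is only that each prompt carries a different kind of information (words, a location in a video frame, or a time range) while asking for the same outcome, an isolated sound.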

As an example, imagine an audio file of a person speaking on the phone while music plays in the background and children can be heard playing in the distance. Users can isolate any of these audio sources, be it the primary voice, the music, or the ambient noise made by the children, with a single command. Gadgets 360 staff members briefly tested the model and found it to be both fast and efficient. However, we were not able to test it in real-world situations.

Under the hood, SAM Audio is a generative separation model that extracts both target and residual stems from an audio mixture. It is equipped with a flow-matching Diffusion Transformer and operates in a Descript Audio Codec - Variational Autoencoder Variant (DAC-VAE) space.
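To make the terms "target stem" and "residual stem" concrete, the toy Python sketch below builds a mixture from two synthetic tones and shows that, in a simple additive view, the residual is whatever remains once the target is removed. This is only an illustration of the terminology, under the assumption that the mixture is a plain sum of its sources. It is not how SAM Audio works internally: as a generative model, it synthesises both stems rather than subtracting one signal from another. All function names here are invented for the example.

```python
import math

SAMPLE_RATE = 8000  # samples per second (arbitrary choice for the toy example)


def sine(freq_hz, n_samples, amplitude=1.0):
    """Generate a pure tone as a list of float samples."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
        for n in range(n_samples)
    ]


def mix(a, b):
    """Additive mixture of two equal-length signals."""
    return [x + y for x, y in zip(a, b)]


def residual_stem(mixture, target):
    """In the additive view, the residual is the mixture minus the target."""
    return [m - t for m, t in zip(mixture, target)]


# Stand-ins for two sources: a "voice" tone and a quieter "music" tone.
voice = sine(220.0, 800)
music = sine(440.0, 800, amplitude=0.5)
mixture = mix(voice, music)

# Treat the voice as the target stem; everything else is the residual stem.
residual = residual_stem(mixture, voice)

# The residual should match the music up to floating-point rounding.
max_error = max(abs(r - m) for r, m in zip(residual, music))
print(f"largest difference between residual and music: {max_error:.2e}")
```

In a real recording the sources are not available separately, which is why a separation model has to estimate both stems from the mixture alone.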

Akash Dutta
Akash Dutta is a Chief Sub Editor at Gadgets 360. He is particularly interested in the social impact of technological developments and loves reading about emerging fields such as AI, metaverse, and fediverse. In his free time, he can be seen supporting his favourite football club, Chelsea, watching movies and anime, and sharing passionate opinions on food.
© Copyright Red Pixels Ventures Limited 2025. All rights reserved.