Meta’s New Open-Source SAM Audio AI Model Can Isolate Sounds From Audio Mixtures

Meta’s new SAM Audio AI model lets users isolate and edit sounds from mixed audio using text, visual or time prompts.

Written by Akash Dutta, Edited by Ketan Pratap | Updated: 17 December 2025 16:20 IST
Highlights
  • SAM Audio is currently available in the Segment Anything Playground
  • The open-source model can also be downloaded from GitHub
  • Meta says the model can be used for noise filtering and isolating sounds

Meta’s release of SAM Audio comes a month after it released SAM 3 and SAM 3D

Photo Credit: Meta

Meta has released another artificial intelligence (AI) model in the Segment Anything Model (SAM) family. On Tuesday, the Menlo Park-based tech giant released SAM Audio, an audio separation model that can identify, separate, and isolate particular sounds in an audio mixture. The model can handle audio editing based on text prompts, visual signals, or time stamps, automating the workflow. Like the other models in the SAM series, it is an open-source model that comes with a permissive licence.

Meta Introduces SAM Audio AI Model

In a newsroom post, the tech giant announced and detailed its new audio-focused AI model. SAM Audio is currently available to download from Meta's website, its GitHub listing, or Hugging Face. Users who would prefer to try the model's capabilities without running it locally can visit the Segment Anything Playground to test it out. The website also allows users to access all the other SAM models. Notably, the model is available under the SAM Licence, a custom, Meta-owned licence that allows both research-related and commercial usage.


Meta describes SAM Audio as a unified AI audio model that uses text-based commands, visual cues, and time-based instructions to identify and separate sounds from a complex mixture. Traditionally, audio editing, especially isolating individual sound elements, has required specialised tools and manual work, often with limited precision. Meta's latest entry in the SAM series addresses this gap.

The model supports three types of prompting. With text prompts, users can type descriptions, such as “drum beat” or “background noise.” Visual prompting allows users to click on an object or a person in a video, and if that object or person is producing a sound, it will be isolated. Finally, time span prompting lets users mark a segment of the timeline to target a sound.
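
As a rough illustration of the three prompt types, the sketch below models each one as a small Python data class. The class names, field names, and the describe() helper are hypothetical and chosen only for illustration; they are not taken from Meta's SAM Audio code or documentation.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class TextPrompt:
    # Free-form description of the sound to isolate, e.g. "drum beat".
    description: str


@dataclass(frozen=True)
class VisualPrompt:
    # Hypothetical representation: a click position in a video frame, in pixels.
    frame_index: int
    x: int
    y: int


@dataclass(frozen=True)
class TimeSpanPrompt:
    # Hypothetical representation: a marked segment of the timeline, in seconds.
    start_s: float
    end_s: float

    def __post_init__(self) -> None:
        # Reject empty or reversed spans.
        if self.start_s < 0 or self.end_s <= self.start_s:
            raise ValueError("time span must satisfy 0 <= start_s < end_s")


Prompt = Union[TextPrompt, VisualPrompt, TimeSpanPrompt]


def describe(prompt: Prompt) -> str:
    """Return a human-readable summary of a prompt (illustration only)."""
    if isinstance(prompt, TextPrompt):
        return f"text: {prompt.description}"
    if isinstance(prompt, VisualPrompt):
        return f"visual: frame {prompt.frame_index} at ({prompt.x}, {prompt.y})"
    if isinstance(prompt, TimeSpanPrompt):
        return f"time span: {prompt.start_s:.1f}s to {prompt.end_s:.1f}s"
    raise TypeError(f"unsupported prompt type: {type(prompt).__name__}")
```

Keeping the three prompt kinds as separate types makes it explicit that each carries different information: a description, a location in a video, or a stretch of time.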


For example, imagine an audio file of a person speaking on the phone while music plays in the background and children can be heard playing in the distance. Users can isolate any of these audio sources, be it the primary voice, the music, or the ambient noise made by the children, with a single command. Gadgets 360 staff members briefly tested the model and found it to be both fast and efficient. However, we were not able to test it in real-world situations.

Under the hood, SAM Audio is a generative separation model that extracts both target and residual stems from an audio mixture. It is equipped with a flow-matching Diffusion Transformer and operates in a Descript Audio Codec - Variational Autoencoder Variant (DAC-VAE) space.
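
To make the terms "target stem" and "residual stem" concrete, here is a toy sketch in plain Python. It assumes an idealised additive mixture in which the residual is simply whatever remains after the target is subtracted. This is an illustration of the vocabulary only: it is not how SAM Audio works internally, and a generative model's two output stems need not sum exactly to the input. All function names and the sample rate are made up for the example.

```python
import math
from typing import List

SAMPLE_RATE = 8000  # samples per second; arbitrary choice for this toy example


def sine(freq_hz: float, n_samples: int, amplitude: float = 1.0) -> List[float]:
    """Generate a sine tone as a list of samples."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n_samples)
    ]


def mix(a: List[float], b: List[float]) -> List[float]:
    """Additively mix two equal-length signals."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    return [x + y for x, y in zip(a, b)]


def residual_of(mixture: List[float], target: List[float]) -> List[float]:
    """Idealised residual stem: the mixture with the target stem removed."""
    if len(mixture) != len(target):
        raise ValueError("signals must have the same length")
    return [m - t for m, t in zip(mixture, target)]


# A stand-in "voice" tone and a quieter stand-in "background" tone.
voice = sine(440.0, 100)
background = sine(110.0, 100, amplitude=0.5)
mixture = mix(voice, background)

# If the voice were recovered perfectly, the residual would be the background.
residual = residual_of(mixture, voice)
```

In this idealised setting, adding the target and residual stems back together reproduces the original mixture.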

 
