How does sound/track isolation work?

I watched a show today where they talked about isolating certain sounds (such as a particular speaker) from the background of a noisy restaurant to allow better conversations in a busy environment. I have also seen software that can isolate particular instruments or vocals from a song. How does this work? I understand that in a video image you can use techniques like edge detection to define the boundaries of an object and isolate it from the background, but that works because you have a matrix of pixels to compare. With audio, you only have a single waveform that is the combined contribution of all the sounds coming into the ear or making up the recording. How do you split that back out into its individual components? How do you isolate particular sounds, especially speech or instruments that can span a wide range of frequencies and volumes, from the rest of the noise? I just don’t understand how you undo the summation of the individual waveforms and pull one out from everything else.

4 Answers

Anonymous 0 Comments

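You can also use a trained neural network for this. Spleeter, for example, ships with pretrained models that isolate or remove particular sources (vocals, drums, bass and so on) from a song.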

https://github.com/deezer/spleeter
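
To give a feel for how the summation can be "undone" at all, here is a toy sketch of frequency-domain masking in plain Python. It is not Spleeter's code. It assumes the two sources sit in completely separate frequency bands (a 5-cycle tone standing in for the sound you want and a 40-cycle tone standing in for the background), so a fixed hand-picked mask is enough. Real speech and instruments overlap in frequency, which is why tools like Spleeter, as far as I understand it, have a neural network estimate a mask over a short-time spectrogram instead of using a fixed cutoff. All names here (`dft`, `idft`, `cutoff`, `wanted`, `background`) are made up for the illustration.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform: time samples -> frequency bins."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse DFT: frequency bins -> real-valued time samples."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

N = 128  # number of samples in the toy signal

# Two "sources": a low tone we want to keep and a higher tone we treat as noise.
wanted = [math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
background = [0.5 * math.sin(2 * math.pi * 40 * t / N) for t in range(N)]

# The recording only contains their sum, a single waveform.
mix = [a + b for a, b in zip(wanted, background)]

# In the frequency domain the two sources land in different bins.
spectrum = dft(mix)

# Keep only bins below the (assumed) cutoff; min(k, N - k) also keeps the
# mirrored negative-frequency bins so the result stays real-valued.
cutoff = 20
mask = [1.0 if min(k, N - k) < cutoff else 0.0 for k in range(N)]

# Apply the mask and transform back to a waveform.
recovered = idft([m * X for m, X in zip(mask, spectrum)])

# How far the recovered waveform is from the original wanted source.
max_error = max(abs(r - w) for r, w in zip(recovered, wanted))
print("largest sample error:", max_error)
```

The point is that a single waveform still carries information about which frequencies are present and when, which plays a role similar to the pixel grid in the image case. The hard part in practice is deciding which time-frequency cells belong to which source, and that is the part the trained model handles.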
