I watched a show today where they talked about isolating certain sounds (such as a particular speaker) from the background of a noisy restaurant to allow better conversations in a busy environment. I have also seen software programs that can isolate particular instruments/vocals from a song. How does this work? I understand that in a video image you can use techniques like edge detection and the like to define the boundaries of an object and isolate it from the background, but this is because you have a matrix of pixels to compare. With audio, you only have a single waveform that is the combined contribution of all the sounds coming into the ear or that make up the recording. How do you split that back out into its individual components? How do you isolate particular sounds, especially speech or instruments that can span a wide range of frequencies and volumes, from the rest of the noise? I just don’t understand how you undo the summation of the individual waveforms and pull them out from everything else.
In most TV and movie productions, little of the original audio is used. Usually everything gets re-recorded so that the different sounds are isolated and can be manipulated separately. There is no magical software that can cleanly isolate every sound. There are tools that help, but they are mostly used in emergencies and only reduce the unwanted material a little. Most of the time, the dialog is re-recorded in a studio, the background sounds are re-recorded in a Foley studio, and the music is added later (including music that is supposed to come from a jukebox or whatever is playing in the venue where the scene takes place).
One technique sometimes used is to record a separate take of just the background noise. A computer then uses that sample, which doesn’t contain the audio you want to keep, to learn what to look for and remove from the main recording. It’s very far from perfect, but in situations where you absolutely need the original dialog, it can isolate that dialog well enough for the remaining original ambience to be hidden under overdubbed ambience.
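To make the "sample of the background noise" idea a bit more concrete, here is a rough Python sketch of one common way to do it, usually called spectral subtraction. I'm not claiming this is what any particular film tool does. The frame size, the `floor` parameter, and the function names are all made up for illustration, and the "dialog" here is just a synthetic tone. The idea: chop the audio into short overlapping slices, look at how loud each frequency is in each slice, subtract the average loudness the noise-only take has at that frequency, and rebuild the waveform.

```python
import numpy as np

FRAME = 512                           # samples per slice (illustrative choice)
HOP = FRAME // 2                      # 50% overlap between slices
WINDOW = np.hanning(FRAME + 1)[:-1]   # periodic Hann; overlapped copies sum to 1

def frame_spectra(signal):
    """Cut a signal into overlapping windowed slices and FFT each slice."""
    count = 1 + (len(signal) - FRAME) // HOP
    frames = np.stack([signal[i * HOP:i * HOP + FRAME] * WINDOW
                       for i in range(count)])
    return np.fft.rfft(frames, axis=1)

def spectral_subtract(noisy, noise_sample, floor=0.05):
    """Reduce noise in `noisy` using a separate noise-only recording."""
    # Average loudness of the noise at each frequency: the "noise profile".
    noise_profile = np.abs(frame_spectra(noise_sample)).mean(axis=0)

    spectra = frame_spectra(noisy)
    magnitude = np.abs(spectra)
    phase = np.angle(spectra)         # phase is kept from the noisy recording

    # Subtract the profile, but never go below a small fraction of the
    # original loudness, so no frequency is driven negative.
    cleaned_mag = np.maximum(magnitude - noise_profile, floor * magnitude)

    # Back to waveforms, then overlap-add the slices into one signal.
    frames = np.fft.irfft(cleaned_mag * np.exp(1j * phase), n=FRAME, axis=1)
    out = np.zeros(len(noisy))
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + FRAME] += frame
    return out

# Toy demo: a 440 Hz tone stands in for the dialog, white noise for the room.
rng = np.random.default_rng(0)
rate = 8000
t = np.arange(2 * rate) / rate
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.5 * rng.standard_normal(len(t))
noise_sample = 0.5 * rng.standard_normal(rate // 2)   # separate noise-only take

cleaned = spectral_subtract(noisy, noise_sample)

inner = slice(FRAME, -FRAME)          # ignore the partially covered edges
err_before = np.mean((noisy - clean)[inner] ** 2)
err_after = np.mean((cleaned - clean)[inner] ** 2)
print("mean squared error before:", err_before)
print("mean squared error after: ", err_after)
```

This works tolerably on steady noise like hiss or air conditioning, where the noise-only take really does look like the noise in the main recording. A restaurant full of other people talking changes from moment to moment, so an average profile only describes it loosely, which is a big part of why results are "very far from perfect."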