I watched a show today where they talked about isolating certain sounds (such as a particular speaker) from the background of a noisy restaurant to allow better conversations in a busy environment. I have also seen software programs that can isolate particular instruments/vocals from a song. How does this work? I understand that in a video image you can use techniques like edge detection and the like to define the boundaries of an object and isolate it from the background, but this is because you have a matrix of pixels to compare. With audio, you only have a single waveform that is the combined contribution of all the sounds coming into the ear or that make up the recording. How do you split that back out into its individual components? How do you isolate particular sounds, especially speech or instruments that can span a wide range of frequencies and volumes, from the rest of the noise? I just don’t understand how you undo the summation of the individual waveforms and pull them out from everything else.
Sound arrives as a set of vibrating waves. The more differently a wave is shaped, the more different it sounds. If you measure how many times per second each wave repeats (its frequency, measured in hertz), you’ll see that certain sounds cluster around certain frequencies. If you take out, for example, all sound below x hertz, you may be eliminating the low-end booming of something like a bass drum.
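To make the "take out everything below x hertz" idea concrete, here is a minimal sketch in plain Python. It is an illustration only, not how any particular product works: the two test tones, the 800 Hz sample rate, and the helper names `dft`, `inverse_dft`, and `remove_below` are all made up for this example, and the slow textbook transform stands in for the fast one real software would use.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform: how much of each frequency is present."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def inverse_dft(spectrum):
    """Rebuild the waveform from its frequency components."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def remove_below(samples, sample_rate, cutoff_hz):
    """Zero every component below cutoff_hz, then turn the result back into sound."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins k and n - k describe the same frequency, so treat them alike.
        freq = min(k, n - k) * sample_rate / n
        if freq < cutoff_hz:
            spectrum[k] = 0
    return inverse_dft(spectrum)

# Hypothetical test signal: a 60 Hz "bass drum" hum plus a quieter 200 Hz tone.
sample_rate = 800
n = 200
bass = [math.sin(2 * math.pi * 60 * t / sample_rate) for t in range(n)]
tone = [0.5 * math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(n)]
mix = [b + s for b, s in zip(bass, tone)]

filtered = remove_below(mix, sample_rate, cutoff_hz=100)
```

In this toy case `filtered` comes out essentially identical to `tone`, because the two sounds occupy completely different frequencies. Speech and real instruments overlap heavily in frequency, which is why this trick alone cannot cleanly pull them apart.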
Generally, in many TV and movie productions, little of the original audio is used. Usually everything gets re-recorded so that all the different sounds are isolated and can be manipulated separately. There is no magical software that can cleanly isolate all sounds. There are tools that help, but they are more often used in emergency situations and only reduce the unwanted sound a little. Most of the time, the dialog is re-recorded in a studio environment, the background sounds are re-recorded in a Foley studio, and the music is added later (including music that is supposed to be coming from a jukebox or whatever is playing in the venue where the scene takes place).
Sometimes a separate take of just the background noise is recorded. A computer then uses that sample, which doesn’t contain the audio they want to keep, to know what to look for when removing the background noise from the real take. It’s very far from perfect, but in situations where you absolutely need the original dialog, it can isolate that dialog well enough for what remains of the original background ambience to be hidden under overdubbed ambience.
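One common way to use a noise-only take like this is called spectral subtraction: measure how loud the noise is at each frequency, then turn each frequency in the real take down by that amount. The sketch below is a simplified assumption of how that could look, not a description of any specific tool. It treats the whole clip as one block and assumes the noise is perfectly steady, whereas real tools work on short overlapping slices and smooth the result. The signals and the names `spectral_subtract`, `dialog`, and `noise_only_take` are invented for the example.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform of a list of samples."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def inverse_dft(spectrum):
    """Rebuild the waveform from its frequency components."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def spectral_subtract(noisy, noise_sample):
    """Reduce each frequency in `noisy` by the loudness the noise sample has there."""
    noisy_spectrum = dft(noisy)
    noise_loudness = [abs(c) for c in dft(noise_sample)]
    cleaned = []
    for component, noise_mag in zip(noisy_spectrum, noise_loudness):
        # Subtract the noise loudness, never going below silence,
        # and keep the timing (phase) of the original take.
        magnitude = max(abs(component) - noise_mag, 0.0)
        cleaned.append(cmath.rect(magnitude, cmath.phase(component)))
    return inverse_dft(cleaned)

# Hypothetical signals: "dialog" is a 200 Hz tone, the room noise is a 60 Hz hum.
sample_rate = 800
n = 200
dialog = [0.5 * math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(n)]
hum_in_take = [0.8 * math.sin(2 * math.pi * 60 * t / sample_rate) for t in range(n)]
# The noise-only take captures the same hum at a different moment (shifted phase).
noise_only_take = [0.8 * math.sin(2 * math.pi * 60 * t / sample_rate + 1.0) for t in range(n)]

noisy = [d + h for d, h in zip(dialog, hum_in_take)]
cleaned = spectral_subtract(noisy, noise_only_take)
```

Here the hum is perfectly steady, so `cleaned` ends up matching `dialog` almost exactly. Real restaurant or street noise changes from moment to moment, so the noise sample is only an estimate, which is one reason the result is "very far from perfect" in practice.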
They are isolated from the start. Music is typically recorded one instrument at a time on separate recording channels. Even drum kits have individual microphones on each drum, cymbal, etc., each feeding its own channel. These individual channels are later mixed down into the stereo channels you are used to playing them back in.
You cannot effectively isolate a single instrument from a standard commercial music recording like you would purchase from iTunes. You would need the master recording with the individual tracks, or some other type of special mix (such as Guitar Hero or Rock Band tracks) where the individual instruments were broken out separately during mixdown.
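A tiny sketch of why the individual tracks matter: mixing down is just adding the tracks together sample by sample, and once they are added, the mix alone no longer tells you which track contributed what. The sample values and track names below are made up for illustration, and `mix_down` is a hypothetical helper, not part of any audio library.

```python
def mix_down(tracks):
    """Mix tracks by adding them together, one sample position at a time."""
    return [sum(samples) for samples in zip(*tracks)]

# Hypothetical four-sample "recordings" of each instrument.
stems = {
    "vocals": [1, 2, 3, 0],
    "guitar": [0, -1, 2, 1],
    "drums": [3, 0, -3, 0],
}

# With the separate tracks, making any mix you like is trivial.
full_mix = mix_down(stems.values())
no_vocals = mix_down([stems["guitar"], stems["drums"]])

# A completely different set of "instruments" that adds up to the same mix.
other_stems = [[4, 1, 2, 1], [0, 0, 0, 0]]
same_mix = mix_down(other_stems)

print(full_mix)
print(no_vocals)
print(same_mix == full_mix)
```

The last line shows the catch: two very different sets of tracks can add up to the identical waveform, so going backwards from the mix means guessing which combination is the plausible one.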