I see a lot of commenters have already covered the equipment side of things, but I’d like to bring in a bioacoustics perspective.
The key aspect of sound that we use to recognize voices is the frequency spectrum (the range and relative amplitudes of the frequencies that compose the sound).
Whenever sound travels or interacts with an object, it is **filtered** (its frequency spectrum changes). We are tuned to these changes in frequency; for example, we learn what a voice behind a door sounds like, and can recognize it as such without needing visual cues. (I hope that's a clear example; let me know if it doesn't make sense.)
So, when you record and play back a voice, you are passing the sound through multiple filters – the air, the microphone, the audio processing that digitizes the sound, the speaker, and then more air – before it reaches your ears. The frequency spectrum changes, your brain detects those changes, and you recognize it as a recorded voice.
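If you're curious what "filtering changes the spectrum" looks like concretely, here's a toy sketch in Python (numpy). It builds a fake "voice" out of two sine tones, runs it through a crude low-pass filter (an 8-point moving average, standing in for whatever a cheap speaker or phone mic does), and measures each tone's amplitude before and after. The sample rate and frequencies are arbitrary choices for illustration, not anything specific to real voice recording:

```python
import numpy as np

fs = 8000  # sample rate in Hz (arbitrary for this demo)
t = np.arange(0, 1.0, 1 / fs)

# Toy "voice": one low-frequency and one high-frequency component
signal = np.sin(2 * np.pi * 200 * t) + np.sin(2 * np.pi * 2000 * t)

# A crude filter: 8-point moving average, which acts as a low-pass
kernel = np.ones(8) / 8
filtered = np.convolve(signal, kernel, mode="same")

def amplitude_at(x, freq):
    """Amplitude of the spectral component nearest `freq`."""
    spectrum = np.abs(np.fft.rfft(x)) / len(x)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    return spectrum[np.argmin(np.abs(freqs - freq))]

# The low tone passes through nearly untouched...
print(amplitude_at(signal, 200), amplitude_at(filtered, 200))
# ...while the high tone is strongly attenuated
print(amplitude_at(signal, 2000), amplitude_at(filtered, 2000))
```

The spectrum coming out is different from the spectrum going in, and that difference is exactly the kind of cue your brain picks up on when it flags a voice as "recorded."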