There are lots of hints that your ears and brain can pick up on to determine whether a voice is live or recorded.
First of all, recorded voices are often highly processed, particularly for things like TV and radio transmission. One type of processing that is often used is called compression. It’s not data compression (like a .zip file), but dynamic range compression. Essentially, it changes the volume of some parts of the signal, making it so that the loudest parts of the signal and the softest parts of the signal are much closer in volume than they were originally (thereby “compressing” the dynamic range of the signal). This has the effect of making speech easier to understand because it’s all coming out of your speaker at nearly the same volume, regardless of whether the person was yelling or mumbling. However, this kind of processing can produce an “unnatural” sound, and your brain can pick up on that. In the real world, the human voice can have a large dynamic range. When a voice sounds too “perfect”, it’s a clue to your brain that it’s a recording.
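To make the idea concrete, here is a minimal sketch of dynamic range compression in Python. It is only an illustration, not how any particular broadcaster does it: the function names, the -20 dB threshold, and the 4:1 ratio are assumptions chosen for the example, and real compressors also smooth the gain over time (attack and release) and usually add make-up gain, both of which are left out here.

```python
import math

def compress(samples, threshold_db=-20.0, ratio=4.0):
    """Static compressor: turn down anything louder than the threshold.

    threshold_db and ratio are illustrative values, not a broadcast standard.
    Samples are floats where 1.0 is full scale (0 dB).
    """
    out = []
    for x in samples:
        level = abs(x)
        if level == 0.0:
            out.append(0.0)  # silence stays silence (avoids log10(0))
            continue
        level_db = 20.0 * math.log10(level)
        if level_db > threshold_db:
            # Above the threshold, `ratio` dB of input becomes 1 dB of output.
            target_db = threshold_db + (level_db - threshold_db) / ratio
            gain = 10.0 ** ((target_db - level_db) / 20.0)
        else:
            gain = 1.0  # quiet parts pass through unchanged
        out.append(x * gain)
    return out

def dynamic_range_db(loud, quiet):
    """Difference in level between two sample amplitudes, in dB."""
    return 20.0 * math.log10(abs(loud) / abs(quiet))

# A "mumble" at 5% of full scale and a "yell" at full scale.
quiet, loud = 0.05, 1.0
comp_quiet, comp_loud = compress([quiet, loud])

before = dynamic_range_db(loud, quiet)
after = dynamic_range_db(comp_loud, comp_quiet)
print(f"dynamic range before: {before:.1f} dB, after: {after:.1f} dB")
```

With these assumed settings the mumble is left alone and the yell is pulled down, so the gap between them shrinks from about 26 dB to about 11 dB. That narrowed gap is the “too perfect” evenness described above.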
You may not notice it consciously, but voices on a TV very often have music or other sound effects in the background. This is an obvious clue that it’s recorded audio. Even if there’s no background music but there are some sound effects added (like footsteps, running water, horse hooves, whatever), the sound effects often play at unnaturally loud volumes and your brain notices this as artificial.
Additionally, typical TV speakers are low-cost and low-quality, and they don’t accurately reproduce all frequencies of the human voice. This can lead to more unnatural sounds that your brain can recognize as artificial. But this effect is probably a lot more subtle when the sound is coming from another room. If you were really listening to a voice in another room, you’d already be losing a ton of frequencies (due to air absorption, and the sound having to bounce off walls or go through walls to get to your ears), so low-quality speakers would only have a very small effect after all of those other losses.
But in general, if you had relatively high-quality speakers and a relatively high-quality recording of a voice with minimal processing, it’s not difficult to make a convincing reproduction of a human voice. In that case, it would be very hard, maybe even impossible, to tell whether a voice coming from another room was live or recorded.