For example, say 100 people each read / sang an identical passage into a recorder. It seems as though the frequency, amplitude etc. that is captured wouldn’t be anywhere near specific/precise enough to be accurately represented when played back. I.e., what variables are at play that allow us to easily discern the 100 distinct voices when replayed? Thanks!
The variables in play are frequency and amplitude. If you plot amplitude against frequency in 2D you get the signal’s spectrum, and the math that produces it is called a “Fourier transform.”
That spectrum is what you see on a mixing panel’s display, with its constantly shifting spikes.
Long story short, the human voice contains many, many different frequencies at many different amplitudes.
If you’ve ever heard a pure sine wave, it’s a very simple sound. It’s a tone, like you’d get from a button press on an electronic device.
A constant sine wave at constant volume would be a single spike on that plot. But if you’ve ever seen the signal of a human voice, or the sound of a motor or running water or anything else, it’s not a single spike; it’s a whole jagged mountain range of different frequencies superimposed on one another.
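To make the "single spike vs. mountain range" idea concrete, here is a rough sketch in plain Python. Everything in it is an assumption chosen for illustration: the 1000 Hz sample rate, the 100 Hz pitch, and the made-up harmonic amplitudes. It uses a slow textbook DFT so it needs no extra libraries; in practice you would normally reach for a library FFT such as `numpy.fft.rfft`.

```python
import cmath
import math

SAMPLE_RATE = 1000  # samples per second (assumed, kept small for the example)
N = 1000            # one second of samples, so frequency bin k means k Hz


def magnitude_spectrum(samples):
    """Naive DFT: return the amplitude found at each whole-number frequency."""
    n = len(samples)
    mags = []
    for k in range(n // 2):  # only frequencies below half the sample rate
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        mags.append(2 * abs(total) / n)  # scale so a sine of amplitude A reads A
    return mags


def peaks(mags, threshold=0.05):
    """List the frequencies (in Hz) whose amplitude clears the threshold."""
    return [k for k, m in enumerate(mags) if m > threshold]


# A pure tone: one sine wave at 100 Hz.
pure = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(N)]

# A crude "voice-like" sound: 100 Hz plus three overtones (made-up amplitudes).
overtones = {100: 1.0, 200: 0.6, 300: 0.3, 400: 0.15}
voice = [sum(a * math.sin(2 * math.pi * f * t / SAMPLE_RATE)
             for f, a in overtones.items())
         for t in range(N)]

pure_peaks = peaks(magnitude_spectrum(pure))
voice_peaks = peaks(magnitude_spectrum(voice))
print("pure tone peaks (Hz):", pure_peaks)
print("voice-like peaks (Hz):", voice_peaks)
```

The pure tone shows up at one frequency only, while the voice-like sound shows up at several at once. A real voice has far more components than four, and they shift constantly as you speak.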
That’s where the variation comes into play. As your vocal cords vibrate, you’ve got air resonating in your chest and throat and sinuses. Those vibrations are picked up and transformed by your bones and your fat and muscle. Basically, in the sense that a guitar string is a simple instrument but a guitar is a complex one, the human body is a very complex instrument, and each one has a unique sound because each one has a unique physical shape and pattern of stiffness.
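As a toy illustration of why two people singing the same note still sound different, the sketch below builds two hypothetical "voices" at the same 120 Hz pitch but with different overtone strengths, then measures those strengths back out of the samples. The numbers are invented, and a real voice differs in far more ways than four fixed overtone weights, so treat this as a simplified picture rather than a model of actual speech.

```python
import cmath
import math

SAMPLE_RATE = 1000  # samples per second (assumed for the example)
N = 1000            # one second of samples
PITCH_HZ = 120      # both hypothetical voices sing the same note


def synth(pitch_hz, overtone_amps):
    """Add up sine waves at pitch, 2*pitch, 3*pitch, ... with the given amplitudes."""
    return [sum(a * math.sin(2 * math.pi * pitch_hz * (h + 1) * t / SAMPLE_RATE)
                for h, a in enumerate(overtone_amps))
            for t in range(N)]


def amplitude_at(samples, freq_hz):
    """Measure how much of one frequency is present (a single DFT bin)."""
    total = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / SAMPLE_RATE)
                for i, x in enumerate(samples))
    return 2 * abs(total) / len(samples)


def overtone_profile(samples, pitch_hz, count=4):
    """Amplitudes at the pitch and its first few overtones, rounded for display."""
    return [round(amplitude_at(samples, pitch_hz * (h + 1)), 2)
            for h in range(count)]


# Same pitch, different "body shapes" (made-up overtone weights).
voice_a = synth(PITCH_HZ, [1.0, 0.5, 0.25, 0.1])
voice_b = synth(PITCH_HZ, [1.0, 0.2, 0.6, 0.3])

profile_a = overtone_profile(voice_a, PITCH_HZ)
profile_b = overtone_profile(voice_b, PITCH_HZ)
print("voice A overtones:", profile_a)
print("voice B overtones:", profile_b)
```

Both recordings contain the same note, yet the recovered overtone profiles differ, and that difference in the mix is the kind of thing a recording preserves and your ear picks up on.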