eli5 – How does an audio signal accurately reproduce complex sounds?


Tried to search, didn’t really find what I was looking for.

Whether vinyl, cassette, mp3 etc – how am I able to discern multiple different instruments in music perfectly clearly from a single audio signal? How does a single groove in an LP allow me to hear a bassline and a full drum kit and vocals clearly? I can understand one at a time, but all at once?


Sounds in the air combine into a single pressure wave that reaches your eardrum. A microphone works the same way: a single membrane moves with the air, and its position over time is what gets recorded. The recorded signal then drives the speaker with the help of an amplifier. The speaker also has a single membrane that produces a single pressure wave.

You do have two eardrums, and the pressure wave that reaches each ear will not be identical. If you want to capture a difference like that, the answer is stereo sound: two signals are recorded, stored, and reproduced. They are totally separate, except that you store them on the same medium in sync so you can reproduce them in sync when you play the sound back.

Waves add together.

Even with a single instrument playing a single note, there are multiple frequencies, harmonics, and overtones that you’re hearing. These add together, canceling each other out in some places and strengthening each other in other places. This creates a complex waveform that can still be drawn as a single line on an oscilloscope – just not a simple sine wave.

No matter how many different sounds you add together, at any given moment in time, everything that’s going on adds up to a single value. That means that no matter how many instruments you’re listening to, the needle on the record only needs to follow one complex path to reproduce the moment-to-moment sum of everything the mic picked up. A digital audio source is doing the same thing, just capturing a single number thousands of times per second.
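The "everything adds up to a single value" idea is easy to see in a toy Python sketch. The two frequencies here (440 Hz and 220 Hz, two different "instruments") and the names are just illustrative, not anything from a real audio library:

```python
import math

SAMPLE_RATE = 44100  # samples per second, the common CD-quality rate


def pressure(freqs, t):
    """Instantaneous 'pressure' at time t: the plain sum of every sine wave."""
    return sum(math.sin(2 * math.pi * f * t) for f in freqs)


# Two "instruments" playing together: 440 Hz (A4) and 220 Hz (A3).
combined = [pressure([440, 220], n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

# However many tones you add, each instant is still just one number.
print(len(combined), type(combined[0]))
```

The whole mix collapses to one number per instant; your cochlea does the work of pulling the frequencies back apart.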

The magic happens in your ears and brain where your cochlea unscrambles all the different frequencies and your auditory cortex interprets them into meaningfully distinct sounds.

In the studio, imagine a guitar and a computer set up to record it.

The guitar creates a complex sound, with multiple frequencies of varying volume levels and directions.

The microphone picks up the sound waves created by the guitar and turns them into an electrical signal. That electrical signal is then converted from an analogue current into a digital signal by an audio interface.

The way it does this is by sampling the audio at very rapid intervals. Imagine a Serrano ham on a deli counter: this represents the electrical signal. The audio interface is the slicer. It takes 44,100 slices per second, which are very thin indeed.

Once the slicing is complete, the computer puts the slices back together in order, playing one slice after another so rapidly that there is no perceptible gap between them. This is what you hear when you play back a song on a computer or a device like a smartphone. The slices (samples) are organised in a file format such as MP3, WAV or FLAC. The device you play it back on decodes the digital information, the soundcard or audio interface converts it back into an electrical signal, and that signal is pushed down some wires to your speakers or headphones.
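The ham-slicer analogy can be sketched in a few lines of toy Python. The function names are made up for illustration; the 44,100 rate and 16-bit depth are the real CD-audio conventions:

```python
import math

SAMPLE_RATE = 44100   # slices per second
BIT_DEPTH = 16        # CD audio stores each slice as a 16-bit integer


def slice_signal(signal, seconds):
    """Take SAMPLE_RATE very thin 'slices' of a continuous signal per second."""
    n = int(SAMPLE_RATE * seconds)
    return [signal(i / SAMPLE_RATE) for i in range(n)]


def quantize(x):
    """Round one slice to the nearest 16-bit integer, as a WAV file would store it."""
    return round(x * (2 ** (BIT_DEPTH - 1) - 1))


# A single 440 Hz tone as the "ham" on the counter.
tone = lambda t: math.sin(2 * math.pi * 440 * t)

slices = slice_signal(tone, 1.0)
stored = [quantize(s) for s in slices]
print(len(stored))  # one second of audio = 44100 stored slices
```

Playback is just the reverse: read the integers back in order and feed them to a digital-to-analogue converter at the same rate.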

Now imagine an entire band in a studio – 2 guitarists, a bassist, a drummer and a vocalist, and hey, throw in a synth player for good measure. There are now multiple microphones set up to record their sound waves. These microphones feed into a mixer, which combines the electrical signals of the multiple microphones into one single signal that can be digitised by the audio interface.
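At its core, what the mixer does is a per-sample weighted sum of every mic's signal. A minimal sketch, with made-up track data and gain values purely for illustration:

```python
def mix(tracks, gains):
    """Mix several tracks into one: for each sample, sum every track scaled by its gain."""
    return [sum(g * t[i] for t, g in zip(tracks, gains))
            for i in range(len(tracks[0]))]


# Four samples each of three imaginary mic signals.
guitar = [0.5, 0.3, -0.2, -0.4]
drums  = [0.9, -0.8, 0.7, -0.6]
vocals = [0.1, 0.2, 0.1, 0.0]

# The engineer's faders are just these gain numbers.
mono = mix([guitar, drums, vocals], gains=[1.0, 0.6, 1.2])
print(mono)  # one combined signal, one number per sample
```

Turning a fader up or down on a real desk is changing one of those gain values; the output is still a single stream of numbers.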

The next bit is where real human skill is required. If you just record the raw microphone inputs, it will sound very sub-par indeed. The instruments will sound muddy and one-directional, and everything will be at different volume levels. This is where a human needs to “mix” the signals to make them audibly clear to another human listening back on their personal device. That's a whole other rabbit hole that I can explain if you like – just reply to this comment.

After the mixing engineer has finished, they can then “bounce”, or compile, each separate instrument into one digital file using software. The file contains the combined digital information of every instrument – its position in the stereo field, volume, effects etc. – hard-baked in when it is exported from the music software.

The file is then played back on an audio device like a stereo, radio or phone, where the signal is decoded, converted back into an electrical signal, and so on.

The sound that we hear comes from pressure waves that are pressing on our eardrums. At any one time, there is only one pressure on the eardrum.

Recording devices use a microphone to measure that pressure and then record it onto some medium. The playback device takes that recording and reproduces the pressures that hit the microphone during the recording.

It’s *way* more complicated in practice, as recording and playback devices are imperfect, different instruments are recorded separately, etc.