ELI5: When one speaker is playing two (or more) frequencies of sound (think harmony in music), the resulting sine wave is an average of those two frequencies. How does the brain interpret it as two different notes when the speaker is playing the average of two frequencies?

It is a sum of all the waves, not an average, and those waves are not necessarily sine waves, by the way.

>resulting sine wave is an average of those two frequencies.

This is incorrect. The resulting wave is not a sine wave; it is the sum of both waves, and it retains the frequency information of both. As the other reply describes, that frequency information is picked up by your ear and then by your brain.

One of the key concepts is that waves do not “average”. If you have two sine waves of wavelength x and 0.5x, they do not average to a sine wave of wavelength 0.75x. Rather, they form a complex wave that still repeats with wavelength x but is no longer sinusoidal. This is hard to describe in words, so feel free to plot it and see for yourself (https://www.mathopenref.com/graphfunctions.html).

f(x) = sin(x) for your first sine wave

g(x) = sin(2*x), or any other integer multiplier, for a different frequency

h(x) = sin(2*x)+sin(x) to simulate the “harmony” or the two frequencies together.

You’d see that the resulting wave shows elements of both waves, but it is not an average in any sense when it comes to wavelength/frequency information. Only the amplitudes combine, point by point.
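If you would rather check this with numbers than by eye, here is a minimal Python sketch using the same f, g, and h. The `averaged_frequency` helper and the sample count `N` are my own illustrative additions: the helper stands in for the mistaken "single sine at the in-between frequency" idea so the two can be compared.

```python
import math

def f(x):
    return math.sin(x)

def g(x):
    return math.sin(2 * x)

def h(x):
    # The two waves played together: a point-by-point sum.
    return f(x) + g(x)

def averaged_frequency(x):
    # Hypothetical "average" wave: one sine at the in-between frequency.
    return math.sin(1.5 * x)

N = 1000  # number of sample points across one period of f (arbitrary choice)
xs = [2 * math.pi * k / N for k in range(N)]

# h repeats every 2*pi, the period of the slower wave f.
max_period_error = max(abs(h(x + 2 * math.pi) - h(x)) for x in xs)

# h is nowhere near a single sine at the in-between frequency.
max_gap = max(abs(h(x) - averaged_frequency(x)) for x in xs)

# A pure sine peaks at 1; h peaks noticeably higher.
peak = max(h(x) for x in xs)

print(f"period error: {max_period_error:.2e}, gap: {max_gap:.2f}, peak: {peak:.2f}")
```

The period error comes out at essentially zero, while the gap to the "averaged" sine is large, which is the point: the sum keeps the period of the slower wave and never collapses into one in-between tone.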

Fun fact: whether two notes sound harmonious or dissonant has to do with whether their frequencies form a simple whole-number ratio. For example, C and G harmonize because their frequencies are in (very nearly) a 2:3 ratio.
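To put rough numbers on the C and G example, here is a small sketch. It assumes twelve-tone equal temperament with A4 = 440 Hz (a common tuning convention, not something stated in this thread), where C4 is 9 semitones below A4 and G4 is 2 semitones below it. The function name `equal_tempered` is made up for illustration.

```python
from fractions import Fraction

A4_HZ = 440.0  # assumed reference pitch

def equal_tempered(semitones_from_a4):
    # Each semitone multiplies the frequency by 2 ** (1 / 12).
    return A4_HZ * 2 ** (semitones_from_a4 / 12)

c4 = equal_tempered(-9)  # C below A4
g4 = equal_tempered(-2)  # G below A4

ratio = g4 / c4
# Closest fraction with a small denominator (the limit of 10 is arbitrary).
simple = Fraction(ratio).limit_denominator(10)

print(f"C4 = {c4:.2f} Hz, G4 = {g4:.2f} Hz, ratio = {ratio:.4f}, roughly {simple}")
```

Under these assumptions the ratio lands just under 1.5, so 2:3 is a very close approximation rather than an exact value on an equal-tempered instrument.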

Your speaker is not outputting a sine wave. It’s vibrating in such a way that it creates complex and ever-changing waveforms that your ear interprets as separate sounds.

One way to look at it: a microphone is usually a single moving diaphragm. So how can a microphone pick up multiple sounds from multiple sources? Because those sounds make the diaphragm move in very complex ways. That motion is converted to an electrical signal, and when the signal goes out to a speaker, the process is simply reversed. So if a mic can pick up multiple sounds at once, a speaker can spit them back out.

The average of two sine waves with different frequencies is **not** a sine wave: [this is how it looks](https://www.desmos.com/calculator/yciw4ps3tm). While it looks “siney”, it clearly has unequal peaks. It is also possible to “decompose” it back into the original sines. Our ears have mechanical “sine decompositors” inside: parts that each respond mainly to one specific frequency, so each one picks up essentially one sine wave. The brain gets the sines already decomposed.
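For the curious, the "decompose it back" step can be sketched in Python with a naive discrete Fourier transform. To be clear, this is a mathematical stand-in, not a model of how the ear works mechanically. The sample rate, the duration, and the two note frequencies (440 Hz and 660 Hz, a 2:3 pair) are all assumptions picked so the numbers come out clean.

```python
import cmath
import math

SAMPLE_RATE = 4000      # samples per second (assumed)
N = 400                 # 0.1 s of signal, so bins are 10 Hz apart
F1, F2 = 440.0, 660.0   # two notes in a 2:3 ratio (assumed values)

# What the speaker plays: the point-by-point sum of the two sines.
signal = [
    math.sin(2 * math.pi * F1 * n / SAMPLE_RATE)
    + math.sin(2 * math.pi * F2 * n / SAMPLE_RATE)
    for n in range(N)
]

def dft_magnitude(samples, k):
    """Normalized magnitude of DFT bin k (naive, O(N) per bin)."""
    n_samples = len(samples)
    total = sum(
        s * cmath.exp(-2j * math.pi * k * n / n_samples)
        for n, s in enumerate(samples)
    )
    return abs(total) / n_samples

bin_hz = SAMPLE_RATE / N  # frequency spacing between bins
magnitudes = {k * bin_hz: dft_magnitude(signal, k) for k in range(N // 2)}

# The two frequencies carrying the most energy.
strongest = sorted(sorted(magnitudes, key=magnitudes.get, reverse=True)[:2])
print("strongest frequencies (Hz):", strongest)
print("energy at the 'average' 550 Hz:", magnitudes[550.0])
```

In this setup the two strongest bins sit at the original note frequencies, and the bin at the in-between 550 Hz is essentially empty: the combined wave still contains both notes and nothing at their "average".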

Sound waves always overlap (superimpose). But the ear is able to isolate specific frequencies from the superimposed sound waves. Different parts of the cochlea (a snail-shaped, fluid-filled organ in the inner ear), and different groups of hair-like cells within it, are relatively more sensitive to different frequencies. This allows the sensory organ, and in turn the brain, to isolate individual notes from the superimposed wave.

You can find more details here: https://en.wikipedia.org/wiki/Auditory_system?wprov=sfla1

Also check the humans section here, for a relatively shorter and simpler explanation: https://en.wikipedia.org/wiki/Hearing_range?wprov=sfla1