eli5 – How do spatial audio technologies like Dolby Atmos work in headphones, with only two drivers?

5 Answers

Anonymous 0 Comments

You only have two ears. Each of your ears can pick up one sound signal, i.e. changes in air pressure over time. No matter how many sources of sound there are in different places around you, in the end this is all the information you get: two signals measured at two points on either side of your head. So, to recreate the experience of sound coming from different points in space, all you need to do is recreate the two sound waveforms that these sound sources would create in your two ears.

How do you do this? The best way is to stick two microphones in your own ear canals to record the sound. That way you don’t have to artificially recreate anything – all the spatial information is already there. This works best if you record it in your own ears, since some of the spatial information in the sound waves comes from the way they interact with the outer part of your ear (the *pinnae* – i.e. the funny-shaped cartilage-and-skin protrusions that you picture when you think of ears). But it will also work if you record it in someone else’s ears, or on an artificial set of ears (a dummy head), and then play it back in your own ears. It just won’t be as accurate.

If you can’t use this method, then you have to take a recorded sound and insert the spatial information somehow. Spatial information in sound comes from a few things, but the most important ones are interaural time and loudness differences. Sound travels at about 340 m/s, and so, with your ears spaced about 20 cm apart, a sound coming from directly to your left will reach your left ear about 0.6 ms sooner than your right ear. A sound coming from right in front of you will reach both ears at the same time, and a sound coming from your right will get to your right ear about 0.6 ms sooner. So, by comparing the sounds in your two ears, your brain can tell how far to the left or right of you a sound was produced.
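To put a number on it, here’s a rough back-of-the-envelope sketch in Python of that time-difference calculation. The head width and speed of sound are just the approximate values used above, not measured ones:

```python
# Minimal sketch of the interaural time difference (ITD) for a distant
# source, using a simple geometric model with assumed constants.
import math

SPEED_OF_SOUND = 343.0  # m/s, roughly the speed of sound in air at 20 °C
EAR_SPACING = 0.20      # m, assumed distance between the two ears

def itd_seconds(azimuth_degrees: float) -> float:
    """Time difference between the ears.

    azimuth_degrees: 0 = straight ahead, +90 = directly to the right.
    Positive result means the sound reaches the right ear first.
    """
    azimuth = math.radians(azimuth_degrees)
    # Extra path length to the far ear, divided by the speed of sound.
    return EAR_SPACING * math.sin(azimuth) / SPEED_OF_SOUND

print(f"{itd_seconds(90) * 1000:.2f} ms")  # ~0.58 ms for a source hard right
print(f"{itd_seconds(0) * 1000:.2f} ms")   # 0.00 ms for a source straight ahead
```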

Loudness follows a similar principle. Sounds are louder in the ear they are closer to – partly because of the extra distance, but mostly because your head itself blocks (or "shadows") the sound on its way to the far ear, especially at higher frequencies. So, by comparing the loudness of the same sound picked up in your two ears, your brain can tell which ear the sound was closer to, and thus how far to the left or right the sound source was located.
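Here’s a very rough sketch of how software could fake a left/right position by re-applying those two cues (a small delay plus a loudness difference) to a mono signal. The 6 dB level difference is just a placeholder I picked for illustration, not a measured value:

```python
# Crude ITD + ILD panner: delay and attenuate the signal in the "far" ear.
import numpy as np

def simple_binaural_pan(mono, sample_rate, azimuth_degrees, ild_db=6.0):
    """Return (left, right) channels with a rough ITD and ILD applied."""
    frac = np.sin(np.radians(azimuth_degrees))     # -1 (hard left) .. +1 (hard right)
    delay_samples = int(round(abs(frac) * 0.0006 * sample_rate))  # up to ~0.6 ms
    far_gain = 10 ** (-abs(frac) * ild_db / 20)     # quieter in the far ear

    near = np.concatenate([mono, np.zeros(delay_samples)])          # near ear: on time
    far = far_gain * np.concatenate([np.zeros(delay_samples), mono])  # far ear: late, quieter

    if frac >= 0:   # source on the right: right ear hears it first and louder
        return far, near
    return near, far  # source on the left

# Example: a 440 Hz tone placed 45 degrees to the right.
sr = 48000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440 * t)
left, right = simple_binaural_pan(tone, sr, azimuth_degrees=45)
```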

Using software, you can take a sound and recreate these timing and intensity differences, and thus introduce the desired spatial information. Of course, this only works for simulating spatial information in the left-right dimension. So how do you do front-back or top-bottom (i.e. how high up a sound source was, or how far in front of or behind you it was)? This is where the pinnae come in. The outer parts of your ear act like a directional filter: they let through (or reflect) different sound frequencies differently depending on where the sound is coming from. Your brain has learned how your particular ears do this, so by comparing the frequency profile of the same sound in your two ears, it can figure out where the sound must have come from to produce that particular filtered frequency signature.

However, this is not as accurate as localization based on interaural time and loudness differences. And, as I said, it is somewhat dependent on the shape of your ears. We all have roughly similar ears, of course, but there are subtle (or not-so-subtle) differences, and these frequency distortions depend not just on the shape of your ears but also on the individual shape and anatomy of your head. So it’s much harder to recreate these spatial effects accurately, but you can at least do it to a rough approximation by using a model that captures the frequency-distortion profile of the average person – this direction-dependent filtering is what audio engineers call a head-related transfer function (HRTF).
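For completeness, a minimal sketch of that last filtering step, assuming you have already loaded a pair of measured head-related impulse responses (the time-domain version of an HRTF) for the desired direction from some public dataset – how you load them is not shown here:

```python
# Fold the head/pinna filtering into a mono sound by convolving it with
# a pair of HRIRs measured for one direction (assumed to be loaded already).
import numpy as np

def apply_hrir(mono, hrir_left, hrir_right):
    """Filter a mono signal through each ear's impulse response."""
    left = np.convolve(mono, hrir_left)    # what the left ear would hear
    right = np.convolve(mono, hrir_right)  # what the right ear would hear
    return left, right
```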
