So I understand the basic principles behind noise cancellation. You essentially use a microphone to record incoming sound waves and create an inverse wave that destructively interferes with the initial wave, thus, cancelling it out. But I don’t understand, practically, how this is done.
Let’s assume the sound wave makes contact with the microphone in the AirPod, which analyses the wave and shoots out an inverse wave, but by that point – the initial sound wave would surely have already reached my ears. The AirPod basically needs to cancel the sound wave before it moves roughly a centimetre or it’s too late.
The speed of sound in air is about 343 metres per second, or 34,300 centimetres per second; this means the AirPod has 1/34,300 of a second, or ~0.03 milliseconds, to do these operations and cancel the wave. That just seems absurd to me for such a tiny chip in the bloody AirPod.
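To make that arithmetic concrete, here is a rough sketch of the time budget. It assumes the 343 m/s figure above; the function name and the second distance (2.5 cm, roughly an inch) are just illustrative choices.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # figure quoted in the question (air)


def travel_time_us(distance_cm: float) -> float:
    """Time, in microseconds, for sound to cover distance_cm centimetres."""
    distance_m = distance_cm / 100.0
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1e6


# Illustrative distances only: 1 cm (the question's guess) and ~1 inch.
for distance_cm in (1.0, 2.5):
    print(f"{distance_cm} cm takes about {travel_time_us(distance_cm):.1f} microseconds")
```

One centimetre works out to roughly 29 microseconds, which matches the ~0.03 milliseconds above.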
Someone fix my confusion please.
The microphones pick up the incoming sound and run a process called Fourier analysis to identify the fundamental notes and the *timing* (or *phase*) of those notes. The notes are simple sine waves at different frequencies that, when added together, make up the final sound.
They then attempt to play the sum of the opposite waves at the right frequency *and* timing (phase) to cancel out that *actual* wave – not the repeated drone of that wave but that actual wave it heard. But this has to be synchronised correctly with the input wave (or it only makes things worse).
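A small sketch of why the timing matters so much, assuming a single pure tone of amplitude 1 and an anti-sound of exactly matching loudness (both simplifications of mine, not something a real earbud gets). It uses the identity that `sin(x) - sin(x + phi)` has peak amplitude `2*|sin(phi/2)|`, and checks that against a brute-force scan.

```python
import math


def residual_amplitude(phase_error_deg: float) -> float:
    """Peak amplitude left when a unit sine is summed with an inverted
    copy whose phase is off by phase_error_deg degrees (closed form)."""
    phi = math.radians(phase_error_deg)
    return 2.0 * abs(math.sin(phi / 2.0))


def residual_amplitude_numeric(phase_error_deg: float, steps: int = 10000) -> float:
    """Same quantity, found by scanning one full cycle sample by sample."""
    phi = math.radians(phase_error_deg)
    return max(
        abs(math.sin(t) - math.sin(t + phi))
        for t in (2.0 * math.pi * k / steps for k in range(steps))
    )


for error_deg in (0, 10, 60, 180):
    print(f"phase error {error_deg:3d} deg: residual {residual_amplitude(error_deg):.3f}")
```

With no phase error the residual is zero. At 60 degrees the leftover is as loud as the original sound, and beyond that the "cancellation" makes things louder, up to double at 180 degrees.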
It can do this because it processes the signal so fast, and because electrical signals travel over copper wires so much faster than sound travels through air (roughly 300 million m/s versus roughly 300 m/s). So it can identify the sound and start playing the anti-sound at the right time, before the wave passes the point in front of your ear where the speaker is. It's listening about an inch away from your eardrum, and calculating and playing the anti-sound in the time the sound takes to travel maybe half an inch through air.
And it's doing this continuously, listening to each wave in real time as it actually arrives and actively cancelling each constantly varying wave.
Now unfortunately, due to various theoretical limits of the mathematical model used, it needs to listen to the sound for some time (I think it's for at least 1.5 wavelengths of each fundamental note it can identify); this is how Fourier analysis works. You also need to be sampling the sound at a rate of at least twice the frequency of each note you want to identify through the analysis.
If you have a 100 Hz wave, one full cycle is 10 ms long. So if you sample the signal at 200 Hz, invert it and reproduce it with a lag of less than 1 ms, you can do that. If you have a 1 kHz wave (which is "mid range" in audio), one full cycle is only 1 ms long. You'd have to be sampling the sound at 2 kHz and reproducing the signal in phase in less than 0.1 ms. As the frequency increases, this reaction speed goes from difficult to impossible. And maintaining phase coherence is not that easy at higher frequencies (amplitude difference is not such an issue) when you don't know the precise distance between the mic and the speaker.
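The numbers above follow a simple pattern, sketched below. The 10% figure is my reading of the examples (1 ms out of 10 ms, 0.1 ms out of 1 ms), not a hard engineering rule, and the function names are made up for illustration.

```python
def period_ms(frequency_hz: float) -> float:
    """Length of one full cycle, in milliseconds."""
    return 1000.0 / frequency_hz


def min_sample_rate_hz(frequency_hz: float) -> float:
    """Twice the tone's frequency, the minimum sampling rate mentioned above."""
    return 2.0 * frequency_hz


def lag_budget_ms(frequency_hz: float, fraction: float = 0.1) -> float:
    """Allowed lag, taken as a fraction of one period.
    fraction=0.1 is an assumption inferred from the examples in the text."""
    return fraction * period_ms(frequency_hz)


for f in (100, 1000, 10000):
    print(
        f"{f} Hz: period {period_ms(f):.3f} ms, "
        f"sample at >= {min_sample_rate_hz(f):.0f} Hz, "
        f"lag under {lag_budget_ms(f):.4f} ms"
    )
```

Every tenfold increase in frequency cuts the allowed lag by a factor of ten, which is why the budget runs out quickly above the midrange.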
So in the time it takes sound to travel that far, between the mic and the speaker, even if it could process the info in absolutely zero time, there’s a limit on the frequencies that can be effectively cancelled – low notes are easy but midrange and higher frequencies much less so.
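Here is a rough sketch of that frequency limit, assuming a pure tone, an anti-sound of perfectly matched loudness, and a fixed timing error that the earbud cannot compensate for. The 30 microsecond value is a hypothetical number picked for illustration, not a measured figure for any product.

```python
import math


def residual_db(frequency_hz: float, timing_error_s: float) -> float:
    """Level of the leftover sound relative to the original, in dB.
    Negative means quieter than the original, positive means louder."""
    phase_error = 2.0 * math.pi * frequency_hz * timing_error_s
    residual = 2.0 * abs(math.sin(phase_error / 2.0))
    if residual == 0.0:
        return float("-inf")  # perfect cancellation
    return 20.0 * math.log10(residual)


def break_even_hz(timing_error_s: float) -> float:
    """Lowest frequency where the leftover is as loud as the original
    (the point where the phase error reaches 60 degrees)."""
    return 1.0 / (6.0 * timing_error_s)


TIMING_ERROR_S = 30e-6  # hypothetical uncompensated timing error

for f in (100, 1000, 5000, 10000):
    print(f"{f:5d} Hz: {residual_db(f, TIMING_ERROR_S):+.1f} dB")
print(f"break-even near {break_even_hz(TIMING_ERROR_S):.0f} Hz")
```

Under these assumptions the same timing error that barely affects a 100 Hz hum leaves a 5 kHz tone almost untouched and makes a 10 kHz tone louder.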