The MacBook knows exactly what is playing through its speakers. When the mic picks up that same sound, the MacBook subtracts it from the mic input, cancelling it out. What gets transmitted is everything except what the MacBook itself is playing.
If the MacBook doesn’t know what sound is being played, for example a separate phone playing music, it can’t cancel it out.
Also, if the sound from the MacBook speakers is too loud, it can overwhelm the mic and prevent other sounds from being recorded on top.
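Here is a rough sketch of that subtraction in Python, just to make the idea concrete. It assumes the echo is nothing more than the played signal made quieter and shifted by a known delay; the function name `cancel_known_playback` and the `GAIN` and `DELAY` values are made up for illustration, and a real laptop has to estimate the echo rather than being handed it.

```python
import math

def cancel_known_playback(mic, playback, gain=1.0, delay=0):
    """Subtract a delayed, scaled copy of the known playback from the mic signal."""
    cleaned = []
    for n, sample in enumerate(mic):
        k = n - delay
        # Predicted echo at this instant (zero before the playback has arrived).
        echo = gain * playback[k] if 0 <= k < len(playback) else 0.0
        cleaned.append(sample - echo)
    return cleaned

# Toy signals: what the laptop plays, and a voice that only the mic hears.
playback = [math.sin(0.3 * n) for n in range(200)]
voice = [0.5 * math.sin(0.05 * n) for n in range(200)]

# Assumed echo: the speaker sound reaches the mic quieter and 3 samples late.
GAIN, DELAY = 0.6, 3
mic = [voice[n] + (GAIN * playback[n - DELAY] if n >= DELAY else 0.0)
       for n in range(200)]

cleaned = cancel_known_playback(mic, playback, gain=GAIN, delay=DELAY)

# Largest difference between the cleaned signal and the voice alone.
residual = max(abs(c - v) for c, v in zip(cleaned, voice))
```

Because the playback is known, `residual` comes out at essentially zero here. Music from a separate phone would not be in `playback`, so it would stay in `cleaned`.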
It's a cool little amplifier called the difference amplifier. It is one of the first amps you learn about in electrical engineering.
The difference amp has 2 inputs. It subtracts one input from the other, and only the difference between the 2 inputs actually gets amplified.
Anything that shows up identically on both inputs gets attenuated (its volume is greatly lowered).
This is also how people on a helicopter can talk through their headsets. Both headsets are picking up the sound of the rotors, so those cancel each other out, but only my mic picks up my voice, not your mic, so you clearly hear my voice over the rotors. It's a neat little device and just requires a couple of transistors to make one.
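As a toy model of the difference amp described above, here is an idealized version in Python. It assumes the rotor noise is exactly the same on both inputs and that the voice reaches only one of them; the gain of 10 and the sample values are invented for the example.

```python
def difference_amp(v_plus, v_minus, gain=10.0):
    """Ideal difference amplifier: amplifies only the difference of its inputs."""
    return gain * (v_plus - v_minus)

rotor_noise = [0.8, -0.5, 0.3, -0.9]   # assumed identical on both inputs
voice = [0.01, 0.02, -0.01, 0.03]      # reaches only the first input

# First input hears rotor + voice, second input hears rotor only.
out = [difference_amp(r + v, r) for r, v in zip(rotor_noise, voice)]
```

The loud rotor term drops out of every sample, and `out` is just the quiet voice scaled up by the gain (up to floating-point rounding). A real amplifier only rejects the shared signal approximately.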
In addition to other answers: modern Mac laptops contain an array of microphones and will use the integrated signal processors and ML accelerators to cut out ambient noise, feedback and all that other stuff. That’s why you can even use it for a video call in a busy cafe and your voice will probably still arrive loud and clear.
Everyone here seems to be explaining from first principles, without saying the actual term for this: “echo cancellation”.
Echo cancellation is just the usual term for any system that has both an audio output and input and wants to avoid feedback by comparing the input with its own output and removing that output if detected.
Lots of platforms actually support this at the OS level, particularly mobile devices and laptops that have built in speakers and microphones. Even when there isn’t OS support, there are libraries (like WebRTC) commonly used by browsers and standalone conference apps that do it in software.
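For a feel of how a software echo canceller can work, here is a minimal sketch using a normalized LMS (NLMS) adaptive filter, a common textbook technique. This is not a description of what WebRTC or any particular OS does internally; the function `nlms_echo_cancel`, its parameters, and the `true_path` room response are all hypothetical and chosen only for the demo.

```python
import random

def nlms_echo_cancel(mic, far_end, taps=8, mu=0.5, eps=1e-6):
    """Adaptively estimate the speaker-to-mic echo path and subtract the echo."""
    w = [0.0] * taps   # current estimate of the echo path (FIR weights)
    x = [0.0] * taps   # most recent speaker samples, newest first
    out = []
    for d, s in zip(mic, far_end):
        x = [s] + x[:-1]
        echo_est = sum(wi * xi for wi, xi in zip(w, x))
        e = d - echo_est                      # mic minus predicted echo
        norm = eps + sum(xi * xi for xi in x)
        # Nudge the weights in the direction that reduces the leftover echo.
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        out.append(e)
    return out, w

random.seed(0)
N = 4000
far_end = [random.uniform(-1, 1) for _ in range(N)]   # what the speaker plays
true_path = [0.0, 0.0, 0.5, 0.3, -0.1]                # hypothetical room response

# The mic hears only the echo in this demo (nobody is talking locally).
mic = [sum(h * far_end[n - k] for k, h in enumerate(true_path) if n - k >= 0)
       for n in range(N)]

out, w = nlms_echo_cancel(mic, far_end)
```

After it has had time to adapt, the leftover signal in `out` shrinks toward zero and `w` ends up close to `true_path`. Real cancellers also have to cope with both sides talking at once, which this sketch ignores.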
The computer simply KNOWS what it is outputting, so it just subtracts that from what the mic picks up before sending it to the person you are talking to.
It’s like knowing that you’re already running towards someone, so instead of slapping the fuck out of their hand as you normally would while standing, you just hold your hand in the air and wait to run up next to them.