Your phone encodes a very short chunk of sound, a few dozen milliseconds long. Then it sends that chunk to the earphones very quickly, typically in dozens to hundreds of microseconds. The earphones start playing it, and because buffering, encoding, sending, and decoding a chunk all take less time than the chunk takes to play, the next chunk is always ready before the current one ends, so you don't notice the delay.
It's easier for music, but a little more complicated for speech. For speech, they use smaller chunks of sound and a more aggressive codec (which works somewhat like the compression algorithms in file archivers), so the delay stays below 50 ms. These high-compression algorithms are also the reason why music sounds much better than speech in BT headphones (you can dive into the details by reading about aptX vs SBC).
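To make the timing argument concrete, here is a minimal sketch in Python. The function names and all the numbers are made up for illustration; they are not measurements of any real phone, codec, or headset. It only checks the two conditions described above: each chunk has to be processed faster than it plays, and for speech the total delay should stay under roughly 50 ms.

```python
# All times are in microseconds. Every value used below is an illustrative
# assumption, not a measurement of a real codec or headset.

def pipeline_keeps_up(chunk_us, encode_us, send_us, decode_us):
    """True if one chunk is encoded, sent, and decoded faster than it plays."""
    return encode_us + send_us + decode_us < chunk_us


def first_sound_delay_us(chunk_us, encode_us, send_us, decode_us):
    """Rough delay for live audio: collect one chunk, then process it once."""
    return chunk_us + encode_us + send_us + decode_us


# Hypothetical speech settings: 20 ms chunks, 2 ms to encode,
# 0.3 ms to send, 2 ms to decode.
speech = dict(chunk_us=20_000, encode_us=2_000, send_us=300, decode_us=2_000)

print(pipeline_keeps_up(**speech))               # playback never starves
print(first_sound_delay_us(**speech) < 50_000)   # stays under the 50 ms budget
```

With these assumed numbers the processing takes 4.3 ms per 20 ms chunk, so the earphones always have the next chunk ready. Making the chunks smaller lowers the delay but leaves less time to process each one, which is why speech is the harder case.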