What makes you sound like a robot when a phone or video call is lagging?

I get that it’s connection issues, but why the “robot voice” and not static or something?

2 Answers

Anonymous 0 Comments

It’s the difference between “analog” and “digital” technology.

In “analog” technology, the sound waves are converted to some other kind of wave-like energy (usually radio waves) before being sent over the air or wire. The thing that receives that signal converts the wave-like energy back into sound waves. The catch is that most wave-like energy (OK I’m just saying “radio” from now on) also exists naturally, so there is always some “background noise” around. The receiver has to do work to separate the signal it wants from that background noise. That also means that if the signal stops for some reason, the equipment ends up trying to convert the background noise to sound. Usually the result is static, because background noise is fairly random.

Imagine it like a simple noise cancelling microphone. Suppose its guts just try to pick the loudest noise the mic is picking up and get rid of all the other noise. That works fine when a person is talking. But when the person stops, the loudest noise becomes something like a fan or the room’s air conditioner, so the mic will start sending that along.

In “digital” technology, the sound waves are “sampled” and converted into digital data. You can think of each sample as the answer to “how strong was the sound at this moment?”, measured thousands of times per second. That “sample data” is sent along the wire/over the air, then converted back to sound waves at the other side.
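To make “sampling” a bit more concrete, here is a minimal Python sketch. The function name `sample_tone` and the numbers (a 440 Hz tone, 8,000 samples per second) are just assumptions for illustration, not what any particular phone or app does.

```python
import math

def sample_tone(freq_hz, sample_rate, duration_s):
    """Measure a pure tone's strength at evenly spaced moments in time."""
    count = round(sample_rate * duration_s)
    # Each entry answers "how strong was the sound at this moment?"
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(count)]

# Assumed values: a 440 Hz tone, 8000 samples per second, 10 ms of sound.
samples = sample_tone(freq_hz=440, sample_rate=8000, duration_s=0.01)
print(len(samples))   # number of measurements taken
print(samples[:3])    # the first few "strength" values
```

The list of numbers is all that gets sent; the receiver rebuilds the wave by playing those strengths back at the same rate.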

But what happens when that data stream gets interrupted is different. With an analog signal, there’s ALWAYS some kind of background noise that may get picked up as static. But when a digital audio stream stops, it’s as if the equipment is receiving “there was no audio at this moment”. No audio sounds like silence.

Sometimes only SOME of the data is lost. So, for example, there were supposed to be 44,000 samples of digital data for 1 second of speech, but only 28,000 of them arrived. All the equipment can do is create sound waves in the moments it has data and silence in the moments it doesn’t. That tends to make a person’s voice stutter or sound “robotic”. If we were able to draw the sound waves on paper, you’d see they look very jerky and pointy instead of looking like nice, smooth waves.
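Here is a rough Python sketch of that “jerky and pointy” effect. It assumes the samples travel in small packets and that a lost packet is simply played back as silence; the names `silence_lost_packets` and `biggest_jump`, the packet size, and the sample rate are all made up for the example.

```python
import math

SAMPLE_RATE = 8000   # samples per second (assumed for this example)
PACKET_SIZE = 160    # samples per packet, i.e. 20 ms each (assumed)

def make_tone(freq_hz, n_samples):
    """A smooth sine wave standing in for a voice."""
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(n_samples)]

def silence_lost_packets(samples, lost_packets, packet_size=PACKET_SIZE):
    """Replace every sample in a lost packet with 0.0 (silence)."""
    out = list(samples)
    for p in lost_packets:
        start = p * packet_size
        end = min(start + packet_size, len(out))
        for i in range(start, end):
            out[i] = 0.0
    return out

def biggest_jump(samples):
    """Largest change between two neighbouring samples."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))

clean = make_tone(440, 800)                       # 5 packets of audio
damaged = silence_lost_packets(clean, lost_packets=[1, 3])

print(biggest_jump(clean))     # small steps: a smooth wave
print(biggest_jump(damaged))   # a much bigger step where the audio cuts out
```

The sudden jumps to and from zero are the sharp corners you would see on paper, and they are what the ear hears as clicks and choppiness.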

That doesn’t tend to happen the same way with analog signals. Digital signals can “lose” data, which is best thought of as a period where there’s no signal at all. Analog signals are more likely to just get weaker, which makes them quieter. If an analog signal gets weaker than the background noise, the equipment will “miss” the signal and just play the static. But that doesn’t tend to interfere with the moment-to-moment reproduction of the sound wave the way lost digital data does; the degradation is usually more constant.

Anonymous 0 Comments

Lossy compression. In order to save bandwidth, the transmitting computer converts the speech into a bunch of tones, cuts out those which matter the least, and sends the instructions for how to play the remaining ones to the receiving computer. This normally works amazingly well, and your voice can be transmitted using little data. But if the connection is slow, the compression is increased by cutting out more and more significant tones, making the voice sound more and more robotic. And if the connection drops completely for a moment, the computer may mask this by continuing to play the last tones it received until new ones are available, making for a droning buzz of the last syllable received. For short stutters, that’s often a better alternative than silence.
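A toy Python sketch of both ideas, in case it helps. Real speech codecs are far more sophisticated than this; the function names `keep_strongest` and `conceal_gaps`, the “tone” format, and all the numbers are invented for illustration.

```python
def keep_strongest(tones, budget):
    """Keep only the `budget` loudest tones from one short slice of speech.

    `tones` is a list of (frequency_hz, loudness) pairs.
    """
    kept = sorted(tones, key=lambda t: t[1], reverse=True)[:budget]
    return sorted(kept)  # put the survivors back in frequency order

def conceal_gaps(frames):
    """Replace each lost frame (None) with the last frame that arrived."""
    out = []
    last = []  # nothing to repeat before the first frame arrives
    for frame in frames:
        if frame is not None:
            last = frame
        out.append(last)
    return out

# One slice of a made-up voice, as (frequency, loudness) pairs.
voice = [(120, 1.0), (240, 0.6), (360, 0.3), (2400, 0.2), (3500, 0.05)]

print(keep_strongest(voice, 4))  # good connection: only the faintest tone is cut
print(keep_strongest(voice, 2))  # slow connection: only two tones survive

# Frames 2 and 3 never arrived, so the receiver repeats frame 1.
received = [keep_strongest(voice, 2), None, None, [(200, 0.9)]]
print(conceal_gaps(received))
```

With a small budget, the voice is rebuilt from just a couple of tones, which is where the flat, robotic quality comes from; the repeated frame is the droning buzz of the last syllable.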