How does voice over wire or over internet work? How can we hear and understand another person in real time over such a distance via a digital transmission?


Honest question I cannot wrap my mind around. How the hell does this work?

From end to end: you speak into a microphone, the microphone's signal is digitized into 1s and 0s, a transmitter sends those 1s and 0s, a receiver picks them up and “translates” them back into identifiable audio, and a speaker plays that audio.

Voice is sound. Sound is a moving wave of pressure that varies over time. That means one can approximately represent a sound digitally by measuring its intensity at regular time intervals.

So now you have the sound as data, represented in numbers (and these numbers are themselves stored as ones and zeroes). There are a lot of numbers: 48,000 every second, assuming 48 kHz audio. But no worries, ordinary internet bandwidth is more than enough for this amount of data.
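To put a number on "a lot of numbers", here's a rough Python sketch of sampling one second of a pure tone and working out the raw data rate. The 16 bits per sample and the 440 Hz tone are my assumptions for illustration; real voice calls usually compress the audio well below this.

```python
import math

SAMPLE_RATE = 48_000   # samples per second (48 kHz, as above)
BITS_PER_SAMPLE = 16   # assumption: 16-bit samples, a common choice

def sample_tone(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Measure a pure tone's intensity at evenly spaced instants."""
    n = int(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

samples = sample_tone(440.0, 1.0)               # one second of a 440 Hz tone
bits_per_second = SAMPLE_RATE * BITS_PER_SAMPLE

print(len(samples))      # 48000 measurements
print(bits_per_second)   # 768000 bits per second, uncompressed mono
```

That's under 1 Mbit/s before any compression, which is why a typical connection handles it easily.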

On the other end, simply convert these numbers back to voltages. High numbers mean high voltage, and low numbers mean low voltage. These voltages drive a speaker cone, which moves proportionally to the voltage, which moves air proportionally as well, which then generates the sound you can hear.

Ooh, voltage=sound helped my brain a bit. I’ve asked this question before. I have a friend in sonar tech whom I honestly asked to break it down for me, and I still didn’t understand it. Voltage is a nice word, I like it and it’s helping. I would like to understand it literally, though. How the hell is the voice transferred over a wire? I do not hear data, I hear the actual voice, and in real time. How is it so fast? Smarter brains than mine, for sure.

Sound is basically oscillating air. A microphone turns oscillating air into an oscillating electric field (same basic thing as the power in your home). This makes electrons move back and forth in a wire. The electrons themselves don’t move very quickly, but the electric field travels down the wire at a large fraction of the speed of light, reaching the other end in a fraction of a second. There, a speaker does the opposite job of the microphone, turning the field back into oscillating air, hence sound.
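If you want to see why it feels instant, here's a back-of-the-envelope sketch of the travel time. The velocity factor of 0.66 (signal speed as a fraction of the speed of light) is an assumed ballpark for cable, not a measured value, and this ignores any switching or processing delay along the way.

```python
SPEED_OF_LIGHT_M_S = 299_792_458
VELOCITY_FACTOR = 0.66  # assumption: rough ballpark for signals in cable

def one_way_delay_ms(distance_km, velocity_factor=VELOCITY_FACTOR):
    """Time for the signal to cover distance_km, in milliseconds."""
    speed_m_s = SPEED_OF_LIGHT_M_S * velocity_factor
    return distance_km * 1000 / speed_m_s * 1000

# Hypothetical 5000 km call: the delay comes out to a few hundredths of a second.
print(one_way_delay_ms(5000))
```

A few hundredths of a second is short enough that a conversation still feels like real time.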

That’s the most basic telephone. Air shakes on one end, then electrons shake in a wire, then air shakes on the other end. Real phone systems involve switching (so you can dial a number) and multiplexing (so that more than one signal can share the same wire).

In digital communications, the electric field is converted into numbers describing it (by an ADC, an analog-to-digital converter), then the data is transmitted, and it’s later turned back into a field (by a DAC, a digital-to-analog converter).
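Here's a toy version of that round trip in Python, just to show the idea. The `adc` and `dac` functions are made-up names for illustration, and the 16-bit resolution and the -1.0 to 1.0 voltage range are assumptions; real converters are hardware chips, not code.

```python
def adc(voltage, bits=16):
    """Toy ADC: map a voltage in [-1.0, 1.0] to a signed integer code."""
    max_code = 2 ** (bits - 1) - 1           # 32767 for 16 bits
    clipped = max(-1.0, min(1.0, voltage))   # anything out of range is clipped
    return round(clipped * max_code)

def dac(code, bits=16):
    """Toy DAC: map an integer code back to a voltage in [-1.0, 1.0]."""
    max_code = 2 ** (bits - 1) - 1
    return code / max_code

code = adc(0.5)         # the number that gets transmitted
recovered = dac(code)   # the voltage rebuilt at the far end
print(code, recovered)
```

The recovered voltage isn't exactly 0.5, but the rounding error is far too small to hear, which is why the voice on the other end still sounds like the actual voice.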

I don’t think you’re asking how data is transmitted over the Internet, and even if you are, that’s a bit much for an answer in this subreddit.

Sound is a compression wave, like pushing and pulling air molecules. Those pushes and pulls of air create high and low pressure against a tiny diaphragm (I’m sure you’ve seen a speaker 🔈, and a microphone is much the same thing run in reverse). When that diaphragm moves a coil of wire near a magnet, it creates a voltage.

Now imagine that voltage wave, which goes up and down like an ocean wave, is sampled – thousands of times per second you measure the height of the wave and record it as a digital number.

We’ve found creative ways to make that height measurement data smaller so there’s less to move, and therefore faster to send. But then that data is sent a lot like Morse code – push the button down, 1. Let up, 0. Millions (not a typo) of times per second.
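To make the "1s and 0s" part less abstract, here's a small sketch that turns a few wave-height measurements into a string of bits and back again. The function names and the choice of 16-bit little-endian packing are my assumptions for illustration, and this skips the compression step entirely.

```python
import struct

def to_bits(codes):
    """Pack signed 16-bit measurements into a string of 1s and 0s."""
    raw = struct.pack(f"<{len(codes)}h", *codes)   # 2 bytes per measurement
    return "".join(f"{byte:08b}" for byte in raw)

def from_bits(bits):
    """Rebuild the measurements from the string of 1s and 0s."""
    raw = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))

heights = [0, 1, -1, 32767]    # hypothetical wave-height measurements
bits = to_bits(heights)
print(bits)                    # 64 characters, each one a "button down" or "button up"
print(from_bits(bits))         # [0, 1, -1, 32767]
```

The receiver gets exactly the same numbers back, so nothing is lost in the "Morse code" step itself.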

The receiving end accepts all those wave height measurements, knows the time intervals at which they were taken, and reconstructs an analog voltage. That voltage is connected to a speaker coil and moves the coil back and forth to create pressure waves in the air.