Modern cellular networks, as well as data-only networks like Wi-Fi, rely on some form of time sharing to connect to lots of phones at the same time; the classic version is called “Time Division Multiple Access”, or TDMA. Basically, the tower talks to each phone, one at a time, and switches which phone it is talking to thousands of times per second. So your phone may get 1 ms of communication time every 50 ms. During that 1 ms, your phone receives the most recent 50 ms of audio data, which it plays back while waiting for its next turn to receive data. This means there is a slight delay in the sound of a phone call, and added latency for data transmission compared to a wired network.
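To make the turn-taking arithmetic concrete, here is a rough sketch in Python. The 1 ms slot and 50 ms cycle come from the example above; the 13 kbit/s voice bitrate and all function names are assumptions chosen for illustration, not values from any particular standard's scheduler.

```python
# Illustrative TDMA arithmetic. Slot and frame lengths follow the example
# in the text; the voice bitrate is an assumed round figure.
SLOT_MS = 1       # how long each phone's turn lasts
FRAME_MS = 50     # how long until the same phone's turn comes around again
VOICE_KBPS = 13   # assumed bitrate of the compressed voice stream


def phones_per_frame(frame_ms: int = FRAME_MS, slot_ms: int = SLOT_MS) -> int:
    """How many phones can each get one turn per cycle."""
    return frame_ms // slot_ms


def active_phone(time_ms: int, frame_ms: int = FRAME_MS, slot_ms: int = SLOT_MS) -> int:
    """Index of the phone the tower is talking to at a given millisecond."""
    return (time_ms % frame_ms) // slot_ms


def bits_per_slot(voice_kbps: int = VOICE_KBPS, frame_ms: int = FRAME_MS) -> int:
    """Audio bits that must arrive in one turn to cover a whole cycle.

    kbit/s multiplied by ms gives plain bits.
    """
    return voice_kbps * frame_ms


def burst_rate_kbps(voice_kbps: int = VOICE_KBPS,
                    frame_ms: int = FRAME_MS,
                    slot_ms: int = SLOT_MS) -> float:
    """Data rate needed during the turn itself (bits per ms equals kbit/s)."""
    return bits_per_slot(voice_kbps, frame_ms) / slot_ms


if __name__ == "__main__":
    print("phones sharing the channel:", phones_per_frame())
    print("phone served at t=51 ms:", active_phone(51))
    print("bits delivered per turn:", bits_per_slot())
    print("burst rate during a turn (kbit/s):", burst_rate_kbps())
```

Under these assumed numbers, each phone only needs the channel 2% of the time, but while it has the channel it must move data 50 times faster than the voice stream itself.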
Each cell tower can only divide its time among so many different clients, usually about 100 per cell. To serve more than that, you need multiple cell base stations in the same area.
There are several techniques for serving more devices at once, and different cellular standards (3G, 4G, 5G, etc.) use different combinations.
Most of them are analogous to the problem of having a bunch of people in a room trying to talk.
Time division: everyone takes turns talking. Note that for mobile phones, we can send a large amount of data in a very short turn, like 1 ms.
Frequency division: different people speak at different pitches. The tower has multiple listeners, each of whom hears only one pitch.
Beam-forming: Like putting your hands around your mouth or ears to speak or listen in one particular direction, to better hear just one person in a crowded room.
Code division: It’s a bit like different people speaking different languages. However, the languages are specifically designed, mathematically, to be distinguishable when spoken on top of one another (unlike real natural languages).
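The "mathematically designed languages" in code division can be illustrated with a small sketch. It assumes length-4 Walsh codes (rows that are mutually orthogonal) and a perfectly synchronized, noise-free channel; the function names are made up for this example, and real systems add much more on top.

```python
# Toy code-division example with four users talking at the same time.
# Assumption: length-4 Walsh codes; any two different rows multiply
# chip by chip and sum to zero, which is what keeps users separable.
WALSH_4 = [
    [1,  1,  1,  1],
    [1, -1,  1, -1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
]


def spread(bit: int, code: list[int]) -> list[int]:
    """Encode one bit (+1 or -1) by multiplying it into the user's code."""
    return [bit * chip for chip in code]


def combine(signals: list[list[int]]) -> list[int]:
    """What the air carries: every user's signal added together."""
    return [sum(chips) for chips in zip(*signals)]


def despread(received: list[int], code: list[int]) -> int:
    """Recover one user's bit by correlating the mix with that user's code."""
    correlation = sum(r * c for r, c in zip(received, code))
    return 1 if correlation > 0 else -1


if __name__ == "__main__":
    bits = [1, -1, -1, 1]  # one bit per user, sent simultaneously
    on_air = combine([spread(b, c) for b, c in zip(bits, WALSH_4)])
    recovered = [despread(on_air, c) for c in WALSH_4]
    print("mixed signal:", on_air)
    print("recovered bits:", recovered)
```

The mixed signal looks nothing like any single user's transmission, yet each listener gets its own bit back because the other users' codes cancel out in the correlation.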
These techniques can be combined to allow towers to serve many devices simultaneously. The other thing to note is that in dense urban areas, operators deploy a lot of towers, and everyone speaks at a lower volume to a closer tower.