I always wonder how a server, like a Google server, can receive thousands of requests from people at once! I understand the signals are passed through fiber-optic cables, but won't those signals collide? And how can one small wire handle so many requests at once? In my mind a wire can only take one request at a time. I know this happens at close to light speed, but still, it's very hard to understand.
A single CPU core can do one operation at a time. When a server gets a request, it starts handling it. Handling often means fetching data from a database on another server, and that round trip across the network takes time, during which the core would otherwise sit idle. Instead, the first request is suspended and the core is free to handle another one, and this keeps happening over and over. As a result, a request can take 500ms from start to finish but consume only 1ms of actual compute time on the core. This is called concurrency, and it is why one core can handle 1000 requests in 1000ms, or 1 second.
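To make the idea concrete, here is a minimal sketch in Python using `asyncio`, which runs everything on a single thread. The names `handle_request`, `serve`, and `run_demo` are made up for illustration, and the 0.5 second sleep is just a stand-in for waiting on a database; a real server would be doing actual network I/O there.

```python
import asyncio
import time


async def handle_request(request_id: int) -> int:
    # Stand-in for waiting on a database on another server (assumed 500ms).
    # While this request is suspended here, the single thread is free to
    # start or resume other requests.
    await asyncio.sleep(0.5)
    return request_id


async def serve(n_requests: int) -> tuple[int, float]:
    start = time.perf_counter()
    # Start all requests and wait for every one of them to finish.
    results = await asyncio.gather(
        *(handle_request(i) for i in range(n_requests))
    )
    elapsed = time.perf_counter() - start
    return len(results), elapsed


def run_demo(n_requests: int = 1000) -> tuple[int, float]:
    return asyncio.run(serve(n_requests))


if __name__ == "__main__":
    handled, elapsed = run_demo()
    print(f"handled {handled} requests in {elapsed:.2f}s")
```

If the requests ran strictly one after another, 1000 of them at 500ms each would take about 500 seconds. Because they overlap while waiting, the whole batch should finish in roughly the time of a single request, plus some overhead.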
However, at some point you reach a limit: requests queue up in memory faster than they can be processed, until the server runs out of CPU and memory and crashes. At that point you can either add cores to make the server bigger (scaling up) or scale out with multiple identical servers. With multiple servers you need another layer in front of them, called a load balancer, to route traffic to the instances. There are many load-balancing strategies, but the simplest is round robin, where each request is routed to the next server in line. Those servers can also have multiple cores, so you could have 2 servers with 4 cores each or 8 servers with 1 core each. Either way, you now have 8 cores that, at 1000 requests per second each, can handle 8000 requests per second.
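Round robin is simple enough to sketch in a few lines of Python. This is only an illustration: `RoundRobinBalancer`, `distribute`, and the server names are hypothetical, and a real load balancer also has to deal with health checks, timeouts, and servers of different sizes.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in line."""

    def __init__(self, servers: list[str]) -> None:
        if not servers:
            raise ValueError("need at least one server")
        # cycle() repeats the list forever: s1, s2, s1, s2, ...
        self._next_server = cycle(servers)

    def route(self, request_id: int) -> str:
        # Round robin ignores the request itself and just takes turns.
        return next(self._next_server)


def distribute(n_requests: int, servers: list[str]) -> Counter:
    """Route n_requests and count how many each server received."""
    balancer = RoundRobinBalancer(servers)
    return Counter(balancer.route(i) for i in range(n_requests))


if __name__ == "__main__":
    print(distribute(8000, ["server-1", "server-2"]))
```

With 2 servers and 8000 requests, each server ends up with 4000, so the load is split evenly as long as the requests cost about the same.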
There are additional systems that manage load with auto scaling: if average CPU use is high, additional cores or servers are added automatically. This is why retailers don't get knocked offline on Black Friday, and also why running Twitter is so hard: if a post goes viral, all the servers responsible for that viral content have to scale out quickly, and then scale back in properly so they don't waste resources and money.
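The scaling decision itself can be pictured as a small rule. The sketch below is an assumption-heavy illustration, not how any particular cloud provider does it: the function name `desired_servers`, the 70% and 30% thresholds, and the one-server-at-a-time step are all made up for the example.

```python
def desired_servers(
    current: int,
    avg_cpu: float,
    min_servers: int = 1,
    max_servers: int = 10,
    scale_out_above: float = 0.70,  # assumed threshold, not a standard
    scale_in_below: float = 0.30,   # assumed threshold, not a standard
) -> int:
    """Return how many servers to run, given average CPU use (0.0 to 1.0)."""
    if avg_cpu > scale_out_above:
        target = current + 1  # busy: scale out
    elif avg_cpu < scale_in_below:
        target = current - 1  # mostly idle: scale in to save money
    else:
        target = current      # comfortable: leave it alone
    # Never go below the minimum or above the budgeted maximum.
    return max(min_servers, min(max_servers, target))


if __name__ == "__main__":
    print(desired_servers(current=4, avg_cpu=0.90))
    print(desired_servers(current=4, avg_cpu=0.10))
```

Running a check like this every minute or so would grow the fleet during a spike and shrink it again afterwards. The gap between the two thresholds is there so the system does not flip back and forth when load hovers around a single cutoff.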