ELI5: What happens on the server when someone chooses to watch a movie on Netflix, and how does the architecture allow 500k other people to watch the same movie almost concurrently?

19 Answers

Anonymous 0 Comments

So far I don’t see any really stellar explanations… so here’s my swing. First I’ll get into some tech details, then restate them ELI5-style so readers can grasp both versions of the answer.

Most modern streaming, Netflix included, runs over plain HTTP on top of TCP connections (not UDP, and not multicast). The player on your device requests the video as a series of small file chunks, and each viewer gets their own ordinary connection with the usual handshake and error checking. The speed comes from where the chunks are stored. Netflix uses cloud infrastructure (AWS) for things like login, browsing, and deciding which server to send you to, while the video files themselves are served from its own content delivery network (CDN), called Open Connect, which places servers full of video close to viewers, often inside internet providers' networks. Fun fact: MLB (yes, Major League Baseball) was an early pioneer of large-scale streaming through its tech arm, MLB Advanced Media, in the early 2000s, well before the end of Blockbuster.

Now, to rephrase this to fit into an ELI5:

Most streaming apps like Netflix use technology that can be likened to a string-can phone (the string being your internet connection to the server).

Imagine your home has a string-can phone, just like you would have had when you were 5. Instead of the other end of the string going to your neighbor's (and best friend Timmy's) house, it connects to a physical server that holds Netflix's videos, usually sitting in a data center near you or even inside your internet provider's network (this is part of what people loosely call “the cloud”).

You can pick up the can at any time and ask whoever’s listening on the other side to send you some content (aka a “request” from you, the end client).

When you do that, the other end starts sending you what you requested over the connection, but instead of hearing some muffled audio, your TV/watching device interprets the data being sent and displays it as video content on your screen.

What’s being sent isn’t the whole movie/show/etc. but a chunk of it every few seconds, based upon where you are in the timeline of said content. Streaming in chunks like this was proven at scale early on by, among others, an internal team at MLB (if I remember correctly) in the early 2000s, and it is a big part of what makes a service like Netflix workable.
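To make the "a chunk every few seconds" idea concrete, here's a rough sketch in Python. The 4-second chunk length and the function names are my own assumptions for illustration, not anything Netflix has published.

```python
from typing import List

CHUNK_SECONDS = 4  # assumed chunk length; real services pick their own value


def chunk_index(position_seconds: float, chunk_seconds: int = CHUNK_SECONDS) -> int:
    """Return the number of the chunk that contains this point in the timeline."""
    if position_seconds < 0:
        raise ValueError("position cannot be negative")
    return int(position_seconds // chunk_seconds)


def chunks_to_prefetch(position_seconds: float, buffer_seconds: float,
                       chunk_seconds: int = CHUNK_SECONDS) -> List[int]:
    """List the chunk numbers needed to cover the next buffer_seconds of playback."""
    first = chunk_index(position_seconds, chunk_seconds)
    last = chunk_index(position_seconds + buffer_seconds, chunk_seconds)
    return list(range(first, last + 1))
```

So if you skip to 10 seconds in and the player wants 8 seconds buffered, it would ask for chunks 2, 3 and 4 and never touch the rest of the movie until it needs it.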

Each endpoint (the other side of the can) can connect to hundreds/thousands/tens of thousands of users simultaneously; its limit is known, and usage is tracked in real time by Netflix. As the usage increases (like during the pandemic or holidays), content providers like Netflix scale up the number of endpoints to make sure they can keep up with the demand in a particular region.
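Here's a toy version of "known limit, tracked usage" in Python. The `Endpoint` class and the "most spare room wins" rule are assumptions I made up to show the idea; the real routing logic is far more involved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    limit: int       # max simultaneous viewers this endpoint can handle
    active: int = 0  # viewers currently connected

    def has_room(self) -> bool:
        return self.active < self.limit


def assign_viewer(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Send a new viewer to the endpoint with the most spare capacity.

    Returns None when every endpoint is full, which is the signal to scale up.
    """
    open_endpoints = [e for e in endpoints if e.has_room()]
    if not open_endpoints:
        return None
    chosen = max(open_endpoints, key=lambda e: e.limit - e.active)
    chosen.active += 1  # usage is tracked as viewers are assigned
    return chosen
```

A `None` result here is the "we're out of room in this region" moment where more endpoints get added.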

You can think of endpoints like a virtual school library, filled with information (aka videos), that takes the requests from the string-can connections. When the number of connections reaches a particular saturation point, the whole library is digitally copied to a new location and can now serve more connections. Rinse and repeat as needed.

The system itself is very complex and quite robust when you step back and look at it from afar. Part of the beauty of building something like that is making the tech transparent to end users: what they see is it just working as expected.
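The "copy the library when it gets too busy" rule boils down to simple arithmetic. A quick sketch in Python, where the 10,000-viewer limit per copy and the 80% saturation point are numbers I picked purely as assumptions:

```python
def copies_needed(connections: int, per_copy_limit: int,
                  saturation_percent: int = 80) -> int:
    """How many library copies keep each copy at or below the saturation point."""
    if connections < 0 or per_copy_limit <= 0:
        raise ValueError("connections must be >= 0 and per_copy_limit > 0")
    if not 0 < saturation_percent <= 100:
        raise ValueError("saturation_percent must be between 1 and 100")
    # Viewers one copy may serve before we consider it saturated (at least 1).
    usable = max(1, per_copy_limit * saturation_percent // 100)
    # Ceiling division with integers; there is always at least one copy.
    return max(1, -(-connections // usable))


# 500k people watching the same movie, with the assumed numbers above
print(copies_needed(500_000, 10_000))  # prints 63
```

With those made-up numbers, the 500k viewers from the question would be spread over 63 copies of the library, and nobody's string goes quiet.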

If I’ve missed something significant, or if anyone wants more details, please feel free to let me know.
