It is worth considering the space being used: 15 exabytes was mentioned elsewhere in this thread. 1 exabyte = 1000 petabytes = 1,000,000 TB, so 15 exabytes is 15,000,000 TB. With those 15TB HDDs, that would be about 1,000,000 drives. You can try to imagine how much physical volume that is. The data centers pack them tightly together, maybe 15 in a “tray”, for accessibility/replacement. So that's roughly 67,000 trays, which is easily a large room packed floor to ceiling, before you allow room for air conditioning and cables. And allow room for redundancy: even 100 hard drives are going to have a failure every few months to a few years, so at this scale drive failures are routine. Some form of automation has to kick out the failing drives, and robots or humans are needed to install new ones.
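The back-of-envelope arithmetic above can be checked with a few lines of Python. This is only a sketch: the 15 drives per tray is the guess made above, and the 1% annual failure rate is an assumed figure chosen to sit inside the "every few months to a few years per 100 drives" range, not a measured number. The function names are made up for illustration.

```python
import math

TB_PER_EB = 1_000_000  # 1 exabyte = 1000 petabytes = 1,000,000 TB (decimal units)


def drives_needed(total_eb, drive_tb):
    """Drives required to hold total_eb exabytes, ignoring redundancy."""
    return math.ceil(total_eb * TB_PER_EB / drive_tb)


def trays_needed(drives, drives_per_tray):
    """Trays required if each tray holds drives_per_tray drives."""
    return math.ceil(drives / drives_per_tray)


def expected_failures_per_day(drives, annual_failure_rate):
    """Average drive failures per day for a given annual failure rate."""
    return drives * annual_failure_rate / 365


drives = drives_needed(15, 15)   # 15 EB on 15 TB drives
trays = trays_needed(drives, 15)  # assumed 15 drives per tray
print(drives)   # 1000000
print(trays)    # 66667
# Assumed 1% annual failure rate: roughly 27 dead drives per day.
print(round(expected_failures_per_day(drives, 0.01), 1))  # 27.4
```

Any replication or spare capacity multiplies these numbers further, so treat them as a lower bound.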
Now, because Google has lots of data centers, the data is going to be replicated: maybe not to every data center, but to a number of them.
Layered on top of this are the YT client (people playing stuff) and the YT core, which has to be able to search for and locate the actual items of interest to you: not just the movie/video you are interested in, but also the metadata (adverts, the suggested related items, etc.).
One can imagine some form of Google-style search engine sitting on top of this data. You log in, and the search engine tries to assemble the page you are viewing, based on who you are and your preferences. The search engine will have the information needed to fetch the metadata and, when you play something, to assign a server to send you the data in chunks. You don't need the entire movie/song at the point of playing, merely enough data to keep your experience smooth.
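To make the "in chunks" idea concrete, here is a minimal sketch of how a player might split a file into byte ranges and decide how many chunks to ask for next. It is not how YT actually does it; the function names, chunk sizes and buffer target are all hypothetical.

```python
import math


def chunk_ranges(total_bytes, chunk_bytes):
    """Yield inclusive (start, end) byte ranges that cover the whole file."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    for start in range(0, total_bytes, chunk_bytes):
        yield start, min(start + chunk_bytes, total_bytes) - 1


def chunks_to_prefetch(buffered_seconds, target_seconds, seconds_per_chunk):
    """Chunks still needed to bring the playback buffer up to the target."""
    missing = target_seconds - buffered_seconds
    if missing <= 0:
        return 0  # buffer is already full enough, request nothing
    return math.ceil(missing / seconds_per_chunk)


# A 10-byte "file" in 4-byte chunks: the last chunk is shorter.
print(list(chunk_ranges(10, 4)))      # [(0, 3), (4, 7), (8, 9)]
# 4 s buffered, aiming for 30 s, 5 s of video per chunk.
print(chunks_to_prefetch(4, 30, 5))   # 6
```

The point is that the server only ever has to send the next few ranges, so playback can start long before the whole file has arrived.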
With so much storage, so many computers, and so much air conditioning and electricity, you need a good-sized number of people to monitor for failing hardware, communications blockages or outages, hack attacks and much, much more.
It is impressive that it, and many other huge sites like Reddit/Amazon/Microsoft, can scale so well for so many people, but equally, the cost of keeping this all running depends on keeping people happy. If YT did a “Twitter” and alienated all its users, those data centers, people, storage, comms lines etc. would keep on costing money. So managing this requires very careful economic planning (do we grow? do we shrink?). Welcome to cloud computing, and why the skills are so much in demand.
Is Amazon the same or different? Yes and no. Amazon doesn't have a core business of serving up user content. Instead, much of its business is to host the cloud apps that we all use. The same economic issues arise: how quickly to grow/shrink, how quickly to update machines, how to handle hardware failures. But Amazon sells this as a service and prides itself on availability and technical functionality. Google has tried to do the same with Google Cloud Platform, but not as successfully. Microsoft has also done this, with more success than Google.
Managing a huge amount of real estate, looking at geographic tax breaks, and watching out for geopolitical issues that could cause real harm (think: Russia, Middle East, China) does not come cheap. Hence rivals to YT and Amazon are rare. It takes years to find a site and build a data center.