Why do servers “go down” for routine maintenance? Is there not a backup where traffic can route?


8 Answers

Anonymous 0 Comments

TLDR: Backup systems cost money… a lot of money

That really depends on how you build your system.

Doing rolling updates or zero-downtime changes is possible if your system is built for it, but not everyone can afford that, nor is it always feasible.

Large-scale operations like Google, Reddit, eBay, etc. do this. They update systems in such a way that only a handful of servers are down at a time, so the site can maintain its uptime.
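Here’s a rough sketch of what a rolling update looks like in practice. The server names and helper functions are made up for illustration; real setups use load balancers and orchestration tools, but the shape is the same:

```python
# Toy rolling update: take one server out of rotation at a time so the
# rest of the fleet keeps serving traffic. drain(), update(), healthy(),
# and restore() are hypothetical stand-ins for real tooling.

servers = ["web-1", "web-2", "web-3", "web-4"]

def drain(server):
    print(f"{server}: removed from load balancer, finishing in-flight requests")

def update(server):
    print(f"{server}: installing new version")

def healthy(server):
    print(f"{server}: health check passed")
    return True

def restore(server):
    print(f"{server}: added back to load balancer")

for server in servers:
    drain(server)        # stop sending this server traffic
    update(server)       # do the maintenance while it's idle
    if healthy(server):  # verify it works before it serves users again
        restore(server)  # then, and only then, move on to the next one
```

At any point only one server out of four is down, so users never notice.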

Smaller companies that only have one server can’t afford to multiply their costs tenfold to get that kind of redundancy.

Sometimes systems aren’t designed with that in mind, and it becomes increasingly difficult to bolt that functionality on later as the system grows.

Even larger systems often have to perform controlled shutdowns because they have a single point of failure, like a database. When the database has to be taken offline to be updated, you have to take down everything.

Anonymous 0 Comments

Not always. Some companies are big enough that they take servers down all the time for maintenance and you’d never know. For example, how often does Google go down for maintenance? Never. But they have thousands and thousands of servers.

Other companies don’t have that kind of infrastructure, or they have some back end that can’t be accessed while it’s undergoing maintenance. In that case, they go down for maintenance.

Anonymous 0 Comments

Money. It is expensive to set up and maintain redundant systems that can keep running while part of them is down for maintenance.

Anonymous 0 Comments

It’s totally possible to do things that way, but it’s not usually *necessary*.

As long as the site isn’t providing a super essential service, they can just schedule the downtime for a time when the fewest users will be impacted.

Anonymous 0 Comments

The infrastructure, both hardware and human, for high availability is expensive, and it gets more so the more uptime you commit to. Being up 99.9% of the time is massively cheaper than 99.99%, which is massively cheaper than 99.999%, and for something like, say, Visa, the cost is astronomical.
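To put rough numbers on those targets, here’s the downtime budget each one allows per year (simple arithmetic, nothing vendor-specific):

```python
# How much downtime per year each availability target permits.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

for availability in (0.999, 0.9999, 0.99999):
    downtime_minutes = SECONDS_PER_YEAR * (1 - availability) / 60
    print(f"{availability:.3%} uptime allows {downtime_minutes:,.1f} minutes down per year")
```

That works out to roughly 8.8 hours a year at three nines, 53 minutes at four nines, and just over 5 minutes at five nines. Every extra nine shrinks your maintenance window by a factor of ten, which is why the cost climbs so steeply.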

For most companies, going down briefly every now and then really isn’t a big deal, and they don’t bother. Their users will check back in a bit with little or no ill will or negative impact. Uptime-critical companies, like, say, Amazon, Google, Facebook, or Visa, pour huge resources into preventing any such gaps.

There’s more nuance: some changes are easier to do with failover (running the site or service on a backup while the main system is upgraded, then upgrading the backup once the main is back online) than others, and many don’t require any downtime at all. But the gist is that for a lot of things it just isn’t worth the cost.
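As a rough sketch of that failover pattern (the class and helpers here are invented for illustration, not any real load balancer API):

```python
# Toy blue/green failover: traffic is pointed at one environment while
# the other is taken down for upgrades. LoadBalancer and upgrade() are
# hypothetical stand-ins for real infrastructure.

class LoadBalancer:
    def __init__(self, active):
        self.active = active  # which environment currently receives traffic

    def switch_to(self, env):
        print(f"routing traffic to {env}")
        self.active = env

def upgrade(env):
    print(f"upgrading {env} while it receives no traffic")

lb = LoadBalancer(active="blue")
lb.switch_to("green")  # failover: the backup takes the traffic
upgrade("blue")        # the main system is upgraded with no user impact
lb.switch_to("blue")   # traffic returns to the upgraded main
upgrade("green")       # the backup is upgraded in turn
```

The catch is that you’re paying for two full environments just to avoid a maintenance window, which is exactly the cost trade-off described above.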

Anonymous 0 Comments

It depends on the company, how critical the server is, and what resources are available. Where I work, for example, space on the virtual host is at a premium, so unless we absolutely have to, we are not going to stand up a backup server while we perform maintenance. We opt instead to do our maintenance in off hours, when not many people are using the servers.

Anonymous 0 Comments

Sometimes it’s not the server itself that needs maintenance but the content on it, so the company needs to stop people from using it while that’s updated.

Anonymous 0 Comments

Usually a website has a database with all the important information about users and everything you see on the site (posts / comments / videos / ads for a social media website, or name / address / accounts / debit cards / transactions for a banking website).

Having multiple copies gives you a backup, but it also raises the possibility that the site gets “confused” if the copies end up with different data.

So to keep things simple, they often have only one copy of the database and take it offline. They upgrade the software, and maybe run some special software to convert database records from an old format to a new one (called a “database migration”).

Things get really confusing if your software has to be written to cope with changes being made simultaneously by an older or newer version of itself, or with database records that could be in either an old or a new format.
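To make that concrete, here’s a toy example (field names invented) of code written to tolerate both an old and a new record format during a live migration:

```python
# During a live migration, old records store a single "name" field while
# new ones split it into "first_name" / "last_name". Code that runs while
# the migration is in progress has to cope with both shapes.

def full_name(record):
    if "name" in record:  # old format, not yet migrated
        return record["name"]
    return f"{record['first_name']} {record['last_name']}"  # new format

print(full_name({"name": "Ada Lovelace"}))                        # old record
print(full_name({"first_name": "Ada", "last_name": "Lovelace"}))  # new record
```

Multiply that branching across every field and every version, and you can see why many companies prefer to just take the database offline and migrate everything in one go.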

Some companies like Google pay enormous amounts of money to have large, high-skill engineering teams, but a lot of other companies are unwilling or unable to accept the costs, headaches and risks.

Even if you’re willing to do it, it’s often not really possible if your software wasn’t originally designed to be upgraded while online. Basically, not just the software but your entire company needs to be built from the ground up with a “we’re online 24/7” mentality, and be willing and able to write its own software when other companies’ software doesn’t meet that standard.