If a social media platform is running smoothly, but the engineers leave, why can’t a platform continue to run on autopilot?


I guess this is applicable to any social media platform or other similar systems. Is it because there are always bugs to address, so it’s never really running smoothly, or other reasons?

Because, like a plane, it will run into problems, both on its own and from surrounding conditions. If you don’t keep the whole thing maintained, the wrong problem left unchecked can break it apart completely.

Operating systems update. The app needs to update along with them or it won’t work. Plus, security updates are needed. At a minimum.

Who keeps it up to date with new hardware and software? The whole rest of the internet will continue to move forward. How long until their app no longer works on phones, or their website displays disjointedly on modern browsers?

What happens when some little thing goes wrong, as is often the case with computers, and nobody’s there to fix it?

A site like Twitter is not fully self-contained. It uses many (probably thousands of) third-party libraries. These libraries are constantly being updated for new features, security fixes, stability, etc.

That means you need to frequently update your app to at the very least use the new libraries. Not doing so won’t break it right away, but sooner or later (hint: usually sooner) there will be a breaking change such as an older version being deprecated, or a field name being changed, that requires you to not only update the library you tell your program to use, but to make some changes internally as well.
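To make the "field name being changed" case concrete, here is a minimal Python sketch. Everything in it is made up for illustration: `fetch_profile_v1` and `fetch_profile_v2` stand in for two versions of a hypothetical third-party library, and the `user_name` / `username` fields are assumptions, not any real API.

```python
# Hypothetical third-party library, version 1: returns a "user_name" field.
def fetch_profile_v1(user_id: int) -> dict:
    return {"id": user_id, "user_name": "alice"}


# The same hypothetical library after an upgrade: the field is now "username".
def fetch_profile_v2(user_id: int) -> dict:
    return {"id": user_id, "username": "alice"}


# Application code written against version 1 of the library.
def display_name_old(profile: dict) -> str:
    return profile["user_name"]


# Application code after an engineer adapts it to the renamed field.
def display_name_new(profile: dict) -> str:
    return profile["username"]


def breaks_after_upgrade() -> bool:
    """Return True if the untouched app code fails against the upgraded library."""
    try:
        display_name_old(fetch_profile_v2(1))
    except KeyError:
        return True
    return False
```

Swapping in the new library version is a one-line change, but `display_name_old` then raises a `KeyError` until someone goes in and updates the app code too. Multiply that by thousands of libraries and it adds up to a steady stream of work.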

Plus, anything running at the scale of Twitter has a whole lot of infrastructure supporting it, usually in the cloud, and that infrastructure needs specific types of engineers (DevOps, DevSecOps, etc.).

The site is running smoothly *because* all the staff are constantly doing things. And it’s not just the engineers. Moderators are removing bad content, lawyers are responding to requests from governments, project managers are making sure projects run on time, and accounting staff are paying all the bills.

It’s like saying “this hotel is running very smoothly. Why would it matter if 80% of the staff left?” It’s the constant, almost invisible effort of the humans that keeps it going. Sure, the building isn’t going to fall down. But there’s not going to be enough staff left to wash and change the sheets, make guest keys, change the air filters, start the giant coffee pots in the morning, receive deliveries of soap, or pay the electric bill.

There’s a whole class of people called Site Reliability Engineers (SREs) whose entire job is to keep large websites working. Here’s a fascinating thread from an experienced SRE listing all the ways a large tech company can collapse: