If a social media platform is running smoothly, but the engineers leave, why can’t a platform continue to run on autopilot?

I guess this is applicable to any social media platform or other similar systems. Is it because there are always bugs to address, so it’s never really running smoothly, or other reasons?

38 Answers

Anonymous 0 Comments

Cloud servers can go down. Network congestion can cause problems and may need manual rerouting. A bug could throw an exception that takes down a prod service. A service could encounter unexpected behavior and may need maintenance. Autoscaling could fail. Reverse proxies may need cache resets. A soft attack could spam hella unfriendly shit. A hard attack could brick the whole company. None of these can be fully automated without a human-level AI that can conquer all sorts of edge cases.

Anonymous 0 Comments

If it’s running smoothly and you never notice any problems? That means the engineers are doing a great job… because there were problems, and you never saw them.

That stops when those engineers leave. If not immediately, then soon.

Anonymous 0 Comments

There are lots of good technical answers. Let’s talk commercially. Twitter operates in some ways like any traditional business. Advertisers pay money for campaigns. Past a certain spend they expect support, including strategy, targeting, etc. On the other side, Twitter has various suppliers who need to be paid: third-party code, cloud computers, existing facilities, etc.

Tech companies do run very lean, with typical employees being valued very highly. Someone is paid very well to ensure this is possible.

Anonymous 0 Comments

I would say there are people trying to hack the system and people wanting new features. Investors want features that make more money.

Anonymous 0 Comments

Old Windows 95 needed to be rebooted frequently. NT was amazing as it was stable enough to go a week. But even this Win11 machine, which I leave on all the time, gets weird and wonky and I need to reboot maybe once a month. Now imagine a whole server farm…

Oh, a friend says that with all those engineers and tech people, you gotta know there’s a half dozen who were running a diagnostic, or some logging app that they would close down when they returned the next day… only they didn’t return… and those logs are filling up a half dozen hard drives…
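
To make the runaway-logs scenario concrete, here is a minimal sketch of the kind of check somebody still has to be around to act on. The function names and the 90% threshold are my own assumptions for illustration, not anything a real platform is known to use.

```python
import shutil

def disk_usage_fraction(path="."):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total

def needs_attention(used_fraction, threshold=0.9):
    """True once usage reaches the alert threshold (0.9 is an assumed value)."""
    return used_fraction >= threshold

fraction = disk_usage_fraction(".")
if needs_attention(fraction):
    print(f"disk is {fraction:.0%} full, someone should clean up those logs")
else:
    print(f"disk is {fraction:.0%} full, fine for now")
```

The check itself is easy to automate. Deciding which logs are safe to delete, and whose forgotten diagnostic is writing them, is the part that needs a person.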

I saw a post that Elon diverted half of Tesla’s systems people to help keep it running. I hope no sedans full of holiday travelers drive off a bridge because no one was keeping their stuff up to snuff.

Anonymous 0 Comments

If you’re referring to Twitter, then we don’t know what anyone was doing. Maybe you don’t need a huge bloated team.

Generally speaking, computers need updates to function correctly. If you don’t update the computer, you could face issues. Updating the computer could mean you have to update all your code.

Anonymous 0 Comments

There’s a lot of fine answers, but I feel nobody has answered _why_, just how.

Here’s an example:

You have a website. It works. In theory it could run forever, since the code doesn’t change.

Reason 1: You have a small bug. Every 1 million user registrations your user registration page breaks and needs to be restarted.

Reason 2: You don’t actually handle money yourself. You have 15 different banks/companies in different countries that handle money transfers for you. Every 2-3 years they change something, due to laws in that country being changed. That means your site breaks, on average, 5 times per year.

Reason 3: The power went out. You need to push a button to start the site again.

Reason 4: Google changed their search algorithm again. If you don’t provide “the new data” nobody will ever see your site again – it’s now on page 2!!!

Reason 5: Your site actually had a really really complicated security issue. Luckily somebody fixed the tools you’re using for your website, but you still have to press the button to update. If you don’t, in 3 months there’ll be an easy-to-use app called “site-breaker-kit” that just takes over your website.

Anonymous 0 Comments

There are almost certainly errors / false assumptions / bugs that would remain uncorrected. A common example: a server unintentionally designed such that if there’s too much load on it, it drops in performance, actually lowering the total load the site can handle and spreading the load to other servers, which might drop in turn, leading to a cascade failure. You need a human to diagnose and correct such a problem. Or: a database was designed in such a way that collisions are unlikely instead of impossible, and that wasn’t detected during development, and a collision happens, breaking something. Or: the site uses another site’s API to interact with it, but the other site changes its API, and now the interaction is broken.
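
A toy simulation of the cascade described above, as a sketch under simplifying assumptions: load is split evenly across healthy servers, and an overloaded server drops out entirely rather than just slowing down. The capacities and load figures are made up.

```python
def simulate_cascade(capacities, total_load):
    """Spread total_load evenly over the healthy servers. Any server pushed
    past its capacity drops out and its share is redistributed.

    Returns a list of rounds; each round lists the servers that failed in it.
    An empty list means the cluster absorbed the load.
    """
    healthy = set(range(len(capacities)))
    rounds = []
    while healthy:
        per_server = total_load / len(healthy)
        failed = sorted(i for i in healthy if per_server > capacities[i])
        if not failed:
            break  # everyone left can carry their share
        rounds.append(failed)
        healthy -= set(failed)
    return rounds

capacities = [100, 120, 150, 200]  # hypothetical requests/sec per server

print(simulate_cascade(capacities, 400))  # 100 each, the weakest server just copes
print(simulate_cascade(capacities, 420))  # 105 each, the weakest server tips over
```

With a load of 400 nothing fails. With 420 the weakest server drops first, the survivors each inherit a bigger share, and within three rounds every server is gone. A 5% bump in traffic takes out the whole cluster, which is why someone needs to notice and intervene early.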

Simple sites _can_ run on autopilot, but big sites like Twitter are usually a big mess of many international servers, load balancers, CDNs, meta servers that manage dev credentials and other servers, whatever. You need intelligent troubleshooters, or at least you need complicated troubleshooting programs robust to handling many kinds of error. The usual solution in modern webshit is “if something seems to malfunction, restart it” instead.
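
The “restart it” approach can be sketched in a few lines. This is an illustration only: `supervise` and `make_flaky` are hypothetical names, and a real supervisor would restart whole processes or containers rather than call a Python function.

```python
import time

def supervise(task, max_restarts=3, delay_seconds=0.0):
    """Run task(); if it raises, restart it, up to max_restarts times.

    Returns (result, restarts_used). Raises RuntimeError once the restart
    budget is spent, which is the point where a human has to look.
    """
    restarts = 0
    while True:
        try:
            return task(), restarts
        except Exception as exc:
            if restarts >= max_restarts:
                raise RuntimeError(f"gave up after {restarts} restarts") from exc
            restarts += 1
            time.sleep(delay_seconds)

def make_flaky(failures_before_success):
    """Build a stand-in service that fails a fixed number of times, then works."""
    state = {"calls": 0}
    def task():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise ConnectionError("simulated malfunction")
        return "ok"
    return task

print(supervise(make_flaky(2)))  # recovers on the third attempt
```

Restarting hides transient faults nicely. It does nothing for a fault that comes back every time, so the loop eventually gives up and hands the problem to whoever is left.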

As for HR, moderation, lawyers, “reps”, consultants, etc.: Contrary to their self-flattering claims, those parts don’t actually matter one bit, and can go.

Anonymous 0 Comments

It actually can.

The reason you need hundreds of engineers is because you add complexity to the platform.
A set team of engineers can only support say 15 core services/functions.
With every single new line of code added you are adding complexity and possible bugs and failures. This is inevitable at the current state of programming.

Companies like Twitter, Google, Facebook, and other corporations tend to add hundreds and tons of unnecessary features, programs and services for various reasons.
One is – engineers always want to try something new.
Another – managers need to show off, etc.

So, say you have a healthy running Twitter, you want to add emojis support, you have only text now. Unfortunately, your team is at capacity.
So, you need to hire a new team. They add emoji, complexity of the system grows, but it’s OK because you have that additional team to support that. Emoji work is done, so you have two problems now:
a) additional complexity and bugs added by emoji effort that your core team cannot handle because they are at capacity
b) extra team that can maintain and operate emoji but, other than that they have nothing to do
So what do you do next?
You toss new work at the extra team, admin panel, moderator panel, dmca panel and kanji support.
Some of these features are not required and not core, but nice to have.
And you just keep adding and adding and adding.
And because your platform is cool you have money to do that.
All these teams and managers now claim that they are indispensable because they support emoji and kanji and moderator panel and the platform CANNOT RUN WITHOUT IT.
Of course it can.

Now, this crazy new boss comes in and says, 80% of this stuff is bullcrap – which in fact – it is.
– We don’t need all these people supporting emoji- they don’t.
Hence they are all fired.

Now, you have some people that are insecure and don’t like the politics, they also leave.
You end up with 20% of staff. You tell them to make the core work and emoji and kanji and moderator panel bugs will get lower priority.
You turn off time tracking systems and a bunch of other internally developed tools, built by developers who prior management didn’t know what to do with, and outsource this shit as it’s not your core business.

Do you know that Google has employees that change xml files for google doodle?

Do you know that an average investment bank has thousands of people in IT that take care of useless programs, services and procedures?

I mean, your company would still run just fine without all this crap.

Anonymous 0 Comments

I actually really like this question because it forces you to think about what’s really involved in a business like this.

For the most part it can, but at some point security may become an issue.

A codebase will also often have long-standing bugs, for which a workaround requiring people is usually taken until there’s a fix (which also requires people). Each fix can potentially introduce what are called regressions, and then you are back to having bugs to fix. Good codebases have strong testing frameworks to help minimize this risk. A company may choose to hire quality assurance people for testing. These people may also be developers or at least have a strong command of the space in general.
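
As a small illustration of how tests guard against regressions, here is a sketch with a made-up `truncate_post` function and a made-up past bug (posts of exactly the limit length getting cut). The 280-character limit is just an example value.

```python
def truncate_post(text, limit=280):
    """Cut a post to at most `limit` characters, marking a cut with '...'."""
    if len(text) <= limit:
        return text
    return text[: limit - 3] + "..."

def test_short_post_is_unchanged():
    assert truncate_post("hello") == "hello"

def test_post_at_exact_limit_is_unchanged():
    # Pins the fix for the hypothetical off-by-one bug, so a later change
    # that reintroduces it fails here instead of in production.
    assert truncate_post("a" * 280) == "a" * 280

def test_long_post_respects_limit():
    out = truncate_post("a" * 300)
    assert len(out) == 280
    assert out.endswith("...")

for test in (test_short_post_is_unchanged,
             test_post_at_exact_limit_is_unchanged,
             test_long_post_respects_limit):
    test()
print("all tests passed")
```

The tests run themselves, but somebody still has to write one each time a bug is found and work out what went wrong when one starts failing.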

Scaling the platform can be automated to some extent using tools like Kubernetes, though it turns into a rather complex task that commands highly skilled engineers. There are some gotchas there with how to handle sensitive information as well.
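
For a feel of what the automated part looks like, the Kubernetes Horizontal Pod Autoscaler documents its core rule as desiredReplicas = ceil(currentReplicas × currentMetric / desiredMetric), with no action taken while the ratio is within a tolerance of 1.0 (0.1 by default). Below is a sketch of that rule in Python. The function name and the min/max bounds are illustrative assumptions, and the real controller handles many more cases (missing metrics, pods still starting up, stabilization windows).

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Replica count suggested by the documented HPA scaling rule."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target, leave it alone
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (the bounds here are assumed values).
    return max(min_replicas, min(max_replicas, wanted))

print(desired_replicas(4, current_metric=90, target_metric=60))   # scale up
print(desired_replicas(4, current_metric=30, target_metric=60))   # scale down
print(desired_replicas(8, current_metric=120, target_metric=60))  # hits the cap
```

The arithmetic is trivial. Picking the metric, the target and the bounds, and noticing when the cap is hiding a real capacity problem, is where the skilled engineers come in.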

You may also be doing some data analysis to best provide information to your advertisers, and that can easily involve some technical talent.

Caching (storing frequently accessed information aside for ready use) is a beast in itself. This has implications on performance. Outages are a pretty obvious need for resources.
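
To show why caching gets hairy, here is a minimal time-to-live cache sketch. `TTLCache` and its method names are my own for illustration; production systems typically lean on dedicated caches rather than an in-process dict.

```python
import time

class TTLCache:
    """Keep computed values for ttl_seconds, then recompute on next access."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so expiry can be tested without sleeping
        self._store = {}      # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]   # still fresh, skip the expensive work
        value = compute()
        self._store[key] = (now + self._ttl, value)
        return value

cache = TTLCache(ttl_seconds=60)
profile = cache.get_or_compute("user:42", lambda: {"name": "example"})
print(profile)
```

Even this toy version forces a judgment call: too short a TTL and the cache barely helps, too long and users see stale data. Tuning that against real traffic is ongoing work.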

There are also other reasons. If the platform enters a new market, support for the local language will need to be introduced. If a greater focus on disability issues is taken, then your user interface would need to support that. If a new device is introduced (say, an iPad), interface support for that may be desired.