Eli5: Why is it so hard for social media companies to remove spam and Bot accounts?


It's not really hard, but it consumes resources, and the spurious accounts can be included in the totals reported to investors.

In addition to some good technical explanations that I’m sure this thread will get, another reason is that they don’t want to.

Bot accounts are generally good for social media companies (up until the point it becomes too obvious that they’re bots). They make it appear as though the company is more popular, which is perhaps the most important metric of a social media company.

It means celebrities and influencers have more followers. It means advertisers have more reason to pay for ads. It means in general, there is more activity on that particular social network.

From my understanding, the sole purpose of a lot of bots is just to download a tweet or whatever the second it's posted, so if the person deletes it there is still a record.
And a few years down the road, tweets can be used for background, perspective, and evidence of truth and/or dishonesty. Every media outlet and thousands of individuals do this. Not really a bad thing.

It is important to remember that the volume of traffic on modern websites is so large that tasks like “remove bot accounts” must be done by automated systems.

The thing is, how exactly do you identify what accounts are bots? It might be easy enough for a human to tell but can you specify a series of unambiguous logical rules to flawlessly identify a bot vs. a human? If you start banning real users in error you can cause big problems with your business.
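To put rough numbers on the "banning real users in error" problem, here is a small back-of-the-envelope sketch. Every figure in it is made up for illustration (the account total, the bot share, and both detector rates are assumptions, not anything a real platform has published), and `flagging_outcome` is just a hypothetical helper name.

```python
def flagging_outcome(total_accounts, bot_share, catch_rate, false_alarm_rate):
    """Estimate what an imperfect bot detector does at scale.

    All inputs are hypothetical:
      bot_share        - fraction of accounts that really are bots
      catch_rate       - fraction of bots the detector flags
      false_alarm_rate - fraction of humans the detector wrongly flags
    """
    bots = round(total_accounts * bot_share)
    humans = total_accounts - bots
    bots_caught = round(bots * catch_rate)
    humans_banned = round(humans * false_alarm_rate)
    flagged = bots_caught + humans_banned
    # Share of flagged accounts that really were bots.
    precision = bots_caught / flagged if flagged else 0.0
    return {
        "bots_caught": bots_caught,
        "humans_banned": humans_banned,
        "precision": precision,
    }


# Made-up scenario: 100 million accounts, 5% bots,
# a detector that catches 95% of bots and misfires on 1% of humans.
result = flagging_outcome(100_000_000, 0.05, 0.95, 0.01)
print(result)
```

With those invented inputs, a detector that is wrong about only 1% of humans still bans about 950,000 real people, which is why a rule that looks accurate on paper can still be a business problem.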

It is a harder task than you might think. You can’t just search for certain words because nothing is really unique to spam. Repeating the same message in several places might be a clue but you are just as likely to catch Aunt Betty posting the same Bible verse to several groups, or trigger on someone’s signature.

Remember that it is also a continual competition between the spammers and the providers. As soon as the providers find a filter that works, the spammers are off finding a way around it. They can add randomized garbage to their posts so it isn't easily recognized as the same message. They can even just copy parts of other posts to blend in.
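As a rough illustration of that cat-and-mouse game, the sketch below compares an exact-match filter with a fuzzy one based on overlapping word triples (Jaccard similarity over "shingles"). This is one common textbook approach, not what any particular site is known to use; the function names and the sample messages are invented for the example.

```python
import hashlib
import re


def exact_fingerprint(text: str) -> str:
    """Hash of the message after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def shingles(text: str, k: int = 3) -> set:
    """Set of overlapping k-word sequences found in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: str, b: str, k: int = 3) -> float:
    """Similarity between 0.0 and 1.0 based on shared shingles."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical spam: same pitch, different random junk at the end.
spam_a = "Earn money fast from home, click this link today xk7q"
spam_b = "Earn money fast from home, click this link today p93z"

# The exact-match filter is defeated by the random suffix...
print(exact_fingerprint(spam_a) == exact_fingerprint(spam_b))
# ...while the fuzzy comparison still scores the two as very similar.
print(jaccard(spam_a, spam_b))
```

Note that the fuzzy score would flag Aunt Betty's repeated Bible verse just as readily, so a score like this can only be one signal among many, and spammers who paste in chunks of other people's posts push the score back down.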

They are trying to find a needle in a haystack and the needles are actively hiding.


They have very little incentive to do so. They make money by selling ads, and they sell ad space by showing advertisers their metrics. Everything about bot accounts pumps up their metrics, so if they were to successfully purge them (and purging most of them isn't as difficult as you might think), their stats would collapse.

Everyone has probably heard of that one instance when Facebook claimed it had reached more teenagers in the US than there are teenagers in the US. They probably didn't even invent the stat; with all the bot action, they probably did see that much engagement. That's the real issue – when companies start drinking their own Kool-Aid.

They don't want to. Plain and simple. The bots create traffic, and most are fairly ineffectual. The ones that aren't create even more traffic and in some cases do get removed.

There is considerable overlap between the dumbest humans and the most advanced bots.

Bots are designed to act like humans just enough to make them blend in. Creating an algorithm to detect them automatically is difficult, because you risk banning actual users.

Most don't want to. Bots inflate their user counts.

Some people make accounts that look fake. Banning them can alienate your user base.

Elon Musk asked this very question at a public venture capital forum the other day. It was somewhat rhetorical in his case, since he provided the answer that good programming and data analysis can detect and remove bots, but it is also a very practical question for him, since his preliminary analysis of Twitter says that 20% to 50% of the accounts are bots.

Twitter claims the number is around 5%, but he has already ruled that number out as too low. Twitter being 90% bots has not yet been ruled out.

Ideally you authenticate each account by doing something like:
– Get a phone number.
– Send them a text with a confirmation code or PIN.
– Require the PIN to be entered back on Twitter within 5 or 30 minutes.
– Cross-check the list of phone numbers so that multiple accounts cannot give you the same phone number.
– Check that the country the applicant claims to be from matches the country code of the phone number.
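The steps above could be sketched roughly like this. It is a toy, in-memory version under several assumptions: the `PhoneVerifier` class and its method names are invented, the 30-minute window is just one of the two values mentioned, the country check is a naive prefix match, and the code is returned instead of actually being texted.

```python
import secrets
import time

CODE_TTL_SECONDS = 30 * 60  # assumption: the 30-minute window


class PhoneVerifier:
    """Toy in-memory phone check; a real system would use a database and an SMS provider."""

    def __init__(self):
        self.pending = {}      # phone -> (code, time issued)
        self.verified = set()  # phone numbers already tied to an account

    def request_code(self, phone, claimed_country_code, now=None):
        now = time.time() if now is None else now
        # One phone number per account.
        if phone in self.verified:
            raise ValueError("phone number already used by another account")
        # Naive check that the number starts with the claimed country code.
        if not phone.startswith("+" + claimed_country_code):
            raise ValueError("country code does not match claimed country")
        code = f"{secrets.randbelow(10**6):06d}"  # random 6-digit PIN
        self.pending[phone] = (code, now)
        return code  # a real system would text this instead of returning it

    def confirm(self, phone, code, now=None):
        now = time.time() if now is None else now
        entry = self.pending.get(phone)
        if entry is None:
            return False
        expected, issued_at = entry
        if now - issued_at > CODE_TTL_SECONDS:
            del self.pending[phone]  # expired, must request a new code
            return False
        if not secrets.compare_digest(expected, code):
            return False
        del self.pending[phone]
        self.verified.add(phone)
        return True
```

A signup flow would call `request_code` when the applicant submits a number and `confirm` when they type the PIN back in. This raises the cost of each fake account but does not stop anyone who can buy phone numbers in bulk.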

There are other checks that can be done. One of the ones I like best is to send the applicant a dollar. It is unlikely that a bot would have a bank account, a PayPal account, or some other unique means of receiving a payment.