ELI5: How does an email spam/junk filter work?

I get daily junk emails like “7 Facts you need to know about X” and “this top 10 list will SHOCK you,” even though I block each sender and have tried “unsubscribing,” which doesn’t work. So I have to wonder: why don’t these emails get filtered as junk when other emails do?

Anonymous

They used to. That they don’t now is a good indicator of just how unimpressive AI tech is. Some of it is because of a direct conflict of interest.

Around the time GMail was released, spam filtering worked more or less the way so-called AIs like ChatGPT form their sentences. Every time you mark an email as spam, the filter looks at all the words inside and at which words sit next to each other. It does the same with the emails you don’t mark as spam. After a while, it gets the idea: “if these words show up near each other, you usually mark this mail as spam.”
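If it helps to see that idea in code, here is a minimal sketch of a word-counting filter in Python. It is a toy naive Bayes classifier, not GMail's actual implementation, and `TinySpamFilter`, `mark`, `spam_score`, and the sample messages are all made-up names and data for illustration.

```python
import math
from collections import Counter

def features(text):
    # Single words plus adjacent word pairs ("what words are next to them").
    words = text.lower().split()
    return words + [a + " " + b for a, b in zip(words, words[1:])]

class TinySpamFilter:
    """Hypothetical toy filter; real filters use far more signals than words."""

    def __init__(self):
        self.counts = {"spam": Counter(), "not spam": Counter()}
        self.messages = {"spam": 0, "not spam": 0}

    def mark(self, text, label):
        # Called whenever the user labels a message "spam" or "not spam".
        self.counts[label].update(features(text))
        self.messages[label] += 1

    def spam_score(self, text):
        # Log-odds that the message is spam; above 0 means "looks like spam".
        spam, ham = self.counts["spam"], self.counts["not spam"]
        vocab = len(set(spam) | set(ham)) + 1  # +1 keeps the math safe when untrained
        spam_total = sum(spam.values()) + vocab
        ham_total = sum(ham.values()) + vocab
        score = math.log((self.messages["spam"] + 1) / (self.messages["not spam"] + 1))
        for f in features(text):
            # Add-one smoothing so unseen words don't cause a log(0).
            score += math.log(((spam[f] + 1) / spam_total) / ((ham[f] + 1) / ham_total))
        return score

spam_filter = TinySpamFilter()
spam_filter.mark("this top 10 list will shock you", "spam")
spam_filter.mark("7 facts you need to know", "spam")
spam_filter.mark("meeting notes attached for tomorrow", "not spam")
spam_filter.mark("lunch tomorrow at noon", "not spam")

print(spam_filter.spam_score("top 10 facts will shock you"))  # positive: looks like spam
print(spam_filter.spam_score("notes for lunch meeting"))      # negative: looks fine
```

The more messages get marked, the better the counts get, which is why seeing a lot of mail matters so much for this kind of filter.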

The benefit of GMail when it came out was that, since it got to look at EVERYONE’S email, it had a VERY good idea of what people marked as spam, even before a given individual had seen certain kinds of spam.

So what happened? Well, a few things.

Spam was always about money. Spammers are usually trying to either steal your money or sell you something. When Google released GMail, it was a search engine business that made money selling banner ads that were less annoying than everyone else’s banner ads, so people were more likely to put up with them instead of finding an ad blocker. Now Google is an advertising company that provides free products that serve you ads. That’s a problem. Now they take a lot of what used to go in your spam folder and put it in a “Promotions” tab. Those are usually still things you honestly signed up for. But the tab also usually includes a handful of items you *didn’t* subscribe to that someone paid Google to put in front of you. In short: that is spam Google was paid to put in your GMail. You can’t make them stop.

But that wasn’t the only thing that happened. Remember how the spam filtering works: it looks at the words in mail people mark “spam” and “not spam” and tries to work out which words point to which. So spammers got wise and started creating hundreds of accounts, signing them up for their own mailings, and marking those mails as “not spam.” They’ll also mark competitors’ emails as “spam.” This confuses the algorithms into miscategorizing mail.
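Here is a rough Python sketch of why those fake votes work. It assumes a deliberately crude filter that just counts how often each word showed up in mail marked “spam” versus “not spam”; real filters are more sophisticated, and `train`, `looks_like_spam`, and the report data are invented for the example.

```python
from collections import Counter

def train(reports):
    """reports: list of (text, label) pairs, label is 'spam' or 'not spam'."""
    spam_votes, ham_votes = Counter(), Counter()
    for text, label in reports:
        target = spam_votes if label == "spam" else ham_votes
        target.update(set(text.lower().split()))  # one vote per word per report
    return spam_votes, ham_votes

def looks_like_spam(text, spam_votes, ham_votes):
    # Flag the message if its words collected more "spam" than "not spam" votes.
    words = set(text.lower().split())
    return sum(spam_votes[w] for w in words) > sum(ham_votes[w] for w in words)

junk = "this list will shock you"

# Honest users: five people report the junk, five mark a normal mail as fine.
honest = [(junk, "spam")] * 5 + [("notes from the meeting", "not spam")] * 5

# Spammer-controlled accounts: 200 fake "not spam" votes for the same junk.
fake = [(junk, "not spam")] * 200

before = looks_like_spam(junk, *train(honest))
after = looks_like_spam(junk, *train(honest + fake))
print(before, after)  # True False
```

Five honest reports can’t outvote two hundred fake ones, so the same junk mail sails through once the fake accounts have had their say.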

So we’re left in a state where the only thing you can do is habitually add people to contacts or set up filters so the things you want don’t ever get moved to spam. And there’s not much you can do about the things that get through other than mark them as spam.
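As a sketch of why contacts help, here is the kind of rule that advice relies on, written in Python. It assumes that a known contact simply overrides whatever the filter guessed; `route` and the addresses are hypothetical, and the real logic inside any mail provider is not public in this form.

```python
def route(sender, filter_says_spam, contacts):
    # Assumed rule: mail from a known contact always lands in the inbox,
    # no matter what the spam filter guessed.
    if sender.lower() in contacts:
        return "inbox"
    return "spam" if filter_says_spam else "inbox"

contacts = {"friend@example.com"}

print(route("friend@example.com", True, contacts))    # inbox
print(route("stranger@example.com", True, contacts))  # spam
```

Note that this only protects mail you want. It does nothing about junk the filter wrongly lets through.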

Again, it’s worth noting this is just a slightly less sophisticated version of how modern “AI” tools work. The more they replace human moderators, the more they’ll start to face the same people who have ruined email filtering. Google’s one of the companies trying to *sell* AI, so you’d think they’d hold up GMail’s filtering as an example of how smart they are, right? Right?
