How can a phone apply complex filters in real time through apps like snapchat, when similar effects applied through a compositor (tweaking colors, blurring, warping, etc.) can’t render in real time even on a mid-range PC?



It’s a bit of an apples to oranges comparison. The computer software is just doing a lot more work than the Snapchat filter is, because it needs to ‘solve’ the general case, whereas filters only need to ‘solve’ specific cases.

When you run photo/video editing software on your computer, it needs to be able to do nearly anything. No matter what effects you pick/tweak, you still expect the computer to render it. It’s like a helicopter; it can go nearly anywhere, but if you’ve got a lot to process, it might take multiple trips.

Snapchat filters are pre-selected and pre-compiled. You can only pick from a handful of filters, but the knowledge of how to apply those effects is already built into the software, so most of the heavy lifting is already done. It’s like a train; it can only go from point A to point B, but it can do a lot of work in one trip.
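To sketch what “pre-compiled” can mean in practice, here is a minimal, illustrative example (not Snapchat’s actual code): a color filter baked into a lookup table (LUT). All the math is done once, up front; applying the filter per frame is then just cheap table lookups.

```python
# Hypothetical sketch: a "precompiled" filter as a precomputed lookup table.
# The expensive work (computing the tone curve) happens once; applying the
# filter afterwards is just one lookup per channel value.

def build_contrast_lut(strength=1.3):
    """Precompute the output for every possible 8-bit input (done once)."""
    return [min(255, max(0, int((v - 128) * strength + 128))) for v in range(256)]

LUT = build_contrast_lut()  # the filter, "baked in" ahead of time

def apply_filter(pixels):
    """Per-frame work is only cheap lookups, no per-pixel math."""
    return [LUT[v] for v in pixels]

frame = [0, 64, 128, 200, 255]
print(apply_filter(frame))  # → [0, 44, 128, 221, 255]
```

A general-purpose editor can’t bake its effects down like this, because the user can change every parameter at any time.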

The only real work your phone is doing is tracking faces and drawing pre-made effects/filters over them. What you see on screen is essentially the finished result of the kind of rendering you’re describing PCs doing; if your phone actually had to render those effects from scratch, it would take a lot longer.

There’s a huge difference in quality between what Snapchat filters do and what professional image editing programs do.

PCs can do Snapchat-level filters no problem; you can find streamers who do it live.

The premise of your question is wrong. Give PCs the same input image and the same effects and they could also do it in real time.

Firstly, modern phones are surprisingly powerful. There is a reason Apple is now using their mobile phone processors in laptops instead of Intel’s “mobile” processor range…

But the main reason is a trick in the programming that speeds up the preview: while the camera captures a very high resolution image, the screen preview is actually a much lower resolution *copy* of that image. The filters are *previewed* on that low-res copy, and only once you “OK” the result are they applied to the “real” image in high res. This may even happen in the background, after you thought the process was over.
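The preview trick described above can be sketched in a few lines. This is a toy illustration with made-up helper names, using a naive nearest-neighbour downscale and a stand-in “expensive” effect:

```python
# Hypothetical sketch of the preview trick: run the costly effect on a small
# copy for the live preview, and only touch the full-resolution image once
# the user confirms.

def downscale(image, factor):
    """Keep every `factor`-th pixel in each dimension (naive nearest-neighbour)."""
    return [row[::factor] for row in image[::factor]]

def expensive_filter(image):
    """Stand-in for a costly effect; here it just inverts each pixel."""
    return [[255 - p for p in row] for row in image]

# A 400x300 grayscale gradient, standing in for a full sensor image.
full = [[(x + y) % 256 for x in range(400)] for y in range(300)]

preview = expensive_filter(downscale(full, 8))  # ~64x fewer pixels: fast
# ... user taps "OK" ...
final = expensive_filter(full)                  # full-res pass, can run in background
```

Cutting each dimension by 8 shrinks the pixel count by a factor of 64, which is why the preview feels instant even when the final render isn’t.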

The complexity increases at a faster rate than the quality does. It takes relatively little computing power to apply a sorta-OK filter, but when the result needs to be extremely detailed, the cost increases exponentially.

There are two types of pictures: layered and flattened. In a layered picture, you can imagine each colored element sitting on its own layer, similar to how cartoons and anime are made: you can move the parts around and the background stays the same. A flattened image merges every layer into a single one, so you can’t move anything without leaving a blank space where it was.

A video is a lot of pictures shown on a computer one after the other. 60 fps is 60 frames (pictures) per second.

In general, a filter adds another layer on top of each frame. Since there are already 20+ frames per second, the app might only recompute the filter overlay every 4-5 frames and reuse it in between. This is relatively simple and cheap in terms of computer resources.

When a compositor does it, however, it tends to modify the actual pixels, as with a flattened image. That is more resource-intensive: going into each frame and changing it is more complex than laying another image on top. Also, most editing software processes every single frame rather than every 4-5, so it ends up doing 4-5 times as much work.


(As a side note, this is also why, a while back, when people were dancing behind a black silhouette effect (the nude-silhouette TikTok challenge), others were able to remove the filter easily and see the person dancing: they just took away the front layer that the filter had placed on top.)

Real-time artificial intelligence, in the form of convolutional neural networks, can perform complex image processing while being small enough, and using few enough resources, to run on your phone. But each network is generally trained to do only one or maybe two things at a time.
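For intuition, the core operation those networks are built from is convolution: sliding a small kernel over the image. Here is a minimal 1-D version in pure Python (real networks use 2-D kernels and dedicated mobile hardware, but the operation is the same in spirit):

```python
# Illustrative only: convolution, the building block of convolutional networks.
# Phones ship hardware tuned for exactly this operation, which is part of why
# a narrowly trained model can run in real time on a phone.

def convolve1d(signal, kernel):
    """Naive 1-D convolution ("valid" mode, no padding) in pure Python."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

# A [1, 0, -1] kernel is a crude edge detector: large magnitudes appear
# wherever the signal jumps.
edges = convolve1d([0, 0, 0, 10, 10, 10], [1, 0, -1])
print(edges)  # → [0, -10, -10, 0]
```

Because the same tiny kernel is reused across the whole image, the work is extremely regular, which is what mobile GPUs and neural accelerators exploit.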

Mostly it’s optimization by the software.

A cellphone’s internals are heavily tailored to exactly this kind of work, and those same specialized instructions aren’t found in every computer’s CPU.

There is a quality factor to consider as well.

A cellphone CPU may apply the filter to a face, but it only has to ask the camera for the points to track (since it’s mostly API-based), so the mobile app does a lot less work.

A big difference is that a Snapchat filter can be precompiled and optimized, while a compositor has to be a modular system that re-runs every time a change is made, and therefore can’t be optimized as heavily.