How does reverse image lookup work?



How does reverse image lookup work?

In: Technology

When you type something into Google, Google doesn’t really care that it was words that you typed in. It converts that into digital information, and then uses that information to search for similar information, and then it is translated back into words. With reverse image lookup, it takes that image and converts it in two digital information. Then it searches all over for the same or similar digital information, before converting that information into images again.

There’s not a single way but the general idea is to first downscale the image to something small (like 32×32) and then use an algorithm that detects disctintive features of that image and encodes them into an even smaller value, it’s like some sort of extreme compression. Comparing this encoded values is much quicker since they’re much smaller than the original image.

Also the search engine is continually looking for images in the web in order to keep a database of encoded images and the corresponding addresses. So when you give it an image for a reverse lookup the engine encodes the image in the same way, then looks for similar values in the database and finally obtains the images from the stored addresses.

There’s a paper on the subject: [Fast Multiresolution Image Querying](, by Charles E. Jacobs, Adam Finkelstein & David H. Salesin in the 1995 Siggraph proceedings. I implemented this from the paper to manage and sort my own image collection, and later worked on it at Google as my 20% project. I don’t think any of my own work actually went into Google’s reverse image search, so I can’t say if this is actually how they do it.

The gist of it is this: you scale the image down to a reasonable size, and then use a Haar Wavelet transform on it. This is similar to a Fourier transform, but easier to compute. The transform returns an array of values that basically represent the components of the image in the spatial frequency domain.

You pick out the 40 most significant values, truncate them to save space, and essentially construct a “signature” of the image. The rest of the algorithm is just a fast way to match the signature to the signatures of images in your database.