What are eigenvalues and eigenvectors, and how are they used in Principal Component Analysis?


If I remember right, an eigenvector is a vector whose direction is immune to a matrix: the matrix can stretch it, shrink it, or flip it, but it can't knock it off its line.

First, you have to think of a matrix as instructions for a transformation. A matrix can tell you to take your space (be it a number line, a Cartesian plane, or any other n-dimensional space) and rotate it, squish it, stretch it, shear it, etc.

When you apply this transformation, any point in your original space will become a different point in your new space.

Any vector that still points along the same line in the new space as it did in the original space, as in any vector that the transformation only stretches, shrinks, or flips, is an eigenvector of that matrix. The factor it gets scaled by is its eigenvalue.

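For example, if a matrix tells you to stretch a Cartesian plane horizontally, such that the horizontal unit vector becomes twice as long, then any vertical vector is an eigenvector of that matrix, because it has no horizontal component and thus is not changed by the transformation (its eigenvalue is 1). Horizontal vectors are eigenvectors too: they keep their direction and just get twice as long, so their eigenvalue is 2.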
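Here is a small sketch of that horizontal stretch in Python with NumPy, in case seeing the numbers helps. The test vectors and the helper `keeps_direction` are made up for illustration; the check itself is just "does the matrix send this vector to a multiple of itself?"

```python
import numpy as np

# Horizontal stretch: x is doubled, y is left alone.
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])

def keeps_direction(matrix, v):
    """True if matrix @ v lies on the same line as v (2D cross product is zero)."""
    w = matrix @ v
    return bool(np.isclose(w[0] * v[1] - w[1] * v[0], 0.0))

vertical = np.array([0.0, 3.0])
horizontal = np.array([5.0, 0.0])
diagonal = np.array([1.0, 1.0])

print(A @ vertical, keeps_direction(A, vertical))      # unchanged: eigenvalue 1
print(A @ horizontal, keeps_direction(A, horizontal))  # doubled: eigenvalue 2
print(A @ diagonal, keeps_direction(A, diagonal))      # tilted: not an eigenvector

# NumPy can find the eigenvalues and eigenvectors directly.
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)
```

The diagonal vector gets pulled toward the horizontal, so it changes direction and is not an eigenvector of this matrix.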

I’m afraid I don’t know much about the inner workings of PCA.

You might be better off reposting to r/askscience. I don’t think there’s any way to explain this like you’re 5.

First, let’s talk about a simple example that everyone knows about – maps.

A map basically has two directions, called “latitude” and “longitude”. Each of those directions has a unit you measure in, which mathematicians call a basis vector. The standard basis vector for “latitude” is called “1 degree North”; the standard basis vector for “longitude” is called “1 degree West” (although you could also use “East”, which would just multiply all your longitudes by -1).

So when you say “17.87 degrees North, 34.72 degrees West”, North and West are your basis vectors, and 17.87 and 34.72 are your coordinates along them. Strictly speaking these aren’t eigenvectors and eigenvalues yet, because those belong to a matrix. But hold on to the picture: in PCA, the eigenvectors end up playing the role of North and West.

Now, if you needed a third dimension, you’d add a third basis vector. Standard in the US is to use “1 foot elevation”; everywhere else uses “1 meter elevation”. So if something is at 3,400 meters elevation @ 17.87 N x 34.72 W, then your basis vectors are “1 meter elevation”, “1 degree North”, and “1 degree West”; and your coordinates are 3400, 17.87, 34.72.

That’s all well and good for maps, but sometimes you’re recording something that has a few thousand different variables, and you have no idea how they interact with each other. Maybe all thousand variables are important, or maybe a bunch of them are actually just the same thing with different scale factors. But once you do statistical analysis on them, you can break it down to a set of Principal Components, which are a smaller set of combined variables that can still uniquely identify anything in the data set.

If it turns out that the data set has only 8 eigenvectors whose eigenvalues aren’t zero (or close to it), then that basically means that even though it has a few thousand variables, only eight things “really matter”, and the rest are just various combinations of those eight.

Okay, but WHICH eight?

Principal Component Analysis won’t pick eight of your original variables for you; ain’t its job. What it *will* give you is eight new directions, each one a weighted blend of the original variables, and it will tell you how they all interrelate: how those eight “add up to” the other few thousand numbers.

Each of those eight directions is an “eigenvector” of the data’s covariance matrix, and its eigenvalue tells you how much of the spread in the data lies along it (assuming eight is the number you need to count to, no more, no less. Nine shalt thou not count, neither seven, excepting that thou proceed to eight). Anyway.

The point is, you can sort of see it as a “dimensional space”, where if all the thousand variables collapse down to an 8-dimensional structure, Principal Component Analysis will give you eight “dimensions” that can uniquely navigate through the space, and will give you, for each of your thousand variables, the weights by which the eight component values need to be multiplied and added together to produce that variable.

I hope that made sense? It’s kinda mathy.
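For anyone who wants to poke at the “lots of variables, only eight that matter” idea, here is a rough sketch in Python with NumPy. Everything in it is made up for illustration: the data is synthetic, the sizes (500 samples, 100 variables, 8 hidden factors) are arbitrary, and the PCA is done by hand on the covariance matrix rather than with any particular library's PCA routine.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_hidden, n_variables = 500, 8, 100  # arbitrary, made-up sizes

# Synthetic data: 8 hidden factors blended into 100 observed variables.
hidden = rng.normal(size=(n_samples, n_hidden))
mixing = rng.normal(size=(n_hidden, n_variables))
data = hidden @ mixing

# PCA by hand: eigen-decompose the covariance matrix of the centered data.
mean = data.mean(axis=0)
centered = data - mean
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh returns ascending order

# Put the largest eigenvalue first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Count the eigenvalues that are not numerically zero.
n_significant = int(np.sum(eigenvalues > 1e-8 * eigenvalues[0]))
print(n_significant)  # 8 for this synthetic data

# Coordinates of each sample along the top directions, then rebuild all 100 variables.
top = eigenvectors[:, :n_significant]
scores = centered @ top
rebuilt = scores @ top.T + mean
print(np.allclose(rebuilt, data))
```

Each column of `top` is one of the eight directions, and `top.T` holds the weights that turn eight component values back into all hundred variables. Real data is noisier, so the leftover eigenvalues are usually small rather than exactly zero, and deciding how many to keep becomes a judgment call.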

PCA is a way of simplifying information without losing too much.

Let’s say you’re trying to buy a house. Each property has a ton of factors at play – a price, a location, a size, number of bathrooms, etc.

How should you think about all of these numbers? Well maybe a lot of them are just capturing a few underlying things. The cost and size and bathrooms are all basically about value. Maybe you can get a ‘value score’ for each house by adding up 1% of the price, the number of bathrooms and 10% of the square footage.

Now instead of three numbers, you just have one combined value metric. This is what PCA does. It helps you find the most interesting ‘scores’ you can give that will capture the ways that your properties vary.

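An eigenvector in PCA is just the way a score is calculated. In my example it would be .01*price, 1*baths, .1*size. The score itself (maybe a house gets a 1600) is that house’s position along the eigenvector. The eigenvalue is one number per eigenvector, not per house: it tells you how much the scores vary across all your houses, which is why the eigenvectors with the biggest eigenvalues give the most interesting scores.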
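As a tiny sketch of "the eigenvector is the recipe for the score", here is the value score from this example as a dot product in Python. The house itself (price 150,000, 3 baths, 970 square feet) is made up so that the number comes out near 1600, and the weights are the hand-picked ones from the example, not something PCA produced.

```python
import numpy as np

# Hand-picked weights from the example: 1% of price, 1 per bath, 10% of size.
weights = np.array([0.01, 1.0, 0.1])

# A hypothetical house: price in dollars, number of baths, size in square feet.
house = np.array([150_000.0, 3.0, 970.0])

# The score is each feature times its weight, all added up.
value_score = float(weights @ house)
print(value_score)  # about 1600
```

In real PCA the weights are not chosen by hand: they come out of the data, and they are normally scaled so the weight vector has length 1.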

If you’re a visual thinker this all shakes out nicely in spatial terms. Imagine plotting price, size and baths in 3D. If they’re very correlated, most of your properties will lie along a rough line diagonal to all of those axes (they all go up and down together). The diagonal axis they’re spread along is the eigenvector. PCA just rotates your axes so that you can read the distance along this compound direction more easily.
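Here is a rough sketch of that picture in Python with NumPy, under some loud assumptions: the houses are synthetic, a single hidden "value" number drives price, size and baths, and the noise levels are invented. The columns are standardized first so that dollars don't drown out bathrooms.

```python
import numpy as np

rng = np.random.default_rng(1)
n_houses = 200

# Synthetic houses: one hidden "value" drives all three features, plus noise.
value = rng.normal(size=n_houses)
price = 300_000 + 80_000 * value + rng.normal(scale=10_000, size=n_houses)
size = 1_800 + 500 * value + rng.normal(scale=80, size=n_houses)
baths = 2.0 + 0.7 * value + rng.normal(scale=0.2, size=n_houses)
houses = np.column_stack([price, size, baths])

# Standardize each column to mean 0 and standard deviation 1.
z = (houses - houses.mean(axis=0)) / houses.std(axis=0)

# Eigen-decompose the covariance matrix; eigh returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(z, rowvar=False))
first = eigenvectors[:, -1]                   # the diagonal axis the houses lie along
share = eigenvalues[-1] / eigenvalues.sum()   # fraction of total spread on that axis

# One score per house: its distance along the diagonal axis.
scores = z @ first

print(first)
print(share)
print(np.var(scores, ddof=1), eigenvalues[-1])  # variance of scores equals the eigenvalue
```

All three entries of `first` share the same sign, which is the "they all go up and down together" diagonal, and the last line shows that the eigenvalue is the variance of the scores along that axis.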