What are eigenvalues and eigenvectors, and how are they used in Principal Component Analysis?



In: Mathematics

4 Answers

Anonymous 0 Comments

First, let’s talk about a simple example that everyone knows about – maps.

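A map basically has two directions, called “latitude” and “longitude”. Each of those directions has a unit step that you measure in, which mathematicians call a basis vector. The standard basis vector for “latitude” is “1 degree North”; for “longitude”, say it’s “1 degree West” (although you could also use “East”, which would just multiply all your longitudes by -1). Strictly speaking these are basis vectors, not eigenvectors: an eigenvector is a special direction that a particular matrix only stretches or shrinks without turning, and the amount of stretch is its eigenvalue. The eigenvectors PCA cares about end up doing the same job as North and West, which is why the map is a useful picture.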

So when you say “17.87 degrees North, 34.72 degrees West”, North and West are your basis vectors, and 17.87 and 34.72 are your coordinates: how many steps you take along each one.

Now, if you needed a third dimension, you’d add a third basis vector. Standard in the US is to use “1 foot elevation”; most other places use “1 meter elevation”. So if something is at 3,400 meters elevation @ 17.87 N x 34.72 W, then your basis vectors are “1 meter elevation”, “1 degree North”, and “1 degree West”; and your coordinates are 3400, 17.87, 34.72.
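If it helps to see that as arithmetic, here is a minimal sketch in Python with NumPy. The array names are made up for illustration, and it treats the map as flat, which a real globe isn't. It shows that a position is just "coordinates times basis vectors", and that swapping West for East only flips the sign of one coordinate.

```python
import numpy as np

# Columns are the basis vectors, in the order used above:
# "1 meter elevation", "1 degree North", "1 degree West".
basis = np.eye(3)
coords = np.array([3400.0, 17.87, 34.72])

# The position is the sum of (coordinate x basis vector).
position = basis @ coords

# Use "1 degree East" instead of West: same place, different basis.
basis_east = basis.copy()
basis_east[:, 2] *= -1

# Solve for the coordinates of the same position in the new basis.
coords_east = np.linalg.solve(basis_east, position)
print(coords_east)  # the longitude coordinate comes out negated
```

The place hasn't moved; only the numbers used to describe it changed.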

That’s all well and good for maps, but sometimes you’re recording something that has a few thousand different variables, and you have no idea how they interact with each other. Maybe all thousand variables are important, or maybe a bunch of them are actually just the same thing with different factors. But once you do statistical analysis on them, you break it down to a set of Principal Components, which are a smaller set of new variables that can still describe everything in the data set.

If it turns out that the data set’s covariance matrix has only 8 eigenvectors whose eigenvalues aren’t (practically) zero, then that basically means that even though it has a few thousand variables, only eight directions “really matter”, and all the variables are just various combinations of those eight.
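Here is a rough sketch of what that counting looks like, again in Python with NumPy. The data is invented: 50 variables (standing in for the thousands) that are secretly mixes of 8 hidden factors, and the cutoff of `1e-8` is an arbitrary choice for "practically zero". It also checks the defining property of an eigenvector, namely that the matrix only stretches it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 200 samples of 50 variables, all built from 8 hidden factors.
hidden = rng.normal(size=(200, 8))
mixing = rng.normal(size=(8, 50))
data = hidden @ mixing

# Covariance between the 50 variables (rows are samples, so rowvar=False).
cov = np.cov(data, rowvar=False)

# eigh is for symmetric matrices; eigenvalues come back in ascending order.
eigvals, eigvecs = np.linalg.eigh(cov)

# Count eigenvalues that are not practically zero relative to the largest.
n_significant = int(np.sum(eigvals > 1e-8 * eigvals.max()))
print(n_significant)  # 8, matching the number of hidden factors

# Defining property: cov @ v equals eigenvalue * v (stretched, not turned).
top = eigvecs[:, -1]
print(np.allclose(cov @ top, eigvals[-1] * top))
```

With real, noisy data the small eigenvalues won't be exactly zero, so deciding where to cut is a judgment call.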

Okay, but WHICH eight?

Principal Component Analysis won’t hand you eight of your original variables; ain’t its job. What it gives you is eight new directions, each one a weighted mix of the original variables, and it *will* tell you how they all interrelate: how those eight “add up to” every one of the few thousand numbers you started with.

So each of those eight directions is an “eigenvector” of the data’s covariance matrix, and its eigenvalue tells you how much of the data’s spread lies along it (assuming that eight is the number you need to count to, no more, no less. Nine shalt thou not count, neither seven, excepting that thou then proceed to eight). Anyway.

The point is, you can sort of see it as a “dimensional space”, where if all the thousand variables collapse down to an 8-dimensional structure, Principal Component Analysis will give you eight “dimensions” (the eigenvectors) that can uniquely navigate through the space, and will give you, for each of your thousand variables, the eight weights by which a point’s eight coordinates need to be multiplied and then added up to produce that variable’s value.
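To make the "navigate with eight numbers" part concrete, here is a small sketch in Python with NumPy. As before, the data is invented (50 variables built from 8 hidden factors) and the names `components` and `scores` are just illustrative. It keeps the eight eigenvectors with the largest eigenvalues, describes every sample with eight coordinates, and then rebuilds all 50 variables from those eight.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up data: 200 samples, 50 variables, secretly 8-dimensional.
data = rng.normal(size=(200, 8)) @ rng.normal(size=(8, 50))

# PCA works on centered data, so subtract each variable's mean.
mean = data.mean(axis=0)
centered = data - mean

# Eigenvectors of the covariance matrix, ascending by eigenvalue.
eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))

# Keep the 8 directions with the largest eigenvalues (50 x 8).
components = eigvecs[:, -8:]

# Each sample's 8 coordinates along those directions (200 x 8).
scores = centered @ components

# Rebuild all 50 variables: coordinates times weights, summed, plus the mean.
rebuilt = scores @ components.T + mean

print(scores.shape)
print(np.allclose(rebuilt, data))
```

Each row of `components` holds the eight weights for one original variable, which is the "how they add up" part. On real data the rebuild is only approximate, because the discarded directions carry a little leftover variance.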

I hope that made sense? It’s kinda mathy.
