principal component analysis

Anonymous

Suppose you collect data on the age and height of children. These two quantities are correlated: older children tend to be taller. So if you make a scatter plot of age against height, you’ll see a cloud of points roughly following an upward-sloping diagonal. PCA is a mathematical operation that rotates the coordinate system so that the new x-axis runs directly along that diagonal. Most of the variance now lies along the new x-axis, and much less along the new y-axis.
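To make the rotation concrete, here is a minimal sketch in Python with NumPy. The ages and heights are synthetic numbers made up for illustration, and `pca_rotate` is a hypothetical helper name, not a library function. It centers the data, takes the eigenvectors of the covariance matrix as the new axes, and expresses each point in those axes. It also skips rescaling the two columns to comparable units, which is often done in practice.

```python
import numpy as np

def pca_rotate(data):
    """Rotate centered data onto its principal axes, largest variance first.

    Hypothetical helper for illustration only.
    """
    centered = data - data.mean(axis=0)          # move the cloud to the origin
    cov = np.cov(centered, rowvar=False)         # covariance of the columns
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # reorder: largest variance first
    eigvecs = eigvecs[:, order]
    return centered @ eigvecs, eigvals[order]    # coordinates in the rotated system

# Synthetic data (an assumption, not real measurements):
rng = np.random.default_rng(0)
age = rng.uniform(2, 12, size=200)                    # years
height = 85 + 6 * age + rng.normal(0, 4, size=200)    # centimetres
data = np.column_stack([age, height])

rotated, variances = pca_rotate(data)

# Variance along the new x-axis and new y-axis
print(np.var(rotated, axis=0, ddof=1))
# Correlation between the new axes (numerically close to zero)
print(np.corrcoef(rotated, rowvar=False)[0, 1])
```

After the rotation, the first column carries most of the spread and the two columns are no longer correlated, which is the "diagonal becomes the x-axis" picture in numbers.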

For two dimensions, this isn’t tremendously important, but if you have highly correlated data in many dimensions (say, hundreds of genetic markers, or spectral data sampled at hundreds of wavelengths), PCA lets you rotate the coordinate system so that you can plot the data using only the first two new axes and still lose as little information (variance) as possible.
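The many-dimensions case can be sketched the same way. The example below is an illustration under stated assumptions: the "spectra" are synthetic, built from two hidden factors plus a little noise, and `pca_project` is a hypothetical helper name. It uses the singular value decomposition of the centered data, which is one common way to compute the principal axes, keeps the first two, and reports what fraction of the total variance they retain.

```python
import numpy as np

def pca_project(data, n_components=2):
    """Project data onto its first principal axes via SVD.

    Hypothetical helper for illustration only.
    """
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T      # coordinates on the kept axes
    explained = s**2 / np.sum(s**2)              # fraction of variance per axis
    return scores, explained[:n_components]

# Synthetic, highly correlated data (an assumption, not real spectra):
# 100 samples measured at 300 "wavelengths", driven by 2 hidden factors.
rng = np.random.default_rng(1)
n_samples, n_features = 100, 300
latent = rng.normal(size=(n_samples, 2))
mixing = rng.normal(size=(2, n_features))
spectra = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_features))

scores, explained = pca_project(spectra)

print(scores.shape)       # two coordinates per sample, ready for a scatter plot
print(explained.sum())    # share of total variance kept by the first two axes
```

Because the 300 columns here are driven by only two underlying factors, the first two axes keep almost all of the variance. With real data the retained share depends on how strongly the measurements are correlated.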
