How does principal component analysis work?




If your data shows strong correlations along directions that aren’t parallel to any of the axes, you can use PCA to rotate the dataset onto new axes, the first of which is parallel to the direction of maximum variation.

Imagine you have a dataset on the XY plane where the points roughly fill an ellipse whose major axis lies at 30 degrees to the horizontal axis. With PCA you use the covariance matrix of the data (specifically its eigenvectors) to rotate the dataset so that the major axis of the ellipse is parallel to a new axis. This lets you isolate the direction of greatest variance, which generally carries the most information (a low-variance direction tells you less, since the data there stays close to some known value).
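To make the 30 degree example concrete, here is a minimal sketch in Python with NumPy. The data is synthetic and the spreads (3.0 along the major axis, 0.5 along the minor axis), the sample size, and all variable names are assumptions chosen for illustration. It performs the rotation by taking the eigenvectors of the covariance matrix, which is one common way to compute PCA, not the only one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: an elongated cloud whose long axis sits at 30 degrees.
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
raw = rng.normal(size=(2000, 2)) * np.array([3.0, 0.5])  # assumed spreads
X = raw @ R.T  # rotate every point by 30 degrees

# Center the data and compute its 2x2 covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)

# The eigenvectors of the covariance matrix are the new axes.
# eigh returns eigenvalues in ascending order, so sort them descending.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Rotate the data onto the new axes (major axis first).
scores = Xc @ eigvecs

# Angle of the recovered major axis, folded into [0, 180) because the
# sign of an eigenvector is arbitrary.
major = eigvecs[:, 0]
angle = np.degrees(np.arctan2(major[1], major[0])) % 180

print("recovered major-axis angle (degrees):", angle)
print("variance along each new axis:", eigvals)
```

After the rotation, the two columns of `scores` are uncorrelated, and the first column holds the variation along the major axis of the ellipse.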

You can repeat the process for the minor axis, which is perpendicular to the major one, but it carries less information because of the reduced variance.

Higher-dimensional datasets have as many new axes as they have dimensions. As you step through the new axes in order of decreasing variance, each one carries less and less information. In many cases there isn’t much useful information after the first few, and the rest can be ignored with little loss.
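The idea of dropping the low-variance axes can be sketched the same way. The example below assumes a made-up dataset of 10 features driven by 2 hidden factors plus a little noise, and an arbitrary 95% cutoff. Neither comes from a real problem, and the right cutoff depends on your data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 10 observed features driven by 2 hidden factors
# plus a small amount of noise.
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.1 * rng.normal(size=(500, 10))

# Same steps as in two dimensions: center, covariance, eigenvectors.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Fraction of the total variance carried by each new axis, largest first.
explained = eigvals / eigvals.sum()
cumulative = np.cumsum(explained)

# Keep the smallest number of axes that reaches the assumed 95% cutoff.
k = int(np.searchsorted(cumulative, 0.95) + 1)
reduced = Xc @ eigvecs[:, :k]

print("variance fraction per axis:", np.round(explained, 4))
print("axes kept:", k, "of", X.shape[1])
```

Because only two factors drive this synthetic data, nearly all of the variance lands on the first one or two new axes, and the remaining axes mostly hold noise.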