PCA
It can then be shown that <math>\textstyle u_1</math>---the principal direction of variation of the data---is the top (principal) eigenvector of <math>\textstyle \Sigma</math>, and <math>\textstyle u_2</math> is the second eigenvector.

'''(Important: For a mathematical derivation/formal justification of this, see the CS229 lecture notes on PCA, linked below.)'''

You can use standard numerical linear algebra software to find these eigenvectors (see Implementation Notes).
Concretely, let us compute the eigenvectors of <math>\textstyle \Sigma</math>, and stack the eigenvectors in columns to form the matrix <math>\textstyle U</math>:
to the original, and using PCA this way can significantly speed up your algorithm while introducing very little approximation error.
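The claim above can be illustrated with a small sketch (NumPy, synthetic data; the choice of reduced dimension `k = 1` is an assumption for this example). The data is projected onto the top <math>\textstyle k</math> eigenvectors and then reconstructed, and the relative approximation error stays small because most of the variance lies along the principal direction:

```python
import numpy as np

rng = np.random.default_rng(1)
# Correlated 3-D data whose variance lies mostly along one direction,
# plus a little isotropic noise -- a favorable case for PCA.
t = rng.normal(size=(1, 500))
x = np.vstack([t, 2 * t, -t]) + 0.01 * rng.normal(size=(3, 500))
x = x - x.mean(axis=1, keepdims=True)

sigma = (x @ x.T) / x.shape[1]
eigvals, eigvecs = np.linalg.eigh(sigma)
U = eigvecs[:, np.argsort(eigvals)[::-1]]   # principal eigenvector first

k = 1                                  # reduced dimension (assumed for illustration)
x_tilde = U[:, :k].T @ x               # k-dimensional representation
x_hat = U[:, :k] @ x_tilde             # approximate reconstruction of x

rel_err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
```

The 1-dimensional representation `x_tilde` can be fed to a learning algorithm in place of the original 3-dimensional `x`, and `x_hat` shows how little is lost in doing so.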
+ | |||
== References ==
http://cs229.stanford.edu