Principal Components Analysis

== Introduction ==
Principal Components Analysis (PCA) is a dimensionality reduction algorithm
that can be used to significantly speed up your unsupervised feature learning
algorithm.
PCA will allow us to approximate the input with a much lower dimensional one, while incurring very little error.
== Example and Mathematical Background ==
For our running example, we will use a dataset <math>\textstyle \{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}</math> with <math>\textstyle n=2</math> dimensional inputs, so that each <math>\textstyle x^{(i)} \in \Re^2</math>.
Similarly, <math>\textstyle u_2^Tx</math> is the magnitude of <math>\textstyle x</math> projected onto the vector <math>\textstyle u_2</math>.
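These projections are easy to compute numerically. Below is a small NumPy sketch (our own illustration, not code from this page; the toy dataset <code>X</code> and all variable names are made up) that forms the covariance matrix <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>, takes its eigenvectors as <math>\textstyle u_1, u_2</math>, and evaluates <math>\textstyle u_1^Tx</math> and <math>\textstyle u_2^Tx</math> for every example:

<pre>
import numpy as np

# Toy zero-mean 2-D dataset (m examples as rows); purely illustrative.
m, n = 500, 2
rng = np.random.default_rng(0)
X = rng.normal(size=(m, n)) @ np.array([[2.0, 1.2],
                                        [0.0, 0.5]])   # correlated toy data
X = X - X.mean(axis=0)                                  # zero-mean each feature

# Empirical covariance matrix: Sigma = (1/m) * sum_i x^(i) x^(i)^T.
Sigma = (X.T @ X) / m

# Eigenvectors of Sigma; eigh returns eigenvalues in ascending order,
# so reorder so the principal direction comes first.
eigvals, eigvecs = np.linalg.eigh(Sigma)
order = np.argsort(eigvals)[::-1]
U = eigvecs[:, order]            # columns are u_1, u_2
u1, u2 = U[:, 0], U[:, 1]

# Magnitude of each example's projection onto u_1 and onto u_2.
proj_u1 = X @ u1                 # u_1^T x for every x
proj_u2 = X @ u2                 # u_2^T x for every x
</pre>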
== Rotating the Data ==
Thus, we can represent <math>\textstyle x</math> in the <math>\textstyle (u_1, u_2)</math>-basis by computing

:<math>x_{\rm rot} = U^Tx = \begin{bmatrix} u_1^Tx \\ u_2^Tx \end{bmatrix},</math>

where <math>\textstyle U</math> is the matrix whose columns are <math>\textstyle u_1</math> and <math>\textstyle u_2</math>.
We can also recover the original data from the rotated version by computing <math>\textstyle x = U x_{\rm rot}</math>,
because <math>\textstyle U x_{\rm rot} =  UU^T x = x</math>.
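Continuing the sketch above (reusing the toy <code>X</code> and the eigenvector matrix <code>U</code>; again our own illustration rather than code from the tutorial), rotating the data into the <math>\textstyle (u_1, u_2)</math>-basis and rotating back are each a single matrix multiplication:

<pre>
# Rotate every example into the (u_1, u_2)-basis: x_rot = U^T x.
# With examples stored as rows of X, this is right-multiplication by U.
X_rot = X @ U

# Rotate back: x = U x_rot.  Because U is orthogonal (U U^T = I),
# this recovers the original data up to floating-point error.
X_back = X_rot @ U.T
assert np.allclose(X_back, X)
</pre>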
== Reducing the Data Dimension ==
We see that the principal direction of variation of the data is the first dimension <math>\textstyle x_{{\rm rot},1}</math> of the rotated data.
When we do this, we also say that we are "retaining the top <math>\textstyle k</math> PCA (or principal) components."
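Sticking with the same illustrative sketch, retaining the top <math>\textstyle k</math> components simply means keeping the first <math>\textstyle k</math> coordinates of <math>\textstyle x_{\rm rot}</math> (truncating; zeroing out the remaining coordinates is equivalent for the purposes below):

<pre>
k = 1                       # number of principal components to retain
X_tilde = X_rot[:, :k]      # tilde{x}: the top-k components of x_rot
</pre>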
== Recovering an Approximation of the Data ==
Now, <math>\textstyle \tilde{x} \in \Re^k</math> is a lower-dimensional, "compressed" representation of the original <math>\textstyle x \in \Re^n</math>.
The recovered approximation <math>\textstyle \hat{x}</math> stays close to the original data, so we can work with the compressed representation while introducing very little approximation error.
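Under the same assumptions as before, the approximate recovery can be sketched by padding <math>\textstyle \tilde{x}</math> with zeros back to <math>\textstyle n</math> dimensions and rotating back into the original basis:

<pre>
# hat{x} = U * [tilde{x}; 0]: pad with zeros, then undo the rotation.
X_rot_approx = np.zeros_like(X_rot)
X_rot_approx[:, :k] = X_tilde
X_hat = X_rot_approx @ U.T

# A rough measure of how much was lost by keeping only k components.
mean_sq_error = np.mean((X - X_hat) ** 2)
</pre>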
== Number of components to retain ==
How do we set <math>\textstyle k</math>; i.e., how many PCA components should we retain?  In our 2-dimensional example, it was natural to retain 1 of the 2 components, but for higher dimensional data, this decision is less trivial to make.
Saying that you chose <math>\textstyle k</math> so as to retain 95% of the variance is usually a more easily interpretable description than saying that you retained 120 (or whatever other number of) components.
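A common way to choose <math>\textstyle k</math> is by the fraction of variance retained, computed from the eigenvalues of <math>\textstyle \Sigma</math>. The sketch below (ours, continuing the earlier toy example; the 99% threshold is only an illustrative figure) picks the smallest <math>\textstyle k</math> whose leading eigenvalues account for at least 99% of the total variance:

<pre>
# Fraction of variance retained by the top k components:
#   (lambda_1 + ... + lambda_k) / (lambda_1 + ... + lambda_n)
lam = eigvals[order]                          # eigenvalues, largest first
variance_retained = np.cumsum(lam) / np.sum(lam)

# Smallest k retaining at least 99% of the variance.
k = int(np.searchsorted(variance_retained, 0.99) + 1)
</pre>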
== PCA on Images ==
For PCA to work, usually we want each of the features <math>\textstyle x_1, x_2, \ldots, x_n</math>
to have a similar range of values to the others (and to have a mean close to zero).
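For concreteness, here is a small sketch of such preprocessing (our own illustration with a made-up matrix <code>X_img</code> of flattened image patches, not code from this page):

<pre>
import numpy as np

rng = np.random.default_rng(1)
X_img = rng.uniform(0.0, 1.0, size=(1000, 256))   # made-up flattened patches

# Option 1: make each feature (pixel position) zero-mean across the dataset.
X_feat = X_img - X_img.mean(axis=0)

# Option 2, common for image patches: subtract each patch's own mean intensity,
# so every example individually has a mean close to zero.
X_patch = X_img - X_img.mean(axis=1, keepdims=True)
</pre>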
