Principal Components Analysis
== Introduction ==
Principal Components Analysis (PCA) is a dimensionality reduction algorithm that can be used to significantly speed up your unsupervised feature learning algorithm.
== Example and Mathematical Background ==
For our running example, we will use a dataset <math>\textstyle \{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}</math> with <math>\textstyle n=2</math> dimensional inputs, so that <math>\textstyle x^{(i)} \in \Re^2</math>.
[[File:PCA-rawdata.png|600px]]
This data has already been pre-processed so that each of the features <math>\textstyle x_1</math> and <math>\textstyle x_2</math> have about the same mean (zero) and variance.
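The following minimal numpy sketch (not part of the original tutorial; the row-per-example layout and the function name are assumptions) illustrates this kind of pre-processing:

<source lang="python">
import numpy as np

def standardize(X):
    """Give each feature zero mean and unit variance.
    X: m x n data matrix, one example per row (assumed layout)."""
    X = X - X.mean(axis=0)    # zero mean per feature
    return X / X.std(axis=0)  # unit variance per feature
</source>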
[[File:PCA-u1.png|600px]]
I.e., the data varies much more in the direction <math>\textstyle u_1</math> than <math>\textstyle u_2</math>.
To find these directions, we first compute the matrix <math>\textstyle \Sigma</math> as

<math>\begin{align}
\Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T.
\end{align}</math>
If <math>\textstyle x</math> has zero mean, then <math>\textstyle \Sigma</math> is exactly the covariance matrix of <math>\textstyle x</math>. (The symbol "<math>\textstyle \Sigma</math>", pronounced "Sigma", is the standard notation for denoting the covariance matrix. Unfortunately it looks just like the summation symbol, as in <math>\sum_{i=1}^n i</math>; but these are two different things.)
It then follows that <math>\textstyle u_1</math>, the principal direction of variation of the data, is the top (principal) eigenvector of <math>\textstyle \Sigma</math>, and <math>\textstyle u_2</math> is the second eigenvector.
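As a concrete illustration, here is a numpy sketch of this step (the function name and row-per-example layout are assumptions, not from the original text):

<source lang="python">
import numpy as np

def principal_directions(X):
    """Return the eigenvalues (descending) and eigenvectors of Sigma.
    X: zero-mean m x n data matrix, one example per row."""
    m = X.shape[0]
    Sigma = (X.T @ X) / m            # Sigma = (1/m) * sum_i x^(i) x^(i)^T
    lam, U = np.linalg.eigh(Sigma)   # eigh is appropriate: Sigma is symmetric
    order = np.argsort(lam)[::-1]    # largest eigenvalue first
    return lam[order], U[:, order]   # column j of U is the eigenvector u_{j+1}
</source>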
We compute the eigenvectors of <math>\textstyle \Sigma</math> and stack them in columns to form the matrix <math>\textstyle U</math>:

<math>\begin{align}
U = 
\begin{bmatrix} 
| & | \\
u_1 & u_2 \\
| & | 
\end{bmatrix}
\end{align}</math>
Here, <math>\textstyle u_1</math> is the principal eigenvector (corresponding to the largest eigenvalue), <math>\textstyle u_2</math> is the second eigenvector, and so on.
== Rotating the Data ==
Thus, we can represent <math>\textstyle x</math> in the <math>\textstyle (u_1, u_2)</math>-basis by computing

<math>\begin{align}
x_{\rm rot} = U^Tx = \begin{bmatrix} u_1^Tx \\ u_2^Tx \end{bmatrix}.
\end{align}</math>

Plotting this transformed data <math>\textstyle x_{\rm rot}</math>, we get:
[[File:PCA-rotated.png|600px]]
This is the training set rotated into the <math>\textstyle u_1</math>, <math>\textstyle u_2</math> basis. In the general case, <math>\textstyle U^Tx</math> will be the training set rotated into the basis <math>\textstyle u_1</math>, <math>\textstyle u_2</math>, ..., <math>\textstyle u_n</math>.
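Continuing the sketch above (with <code>X</code> and <code>U</code> as before, both assumed names), rotating the whole training set is a single matrix product:

<source lang="python">
def rotate(X, U):
    """Rotate each row x^(i) of X into the u-basis: x_rot = U^T x."""
    return X @ U

# Since U is orthogonal, rotate(X, U) @ U.T recovers X up to
# floating-point error, so the rotation loses no information.
</source>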
== Reducing the Data Dimension ==
We see that the principal direction of variation of the data is the first dimension <math>\textstyle x_{{\rm rot},1}</math> of this rotated data. Thus, if we want to reduce this data to one dimension, we can set
<math>\begin{align}
\tilde{x}^{(i)} = x_{{\rm rot},1}^{(i)} = u_1^Tx^{(i)} \in \Re.
\end{align}</math>
More generally, if <math>\textstyle x \in \Re^n</math> and we want to reduce it to a <math>\textstyle k</math>-dimensional representation <math>\textstyle \tilde{x} \in \Re^k</math> (where <math>\textstyle k < n</math>), we would take the first <math>\textstyle k</math> components of <math>\textstyle x_{\rm rot}</math>, which correspond to the top <math>\textstyle k</math> directions of variation.
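In the same numpy sketch, truncating to the top <math>\textstyle k</math> directions is just a matter of keeping the first <math>\textstyle k</math> columns of <code>U</code>:

<source lang="python">
def reduce_dimension(X, U, k):
    """Keep the top-k components: row i is x-tilde^(i) in R^k."""
    return X @ U[:, :k]
</source>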
== Recovering an Approximation of the Data ==
Now, <math>\textstyle \tilde{x} \in \Re^k</math> is a lower-dimensional, "compressed" representation of the original <math>\textstyle x \in \Re^n</math>. Given <math>\textstyle \tilde{x}</math>, how can we recover an approximation <math>\textstyle \hat{x}</math> to the original value of <math>\textstyle x</math>? We can compute
<math>\begin{align}
\hat{x} = U \begin{bmatrix} \tilde{x}_1 \\ \vdots \\ \tilde{x}_k \\ 0 \\ \vdots \\ 0 \end{bmatrix}
= \sum_{i=1}^k u_i \tilde{x}_i.
\end{align}</math>
The final equality above comes from the definition of <math>\textstyle U</math> [[#Example and Mathematical Background|given earlier]].
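In the running numpy sketch, this reconstruction is one more matrix product (names again assumed):

<source lang="python">
def recover(X_tilde, U):
    """Approximate reconstruction: x-hat = sum_{i=1}^k u_i * x-tilde_i."""
    k = X_tilde.shape[1]
    return X_tilde @ U[:, :k].T   # row i is x-hat^(i) in R^n
</source>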
== Number of components to retain ==
How do we set <math>\textstyle k</math>; i.e., how many PCA components should we retain? In our simple 2-dimensional example it seemed natural to retain 1 of the 2 components, but for higher-dimensional data the decision is less trivial. A standard heuristic is to choose <math>\textstyle k</math> in terms of the percentage of variance retained: if <math>\textstyle \lambda_1, \lambda_2, \ldots, \lambda_n</math> are the eigenvalues of <math>\textstyle \Sigma</math> sorted in decreasing order, we pick the smallest <math>\textstyle k</math> such that

<math>\begin{align}
\frac{\sum_{j=1}^k \lambda_j}{\sum_{j=1}^n \lambda_j} \geq 0.99,
\end{align}</math>

i.e., so that 99% of the variance is retained.
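A sketch of this selection rule, assuming <code>lam</code> holds the eigenvalues in decreasing order (the 99% threshold is a conventional default, not a requirement):

<source lang="python">
import numpy as np

def choose_k(lam, retain=0.99):
    """Smallest k whose top-k eigenvalues retain `retain` of the variance."""
    frac = np.cumsum(lam) / np.sum(lam)        # variance retained by top k
    return int(np.argmax(frac >= retain)) + 1  # first k meeting the threshold
</source>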
== PCA on Images ==
For PCA to work, usually we want each of the features <math>\textstyle x_1, x_2, \ldots, x_n</math> to have a similar range of values to the others (and to have a mean close to zero).
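For natural image patches, one common pre-processing choice, sketched here as an assumption rather than a statement of this page's exact procedure, is to subtract each patch's own mean intensity instead of standardizing every pixel separately:

<source lang="python">
def mean_normalize_patches(X):
    """Subtract each patch's mean intensity (its DC component).
    X: one flattened grayscale patch per row (assumed layout)."""
    return X - X.mean(axis=1, keepdims=True)
</source>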
== References ==
http://cs229.stanford.edu
{{PCA}}