Principal Component Analysis (主成分分析)
From Ufldl
Line 1: | Line 1: | ||
- | First translation: @交大基层代表 @Emma_lzhang | + | First translation: @交大基层代表 @Emma_lzhang (Sina Weibo IDs) |
First review: @Dr金峰 | First review: @Dr金峰 | ||
Line 5: | Line 5: | ||
Second review: @破破的桥 | Second review: @破破的桥 | ||
- | Entered by: @Emma_lzhang | + | Entered by: @Emma_lzhang, email: emma.lzhang@gmail.com |
Line 58: | Line 58: | ||
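【Second review】: In our example, the input dataset is denoted <math>\textstyle \{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}</math>, with dimension <math>\textstyle n=2</math>, i.e. <math>\textstyle x^{(i)} \in \Re^2</math>. Suppose we want to reduce the data from 2 dimensions to 1. (In practice we might need to reduce data from, say, 256 dimensions to 50; using low-dimensional data in our example lets us visualize the algorithm more clearly.) Here is our dataset: | 【Second review】: In our example, the input dataset is denoted <math>\textstyle \{x^{(1)}, x^{(2)}, \ldots, x^{(m)}\}</math>, with dimension <math>\textstyle n=2</math>, i.e. <math>\textstyle x^{(i)} \in \Re^2</math>. Suppose we want to reduce the data from 2 dimensions to 1. (In practice we might need to reduce data from, say, 256 dimensions to 50; using low-dimensional data in our example lets us visualize the algorithm more clearly.) Here is our dataset: | ||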
- | |||
[[File:PCA-rawdata.png|600px]] | [[File:PCA-rawdata.png|600px]] | ||
+ | 【Original】: | ||
This data has already been pre-processed so that each of the features <math>\textstyle x_1</math> and <math>\textstyle x_2</math> | This data has already been pre-processed so that each of the features <math>\textstyle x_1</math> and <math>\textstyle x_2</math> | ||
have about the same mean (zero) and variance. | have about the same mean (zero) and variance. | ||
Line 69: | Line 69: | ||
algorithm, and are for illustration only. | algorithm, and are for illustration only. | ||
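- | + | 【First translation】: The data has already been pre-processed, so the features <math>\textstyle x_1</math> and <math>\textstyle x_2</math> have similar means (zero) and variances. |
+ | To make the points easier to tell apart, we colored them in three colors according to their <math>\textstyle x_1</math> value. The coloring has no effect on the algorithm and is for visual illustration only. | ||
+ | |||
+ | 【First review】: The data has already been pre-processed, so the features <math>\textstyle x_1</math> and <math>\textstyle x_2</math> have similar means (zero) and variances. |
+ | To make the points easier to tell apart, we colored them in three colors according to their <math>\textstyle x_1</math> value. The coloring has no effect on the algorithm and is for visual illustration only. | ||
+ | |||
+ | |||
+ | 【Second review】: This data has already been pre-processed, so each of the features <math>\textstyle x_1</math> and <math>\textstyle x_2</math> has the same mean (zero) and variance. |
+ | For ease of display, each point has been painted one of three colors according to its <math>\textstyle x_1</math> value. The color is not used by the algorithm and serves only for illustration. | ||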
+ | |||
+ | |||
+ | |||
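The pre-processing mentioned above is not spelled out on this page. As a rough illustration only, one common way to give every feature zero mean and equal variance is per-feature standardization. The sketch below assumes that approach; the function name `standardize` and the toy data `raw` are hypothetical and do not come from the tutorial.

```python
import statistics


def standardize(points):
    """Shift each feature to zero mean and scale it to unit variance.

    Assumes no feature is constant (a zero standard deviation would
    cause a division by zero).
    """
    columns = list(zip(*points))  # one tuple of values per feature
    means = [statistics.mean(col) for col in columns]
    # Population standard deviation (divides by m, not m - 1).
    stds = [statistics.pstdev(col) for col in columns]
    return [
        [(value - mu) / sigma for value, mu, sigma in zip(point, means, stds)]
        for point in points
    ]


# Hypothetical toy data: m = 4 points, n = 2 features on different scales.
raw = [[2.0, 10.0], [4.0, 30.0], [6.0, 20.0], [8.0, 40.0]]
scaled = standardize(raw)
```

After this step both columns of `scaled` have mean zero and variance one, which matches the "same mean (zero) and variance" condition described for the example data.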
+ | 【Original】: PCA will find a lower-dimensional subspace onto which to project our data. | ||
From visually examining the data, it appears that <math>\textstyle u_1</math> is the principal direction of | From visually examining the data, it appears that <math>\textstyle u_1</math> is the principal direction of | ||
variation of the data, and <math>\textstyle u_2</math> the secondary direction of variation: | variation of the data, and <math>\textstyle u_2</math> the secondary direction of variation: |