Principal Components Analysis
From Ufldl
【Original】: Suppose you are training your algorithm on images. Then the input will be somewhat redundant, because the values of adjacent pixels in an image are highly correlated. Concretely, suppose we are training on 16x16 grayscale images, so that <math>\textstyle x \in \Re^{256}</math> is a 256-dimensional vector, with each feature <math>\textstyle x_j</math> corresponding to the intensity of one pixel. Because of the correlation between adjacent pixels, PCA will allow us to approximate the input with a much lower dimensional one, while incurring very little error.

【First draft】: Suppose the algorithm is trained on image input. The input data is then somewhat redundant, because adjacent pixels in an image are highly correlated. Concretely, suppose we are training on 16x16 grayscale images; then <math>\textstyle x \in \Re^{256}</math> is a 256-dimensional vector, with each feature <math>\textstyle x_j</math> representing the intensity value of each pixel. Because adjacent pixels are correlated, PCA can, while incurring only very little error, approximately project the input into a much lower-dimensional space.

【First review】: Suppose you use images to train the algorithm. The input data is then somewhat redundant, because adjacent pixels in an image are highly correlated. Concretely, suppose we are training on 16x16 grayscale images; then <math>\textstyle x \in \Re^{256}</math> is a 256-dimensional vector, with each feature <math>\textstyle x_j</math> representing the intensity value of each pixel. Because adjacent pixels are correlated, PCA can, while incurring only very little error, approximately project the input into a much lower-dimensional space.

【Second review】: Suppose you use images to train the algorithm; the input data is then somewhat redundant, because adjacent pixels in an image are highly correlated. Concretely, suppose we are training on 16x16 grayscale images; then <math>\textstyle x \in \Re^{256}</math> is a 256-dimensional vector, where each feature <math>\textstyle x_j</math> corresponds to the brightness value of one pixel. Because adjacent pixels are correlated, PCA can convert the input vector into an approximate vector of much lower dimension while producing only a tiny error.

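The claim above — that correlated 256-dimensional inputs can be projected to far fewer dimensions with very little error — can be sketched numerically. The following is a minimal NumPy illustration, not code from the tutorial: the synthetic data, the choice of <code>k = 32</code>, and all variable names are assumptions for demonstration. It follows the standard PCA recipe (zero-mean the data, eigendecompose the covariance matrix, project onto the leading eigenvectors) and measures the relative reconstruction error.

```python
import numpy as np

# Illustrative sketch (not from the tutorial): PCA on synthetic correlated data.
rng = np.random.default_rng(0)
n, d, k = 500, 256, 32          # examples, input dim (16x16 "image"), reduced dim

# Build correlated features: cumulative sums make adjacent columns
# highly correlated, loosely mimicking adjacent pixels in an image.
base = rng.normal(size=(n, d))
x = np.cumsum(base, axis=1) / np.sqrt(np.arange(1, d + 1))

x = x - x.mean(axis=0)                    # zero-mean each feature
sigma = x.T @ x / n                       # empirical covariance matrix
eigvals, u = np.linalg.eigh(sigma)        # eigh returns ascending eigenvalues
u = u[:, ::-1]                            # principal directions first

x_tilde = x @ u[:, :k]                    # project down to k dimensions
x_hat = x_tilde @ u[:, :k].T              # reconstruct the approximation

rel_err = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
print(f"relative reconstruction error with k={k}: {rel_err:.3f}")
```

With strongly correlated inputs the covariance spectrum decays quickly, so even keeping only 32 of the 256 directions leaves a small relative error, which is exactly the redundancy the paragraph describes.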
== Example and Mathematical Background ==