Autoencoders and Sparsity
[[Image:Autoencoder636.png|400px|center]]

【Original text】

The autoencoder tries to learn a function <math>\textstyle h_{W,b}(x) \approx x</math>. In other
words, it is trying to learn an approximation to the identity function, so as
to output <math>\textstyle \hat{x}</math> that is similar to <math>\textstyle x</math>. The identity function seems a
particularly trivial function to be trying to learn; but by placing constraints
on the network, such as by limiting the number of hidden units, we can discover
interesting structure in the data. As a concrete example, suppose the
inputs <math>\textstyle x</math> are the pixel intensity values from a <math>\textstyle 10 \times 10</math> image (100
pixels), so <math>\textstyle n=100</math>, and there are <math>\textstyle s_2=50</math> hidden units in layer <math>\textstyle L_2</math>. Note that
we also have <math>\textstyle y \in \Re^{100}</math>. Since there are only 50 hidden units, the
network is forced to learn a ''compressed'' representation of the input.
That is, given only the vector of hidden unit activations <math>\textstyle a^{(2)} \in \Re^{50}</math>,
it must try to '''reconstruct''' the 100-pixel input <math>\textstyle x</math>. If the input were completely
random (say, each <math>\textstyle x_i</math> comes from an IID Gaussian independent of the other
features), then this compression task would be very difficult. But if there is
structure in the data, for example, if some of the input features are correlated,
then this algorithm will be able to discover some of those correlations. In fact,
this simple autoencoder often ends up learning a low-dimensional representation very similar
to PCA's.
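
【Initial translation】

An autoencoder neural network tries to learn a function <math>\textstyle h_{W,b}(x) \approx x</math>. In other words, it tries to approximate the identity function, so that the output <math>\textstyle \hat{x}</math> is close to the input <math>\textstyle x</math>. Although the identity function looks trivial to learn, once we impose constraints on the autoencoder, such as limiting the number of hidden units, we can discover interesting structure in the input data. For example, suppose the input <math>\textstyle x</math> of an autoencoder is the pixel intensities of a <math>\textstyle 10 \times 10</math> image, so <math>\textstyle n=100</math>, and its hidden layer <math>\textstyle L_2</math> has <math>\textstyle s_2=50</math> hidden units. Note that the output is also 100-dimensional: <math>\textstyle y \in \Re^{100}</math>. Because there are only 50 hidden units, we force the autoencoder to learn a ''compressed'' representation of the input: it has to '''reconstruct''' the 100-dimensional pixel input <math>\textstyle x</math> from the 50-dimensional vector of hidden unit activations <math>\textstyle a^{(2)} \in \Re^{50}</math>. If the inputs were completely random, say each input <math>\textstyle x_i</math> were an IID Gaussian random variable independent of the other features, this compressed representation would be very hard to learn. But if the input data contains some structure, for instance if some input features are correlated, the algorithm can discover those correlations. In fact, this simple autoencoder can often learn a low-dimensional representation of the input that is very similar to the result of principal component analysis (PCA).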
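
The 100-input, 50-hidden-unit example above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the tutorial's own implementation: the names <code>sigmoid</code>, <code>train_autoencoder</code> and <code>encode</code> are hypothetical, the network uses sigmoid units in both layers with a plain squared-error cost (no weight decay or sparsity penalty), and the "images" are synthetic vectors built from a few hidden factors so that the pixels are correlated.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden=50, lr=0.5, epochs=300, seed=0):
    """Fit h_{W,b}(x) ~ x by batch gradient descent (hypothetical helper).

    X has shape (m, n) with values in [0, 1].
    Returns the parameters (W1, b1, W2, b2) and the per-epoch cost.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W1 = rng.normal(0.0, 0.1, (n, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n))
    b2 = np.zeros(n)
    losses = []
    for _ in range(epochs):
        A2 = sigmoid(X @ W1 + b1)        # hidden activations a^(2), shape (m, n_hidden)
        X_hat = sigmoid(A2 @ W2 + b2)    # reconstruction x_hat, shape (m, n)
        err = X_hat - X
        losses.append(0.5 * np.mean(np.sum(err ** 2, axis=1)))
        # Backpropagation for the squared-error cost with sigmoid units.
        d3 = err * X_hat * (1.0 - X_hat)
        d2 = (d3 @ W2.T) * A2 * (1.0 - A2)
        W2 -= lr * (A2.T @ d3) / m
        b2 -= lr * d3.mean(axis=0)
        W1 -= lr * (X.T @ d2) / m
        b1 -= lr * d2.mean(axis=0)
    return (W1, b1, W2, b2), losses

def encode(params, X):
    """Return the compressed representation a^(2) for each row of X."""
    W1, b1, _, _ = params
    return sigmoid(X @ W1 + b1)

# Synthetic data (an assumption for illustration): 200 "images" of 100 pixels.
rng = np.random.default_rng(42)
factors = rng.normal(size=(200, 5))                      # 5 hidden causes per image
mixing = rng.normal(size=(5, 100)) / np.sqrt(5.0)
X_structured = sigmoid(factors @ mixing)                 # correlated pixels
X_random = sigmoid(rng.normal(size=(200, 100)))          # independent pixels

params_s, losses_s = train_autoencoder(X_structured)
params_r, losses_r = train_autoencoder(X_random)
print("structured: first/last cost", losses_s[0], losses_s[-1])
print("random:     first/last cost", losses_r[0], losses_r[-1])
print("code shape:", encode(params_s, X_structured).shape)
```

Comparing the two printed cost pairs gives a rough feel for the claim that correlated inputs are easier to compress than independent ones; the exact numbers depend on the seed, the learning rate and the number of epochs, so they should be read as indicative only.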