Linear Decoders

== Sparse Autoencoder Recap ==
 
In the sparse autoencoder, we had 3 layers of neurons: an input layer, a hidden layer and an output layer.  In our previous description of autoencoders (and of neural networks), every neuron in the network used the same activation function.  In these notes, we describe a modified version of the autoencoder in which some of the neurons use a different activation function.  This will result in a model that is sometimes simpler to apply, and can also be more robust to variations in the parameters.
 
Recall that each neuron (in the output layer) computed the following:

<math>
\begin{align}
z^{(3)} &= W^{(2)} a^{(2)} + b^{(2)} \\
a^{(3)} &= f(z^{(3)})
\end{align}
</math>

where <math>a^{(3)}</math> is the output.  In the autoencoder, <math>a^{(3)}</math> is our approximate reconstruction of the input <math>x = a^{(1)}</math>.  
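
For concreteness, here is a minimal NumPy sketch of this output-layer computation (the layer sizes and the variable names <code>W2</code>, <code>b2</code>, <code>a2</code> are illustrative assumptions, not part of the original notes):

<pre>
import numpy as np

def sigmoid(z):
    # Logistic sigmoid, applied element-wise.
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical shapes: 64 input/output units, 25 hidden units.
rng = np.random.default_rng(0)
W2 = rng.normal(scale=0.01, size=(64, 25))   # weights from hidden to output layer, W^{(2)}
b2 = np.zeros(64)                            # output-layer bias, b^{(2)}
a2 = rng.uniform(size=25)                    # some hidden-layer activations, a^{(2)}

# Output layer of the sparse autoencoder:
#   z^{(3)} = W^{(2)} a^{(2)} + b^{(2)},  a^{(3)} = f(z^{(3)}) with f = sigmoid
z3 = W2 @ a2 + b2
a3 = sigmoid(z3)        # reconstruction of the input; each entry lies in (0, 1)
</pre>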
 
Because we used a sigmoid activation function for <math>f(z^{(3)})</math>, we needed to constrain or scale the inputs to be in the range <math>[0,1]</math>, since the sigmoid function outputs numbers in the range <math>[0,1]</math>.  While some datasets like MNIST fit well with this scaling of the output, this can sometimes be awkward to satisfy. For example, if one uses PCA whitening, the input is no longer constrained to <math>[0,1]</math> and it's not clear what the best way is to scale the data to ensure it fits into the constrained range.
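
As a quick illustration of the issue (a toy sketch with made-up data, not from the notes), PCA whitening applied to data that originally lies in <math>[0,1]</math> produces values well outside that range:

<pre>
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(size=(1000, 8))           # toy data, each feature in [0, 1]

# PCA whitening: rotate onto principal components and rescale to unit variance.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / Xc.shape[0]
eigvals, U = np.linalg.eigh(cov)
eps = 1e-5                                # small regularizer to avoid dividing by ~0
Xwhite = (Xc @ U) / np.sqrt(eigvals + eps)

print(Xwhite.min(), Xwhite.max())         # typically far outside [0, 1]
</pre>
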
== Linear Decoder ==

One simple fix is to have the output layer use the identity (linear) activation function <math>f(z^{(3)}) = z^{(3)}</math>, so that <math>a^{(3)} = z^{(3)}</math>; an autoencoder with this output layer is called a linear decoder, and its reconstruction is no longer constrained to lie in <math>[0,1]</math>.  The hidden layer keeps its nonlinear activation, so when applying backpropagation only the output layer's error term changes: since <math>f'(z^{(3)}) = 1</math> for the identity function, we get <math>\delta^{(3)} = -(x - a^{(3)})</math>, while the hidden layer's error term is computed as before:

<math>
\delta^{(2)} = \left( (W^{(2)})^T \delta^{(3)} \right) \bullet f'(z^{(2)})
</math>

Because the hidden layer is using a sigmoid (or tanh) activation <math>f</math>, in the equation above <math>f'(\cdot)</math> should still be the
derivative of the sigmoid (or tanh) function.
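
To make this concrete, the following is a minimal NumPy sketch of a linear decoder: a sigmoid hidden layer, an identity (linear) output layer, and the corresponding backpropagation error terms for the squared-error reconstruction cost. The layer sizes and variable names are illustrative assumptions, not part of the original notes.

<pre>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hid = 64, 25                      # hypothetical layer sizes
x = rng.uniform(size=n_in)                # one input example (also the reconstruction target)
W1 = rng.normal(scale=0.01, size=(n_hid, n_in)); b1 = np.zeros(n_hid)
W2 = rng.normal(scale=0.01, size=(n_in, n_hid)); b2 = np.zeros(n_in)

# Forward pass: sigmoid hidden layer, *linear* (identity) output layer.
z2 = W1 @ x + b1
a2 = sigmoid(z2)
z3 = W2 @ a2 + b2
a3 = z3                                   # linear decoder: f(z) = z, output not limited to [0, 1]

# Backpropagated error terms for the squared-error reconstruction cost.
delta3 = -(x - a3)                        # output layer: f'(z^{(3)}) = 1 for the identity
delta2 = (W2.T @ delta3) * a2 * (1 - a2)  # hidden layer: f'(z^{(2)}) = a2 * (1 - a2) for the sigmoid

# Gradients of the cost with respect to the parameters (single example).
gradW2 = np.outer(delta3, a2)
gradb2 = delta3
gradW1 = np.outer(delta2, x)
gradb1 = delta2
</pre>
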
{{Languages|线性解码器|中文}}
