Autoencoders and Sparsity
So far, we have described the application of neural networks to supervised learning, in which we have labeled training examples. Now suppose we have only a set of unlabeled training examples <math>\textstyle \{x^{(1)}, x^{(2)}, x^{(3)}, \ldots\}</math>, where <math>\textstyle x^{(i)} \in \Re^{n}</math>. An '''autoencoder''' neural network is an unsupervised learning algorithm that applies backpropagation, setting the target values to be equal to the inputs, i.e., it uses <math>\textstyle y^{(i)} = x^{(i)}</math>.
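To make the mechanics concrete, here is a minimal numpy sketch of such a network: a single sigmoid hidden layer trained by backpropagation, with the input itself used as the target. The toy data, layer sizes, learning rate, and iteration count are illustrative choices, not values taken from this page.

<pre>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Toy unlabeled data: 256 examples of 64-dimensional inputs in (0, 1).
n, n_in, n_hidden = 256, 64, 16        # illustrative sizes
X = rng.random((n, n_in))

# Parameters of a single-hidden-layer autoencoder.
W1 = 0.1 * rng.standard_normal((n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = 0.1 * rng.standard_normal((n_hidden, n_in))
b2 = np.zeros(n_in)

lr = 0.5
for step in range(2000):
    # Forward pass; the target value is the input itself, y = x.
    A2 = sigmoid(X @ W1 + b1)          # hidden activations a^(2)
    Y = sigmoid(A2 @ W2 + b2)          # reconstruction h_{W,b}(x)
    loss = 0.5 * np.mean(np.sum((Y - X) ** 2, axis=1))
    if step % 500 == 0:
        print(step, loss)

    # Backpropagation of the squared reconstruction error.
    d3 = (Y - X) * Y * (1 - Y) / n     # output-layer delta
    d2 = (d3 @ W2.T) * A2 * (1 - A2)   # hidden-layer delta
    W2 -= lr * (A2.T @ d3)
    b2 -= lr * d3.sum(axis=0)
    W1 -= lr * (X.T @ d2)
    b1 -= lr * d2.sum(axis=0)
</pre>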
[...]
If the input were completely random---say, each <math>\textstyle x_i</math> comes from an IID Gaussian independent of the other features---then this compression task would be very difficult. But if there is structure in the data, for example, if some of the input features are correlated, then this algorithm will be able to discover some of those correlations. In fact, this simple autoencoder often ends up learning a low-dimensional representation very similar to the one learned by PCA.
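This claim is easy to probe numerically. The sketch below trains a tied-weight ''linear'' autoencoder (no activation function) on synthetic correlated data and compares the learned encoding subspace with the top principal subspace; all sizes and hyperparameters are illustrative. The printed cosines of the principal angles between the two subspaces should come out close to 1.

<pre>
import numpy as np

rng = np.random.default_rng(0)

# Synthetic inputs with correlated features: 50-dimensional data lying
# near a random 3-dimensional subspace, plus a little noise.
n, d, k = 2000, 50, 3
basis = rng.standard_normal((k, d)) / np.sqrt(d)
X = rng.standard_normal((n, k)) @ basis + 0.01 * rng.standard_normal((n, d))
X -= X.mean(axis=0)                    # center the data, as PCA does

# Tied-weight linear autoencoder: encode h = x W, decode x_hat = h W^T,
# trained by gradient descent on the squared reconstruction error.
W = 0.01 * rng.standard_normal((d, k))
lr = 0.1
for _ in range(2000):
    E = X @ W @ W.T - X                # reconstruction error
    grad = (X.T @ E @ W + E.T @ X @ W) / n
    W -= lr * grad

# Cosines of the principal angles between the learned subspace and the
# top-k principal subspace; values near 1 mean the subspaces agree.
Vk = np.linalg.svd(X, full_matrices=False)[2][:k].T
Q = np.linalg.qr(W)[0]
print(np.linalg.svd(Q.T @ Vk, compute_uv=False))
</pre>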
Our argument above relied on the number of hidden units <math>\textstyle s_2</math> being small. But even when the number of hidden units is large, we can still discover interesting structure by imposing other constraints on the network, such as a sparsity constraint on the hidden units.
[...]
Informally, we will think of a neuron as being "active" (or as "firing") if its output value is close to 1, or as being "inactive" if its output value is close to 0. We would like to constrain the neurons to be inactive most of the time. This discussion assumes a sigmoid activation function. If you are using a tanh activation function, then we think of a neuron as being inactive when it outputs values close to -1.
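In the part of the derivation elided here, this constraint is enforced by penalizing the deviation of each hidden unit's average activation <math>\textstyle \hat\rho_j</math> from a small sparsity parameter <math>\textstyle \rho</math> with a KL-divergence term, weighted by <math>\textstyle \beta</math>. A minimal sketch of that penalty (the values of rho and beta are illustrative):

<pre>
import numpy as np

def kl_bernoulli(rho, rho_hat):
    # KL divergence between Bernoulli(rho) and Bernoulli(rho_hat).
    return (rho * np.log(rho / rho_hat)
            + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

def sparsity_penalty(A2, rho=0.05, beta=3.0):
    # A2: hidden activations a^(2) for all m examples, shape (m, n_hidden).
    rho_hat = A2.mean(axis=0)          # average activation of each unit
    return beta * kl_bernoulli(rho, rho_hat).sum()
</pre>

During backpropagation, this term simply adds <math>\textstyle \beta \left( -\frac{\rho}{\hat\rho_j} + \frac{1-\rho}{1-\hat\rho_j} \right)</math> to the delta of each hidden unit <math>\textstyle j</math>.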
Recall that <math>\textstyle a^{(2)}_j</math> denotes the activation of hidden unit <math>\textstyle j</math> in the autoencoder.
[...]
<math>\textstyle J_{\rm sparse}(W,b)</math>. Using the derivative checking method, you will be able to verify this for yourself as well.
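The derivative checking method mentioned here compares the backpropagated derivatives against two-sided finite differences. A minimal sketch, assuming a function J that maps a flattened parameter vector to the value of <math>\textstyle J_{\rm sparse}(W,b)</math> (the function names and tolerance are illustrative):

<pre>
import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    # Two-sided finite-difference approximation of dJ/dtheta.
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e.flat[i] = eps
        grad.flat[i] = (J(theta + e) - J(theta - e)) / (2 * eps)
    return grad

# Usage: flatten (W, b) into theta and compare against backpropagation.
# assert np.max(np.abs(numerical_gradient(J, theta) - analytic_grad)) < 1e-6
</pre>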
+ | |||
+ | |||
{{Sparse_Autoencoder}}

{{Languages|自编码算法与稀疏性|中文}}