Autoencoders and Sparsity

So far, we have described the application of neural networks to supervised learning, in which we have labeled
training examples.  Now suppose we have only a set of unlabeled training examples <math>\textstyle \{x^{(1)}, x^{(2)}, x^{(3)}, \ldots\}</math>,
where <math>\textstyle x^{(i)} \in \Re^{n}</math>.  An
'''autoencoder''' neural network is an unsupervised learning algorithm that applies backpropagation,
setting the target values to be equal to the inputs.
If the input were completely random---say, each <math>\textstyle x_i</math> comes from an IID Gaussian independent of the other
features---then this compression task would be very difficult.  But if there is
structure in the data, for example, if some of the input features are correlated,
then this algorithm will be able to discover some of those correlations.  In fact,
this simple autoencoder often ends up learning a low-dimensional representation very similar
to the one learned by PCA.
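
To make this concrete, here is a minimal sketch of such an autoencoder in Python/numpy (the sizes, synthetic data and training loop are made up for illustration; this is not a reference implementation): eight input features driven by only three latent factors are squeezed through three sigmoid hidden units, and the network is trained by plain gradient descent to reproduce its input at the output.

<pre>
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic data: 8 input features in (0, 1) that are all driven by only
# 3 underlying latent factors, so the features are strongly correlated.
m, n, s2 = 200, 8, 3
latent = rng.normal(size=(m, 3))
X = sigmoid(latent @ rng.normal(size=(3, n)))

# Encoder (W1, b1) and decoder (W2, b2); both layers use sigmoid units.
W1 = 0.1 * rng.normal(size=(n, s2)); b1 = np.zeros(s2)
W2 = 0.1 * rng.normal(size=(s2, n)); b2 = np.zeros(n)

lr = 1.0
for step in range(5000):
    a2 = sigmoid(X @ W1 + b1)                  # hidden activations
    h = sigmoid(a2 @ W2 + b2)                  # reconstruction; the target is X itself
    delta3 = (h - X) * h * (1 - h) / m         # backprop through the output sigmoid
    delta2 = (delta3 @ W2.T) * a2 * (1 - a2)   # backprop into the hidden layer
    W2 -= lr * a2.T @ delta3;  b2 -= lr * delta3.sum(axis=0)
    W1 -= lr * X.T @ delta2;   b1 -= lr * delta2.sum(axis=0)

recon = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
print("mean squared reconstruction error:", np.mean((recon - X) ** 2))
</pre>

Because the eight features really depend on only three factors, the three hidden units are enough to reconstruct the input well, which is the "compression" being discussed above.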
Our argument above relied on the number of hidden units <math>\textstyle s_2</math> being small.  But even when the number of hidden
units is large, we can still discover interesting structure by imposing other constraints on the network, in particular a
'''sparsity''' constraint on the hidden units.  Informally, we will think of a neuron as being "active" (or as "firing") if
its output value is close to 1, or as being "inactive" if its output value is
close to 0.  We would like to constrain the neurons to be inactive most of the
time.  This discussion assumes a sigmoid activation function.  If you are
using a tanh activation function, then we think of a neuron as being inactive
when it outputs values close to -1.
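
To make "inactive most of the time" concrete, the short sketch below (illustrative names, sigmoid units assumed) computes the average activation of each hidden unit over a set of training examples, together with a KL-divergence penalty that is zero only when each average equals a small target value such as 0.05 and grows as the averages drift away from it; a penalty of this form is the sparsity term added to the cost function later on this page.

<pre>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sparsity_penalty(A2, rho=0.05):
    """A2 holds the hidden activations for m training examples (shape (m, s2)).

    Returns the average activation of each hidden unit over the examples and
    the KL-divergence penalty sum_j KL(rho || rho_hat_j), which is 0 when every
    average equals the sparsity target rho and grows as the averages move away."""
    rho_hat = A2.mean(axis=0)                     # average activation of each unit
    kl = (rho * np.log(rho / rho_hat)
          + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return rho_hat, kl.sum()

# Example: units that are active about half the time incur a large penalty,
# while units that are nearly always inactive incur a small one.
rng = np.random.default_rng(0)
dense = sigmoid(rng.normal(size=(100, 5)))          # average activation near 0.5
sparse = sigmoid(rng.normal(size=(100, 5)) - 3.0)   # average activation near 0.05
print(sparsity_penalty(dense)[1], sparsity_penalty(sparse)[1])
</pre>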
Recall that <math>\textstyle a^{(2)}_j</math> denotes the activation of hidden unit <math>\textstyle j</math> in the autoencoder.
<math>\textstyle J_{\rm sparse}(W,b)</math>.  Using the derivative checking method, you will be able to verify
this for yourself as well.
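
If you would like to try the derivative check yourself, the sketch below (a generic helper applied to a toy cost, not tied to any particular autoencoder code) shows the usual centered-difference comparison: each component of the analytically computed gradient is compared against <math>\textstyle (J(\theta + \epsilon e_i) - J(\theta - \epsilon e_i))/(2\epsilon)</math>.

<pre>
import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    """Approximate the gradient of J at theta one coordinate at a time,
    using the centered difference (J(theta + eps*e_i) - J(theta - eps*e_i)) / (2*eps)."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        grad[i] = (J(theta + e) - J(theta - e)) / (2 * eps)
    return grad

# Toy cost with a known gradient, standing in for the real J_sparse(W, b).
J = lambda t: 0.5 * np.sum(t ** 2) + np.sum(np.sin(t))
analytic_gradient = lambda t: t + np.cos(t)

theta = np.random.default_rng(0).normal(size=10)
num = numerical_gradient(J, theta)
ana = analytic_gradient(theta)
# The relative difference should be tiny (around 1e-9) if the analytic gradient is correct.
print(np.linalg.norm(num - ana) / np.linalg.norm(num + ana))
</pre>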
