Self-Taught Learning
From Ufldl
(→Learning features) |
Line 44: | Line 44: | ||
(perhaps with appropriate whitening or other pre-processing): | (perhaps with appropriate whitening or other pre-processing): | ||
- | [[File:STL_SparseAE.png]] | + | [[File:STL_SparseAE.png|350px]] |
Having trained the parameters <math>\textstyle W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)}</math> of this model, | Having trained the parameters <math>\textstyle W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)}</math> of this model, | ||
Line 53: | Line 53: | ||
neural network: | neural network: | ||
- | [[File:STL_SparseAE_Features.png]] | + | [[File:STL_SparseAE_Features.png|300px]] |
This is just the sparse autoencoder that we previously had, with the final | This is just the sparse autoencoder that we previously had, with the final | ||
Line 73: | Line 73: | ||
\}</math> (if we use the replacement representation, and use <math>\textstyle a_l^{(i)}</math> to represent the | \}</math> (if we use the replacement representation, and use <math>\textstyle a_l^{(i)}</math> to represent the | ||
<math>\textstyle i</math>-th training example), or <math>\textstyle \{ | <math>\textstyle i</math>-th training example), or <math>\textstyle \{ | ||
- | ((x_l^{(1)}, a_l^{(1)}), y^{(1)}), ((x_l^{(2)}, a_l^{(2)}), y^{(2)}), \ldots | + | ((x_l^{(1)}, a_l^{(1)}), y^{(1)}), ((x_l^{(2)}, a_l^{(2)}), y^{(2)}), \ldots, |
((x_l^{(m_l)}, a_l^{(m_l)}), y^{(m_l)}) \}</math> (if we use the concatenated | ((x_l^{(m_l)}, a_l^{(m_l)}), y^{(m_l)}) \}</math> (if we use the concatenated | ||
representation). In practice, the concatenated representation often works | representation). In practice, the concatenated representation often works | ||
Line 91: | Line 91: | ||
various pre-processing parameters. For example, one may have computed | various pre-processing parameters. For example, one may have computed | ||
a mean value of the data and subtracted off this mean to perform mean normalization, | a mean value of the data and subtracted off this mean to perform mean normalization, | ||
- | or used PCA to compute a matrix <math>\textstyle U</math> to represent the data as <math>\textstyle U^Tx</math> (or PCA | + | or used PCA to compute a matrix <math>\textstyle U</math> to represent the data as <math>\textstyle U^Tx</math> (or used |
+ | PCA | ||
whitening or ZCA whitening). If this is the case, then it is important to | whitening or ZCA whitening). If this is the case, then it is important to | ||
save away these preprocessing parameters, and to use the ''same'' parameters | save away these preprocessing parameters, and to use the ''same'' parameters | ||
Line 102: | Line 103: | ||
labeled training set, since that might result in a dramatically different | labeled training set, since that might result in a dramatically different | ||
pre-processing transformation, which would make the input distribution to | pre-processing transformation, which would make the input distribution to | ||
- | the autoencoder very different from what it was actually trained on. | + | the autoencoder very different from what it was actually trained on. |
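As a rough illustration of the point above, the following NumPy sketch estimates the pre-processing parameters once on the unlabeled data, saves them, and applies the same parameters to the labeled data before computing features. It is only a sketch under stated assumptions: the function names (<code>fit_preprocessing</code>, <code>apply_preprocessing</code>, <code>extract_features</code>), the per-feature mean, the regularizer <code>eps</code>, the random stand-in values for <math>\textstyle W^{(1)}, b^{(1)}</math>, and the toy data are all hypothetical and not part of the tutorial's own code.

```python
import numpy as np


def fit_preprocessing(x_unlabeled, eps=1e-5):
    """Estimate mean and ZCA whitening matrix from unlabeled data.

    Columns of x_unlabeled are examples. eps is an assumed regularizer.
    """
    mean = x_unlabeled.mean(axis=1, keepdims=True)       # per-feature mean
    centered = x_unlabeled - mean
    sigma = centered @ centered.T / centered.shape[1]    # covariance matrix
    eigvals, U = np.linalg.eigh(sigma)                   # sigma is symmetric
    zca = U @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ U.T
    return {"mean": mean, "zca": zca}


def apply_preprocessing(x, params):
    """Apply previously saved parameters; nothing is re-estimated from x."""
    return params["zca"] @ (x - params["mean"])


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def extract_features(x, W1, b1):
    """Hidden-layer activations a = sigmoid(W1 x + b1) of the autoencoder."""
    return sigmoid(W1 @ x + b1)


rng = np.random.default_rng(0)
x_u = rng.normal(size=(4, 200))           # toy unlabeled data
x_l = rng.normal(loc=3.0, size=(4, 10))   # toy labeled inputs, shifted on purpose

# Fit on the unlabeled data only, then reuse the saved parameters.
params = fit_preprocessing(x_u)
x_u_pre = apply_preprocessing(x_u, params)
x_l_pre = apply_preprocessing(x_l, params)

# Stand-ins for trained autoencoder weights (random here, for illustration).
W1 = rng.normal(scale=0.1, size=(3, 4))
b1 = np.zeros((3, 1))

a_l = extract_features(x_l_pre, W1, b1)
replacement = a_l                           # features replace the input
concatenated = np.vstack([x_l_pre, a_l])    # input and features stacked
```

Calling <code>fit_preprocessing(x_l)</code> on the labeled set instead would yield a different mean and whitening matrix, which is the mismatch the paragraph above warns against.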
== On the terminology of unsupervised feature learning == | == On the terminology of unsupervised feature learning == | ||
There are two common unsupervised feature learning settings, depending on what type of | There are two common unsupervised feature learning settings, depending on what type of | ||
- | unlabeled data you have. The more powerful setting is the '''self-taught learning''' | + | unlabeled data you have. The more general and powerful setting is the '''self-taught learning''' |
setting, which does not assume that your unlabeled data <math>x_u</math> has to | setting, which does not assume that your unlabeled data <math>x_u</math> has to | ||
be drawn from the same distribution as your labeled data <math>x_l</math>. The | be drawn from the same distribution as your labeled data <math>x_l</math>. The | ||
Line 130: | Line 131: | ||
ones are motorcycles), then we could use this form of unlabeled data to | ones are motorcycles), then we could use this form of unlabeled data to | ||
learn the features. This setting---where each unlabeled example is drawn from the same | learn the features. This setting---where each unlabeled example is drawn from the same | ||
- | distribution as your labeled examples---is sometimes called the | + | distribution as your labeled examples---is sometimes called the semi-supervised |
- | setting. In practice, we | + | setting. In practice, we often do not have this sort of unlabeled data (where would you |
get a database of images where every image is either a car or a motorcycle, but | get a database of images where every image is either a car or a motorcycle, but | ||
just missing its label?), and so in the context of learning features from unlabeled | just missing its label?), and so in the context of learning features from unlabeled | ||
- | data, the self-taught learning setting is | + | data, the self-taught learning setting is more broadly applicable. |
+ | |||
+ | |||
+ | {{STL}} | ||
+ | |||
+ | |||
+ | {{Languages|自我学习|中文}} |