Exercise:Self-Taught Learning

From Ufldl

(Overview)
Line 5:
You will be building upon your code from the earlier exercises. First, you will train your sparse autoencoder on an "unlabeled" training dataset of handwritten digits. This produces features that are penstroke-like. We then extract these learned features from a labeled dataset of handwritten digits. These features will then be used as inputs to the softmax classifier that you wrote in the previous exercise.
-
Concretely, for each example in the labeled training dataset <math>\textstyle x^{(k)}</math>, we forward propagate the example to obtain the activation of the hidden units <math>\textstyle a^{(2)}</math>. The data is now represented in terms of <math>\textstyle a^{(2)}</math>, which is used to train the softmax classifier.
+
Concretely, for each example in the labeled training dataset <math>\textstyle x_l</math>, we forward propagate the example to obtain the activation of the hidden units <math>\textstyle a^{(2)}</math>. We now represent this example using <math>\textstyle a^{(2)}</math> (the "replacement" representation), and use this as the new feature representation with which to train the softmax classifier.
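As a sketch of this feature-extraction step, assume that <math>\textstyle W^{(1)}</math> and <math>\textstyle b^{(1)}</math> denote the first-layer weights and biases of the trained sparse autoencoder and that <math>\textstyle f</math> is the sigmoid activation. These symbols do not appear in the text above and are assumptions made here for illustration. Under those assumptions, the replacement representation of a labeled example would be:

<math>\textstyle a^{(2)} = f\left(W^{(1)} x_l + b^{(1)}\right), \qquad f(z) = \frac{1}{1 + e^{-z}}</math>

Only the hidden-layer activation is needed for this step; the autoencoder's reconstruction of the input is not used when training the softmax classifier.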
-
Finally, we also extract the same features from the test dataset to obtain predictions.
+
Finally, we also extract the same features from the test data to obtain predictions.
-
We will use the digits 5 to 9 as an "unlabeled" dataset, while the digits 0 to 4 are used as the labeled training set.
+
In this exercise, our goal is to distinguish between the digits from 0 to 4. We will use the digits 5 to 9 as our "unlabeled" dataset with which to learn the features; we will then use a labeled dataset with the digits 0 to 4 with which to train the softmax classifier.
-
In the starter code, we have also provided '''<tt>stlExercise.m</tt>''' that will help walk you through the steps in this exercise.
+
In the starter code, we have provided a file '''<tt>stlExercise.m</tt>''' that will help walk you through the steps in this exercise.
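The data flow described above (split the digits, then map each labeled example to its hidden-unit activations) can be sketched in a few lines. This is a minimal illustration in plain Python rather than the exercise's MATLAB starter code, and it is not taken from <tt>stlExercise.m</tt>. The function names <tt>split_by_digit</tt> and <tt>extract_features</tt>, the toy data, and the weights <tt>W1</tt> and <tt>b1</tt> are all hypothetical; in the exercise the weights would come from training the sparse autoencoder on the unlabeled set, and a sigmoid activation is assumed.

```python
import math


def sigmoid(z):
    """Logistic activation, assumed here for the hidden units."""
    return 1.0 / (1.0 + math.exp(-z))


def split_by_digit(images, labels):
    """Digits 5-9 form the "unlabeled" set (labels dropped);
    digits 0-4 form the labeled set."""
    unlabeled = [x for x, y in zip(images, labels) if y >= 5]
    labeled = [(x, y) for x, y in zip(images, labels) if y <= 4]
    return unlabeled, labeled


def extract_features(W1, b1, x):
    """Forward propagate one example to the hidden layer:
    a2 = sigmoid(W1 * x + b1)."""
    return [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]


# Toy stand-ins for image vectors and their digit labels (hypothetical data).
images = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
labels = [7, 3, 9, 0]
unlabeled, labeled = split_by_digit(images, labels)

# Hypothetical weights: 3 hidden units, 2 inputs. In the exercise these
# would be learned by the sparse autoencoder from `unlabeled`.
W1 = [[1.0, -1.0], [0.0, 0.0], [2.0, 2.0]]
b1 = [0.0, 0.0, -1.0]

# Replace each labeled input with its hidden activations; these pairs
# are what the softmax classifier would be trained on.
train_features = [(extract_features(W1, b1, x), y) for x, y in labeled]
```

The same <tt>extract_features</tt> call would be applied to the test data before making predictions, so that the classifier sees the same representation at training and test time.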
=== Dependencies ===

Revision as of 23:36, 10 May 2011
