Exercise: Implement deep networks for digit classification

==Stacked autoencoders for digit classification==

In this problem set, you will use a stacked autoencoder for digit classification. Using the method described in the previous section, you will train the stacked autoencoder layer by layer with greedy layer-wise training, building on your files from the previous assignments.

In the file <tt>stacked_ae_exercise.zip</tt>, we have provided some starter code. You will need to edit <tt>stackedAECost.m</tt>. You should also read <tt>stackedAETrain.m</tt> and ensure that you understand the steps.
=== Step 0: Initialize constants and parameters ===
Open <tt>stackedAETrain.m</tt>. In this step, we set the meta-parameters to the same values that were used in the previous exercise, which should produce reasonable results. You may modify the meta-parameters if you wish.
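
For concreteness, a setup along these lines is typical. This is a sketch only; the variable names and values here are illustrative, and the definitions in the starter code take precedence:

<pre>
% Illustrative meta-parameters (names and values are examples, not
% necessarily those used in stackedAETrain.m).
inputSize     = 28 * 28;  % MNIST images are 28x28 pixels
numClasses    = 10;       % digits 0-9
hiddenSizeL1  = 200;      % hidden units in the first autoencoder
hiddenSizeL2  = 200;      % hidden units in the second autoencoder
sparsityParam = 0.1;      % desired average activation of the hidden units
lambda        = 3e-3;     % weight decay parameter
beta          = 3;        % weight of the sparsity penalty term
</pre>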
=== Step 1: Train the data on the first autoencoder ===
Train the first autoencoder on the training images to obtain its parameters. This step is identical to the corresponding step in the sparse autoencoder and STL assignments, so if you have implemented <tt>autoencoderCost.m</tt> correctly, this step should run without needing any modifications.
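
If you are using <tt>minFunc</tt> as in the previous exercises, the training call might look like the following sketch. The <tt>initializeParameters</tt> helper and the <tt>autoencoderCost</tt> argument list are assumed to match the sparse autoencoder assignment:

<pre>
% Sketch: train the first autoencoder with L-BFGS via minFunc.
addpath minFunc/;
options.Method  = 'lbfgs';
options.maxIter = 400;

sae1Theta = initializeParameters(hiddenSizeL1, inputSize);  % random initialization
[sae1OptTheta, cost] = minFunc(@(p) autoencoderCost(p, inputSize, ...
    hiddenSizeL1, lambda, sparsityParam, beta, trainData), ...
    sae1Theta, options);

save('sae1OptTheta.mat', 'sae1OptTheta');  % cache the result; training is slow
</pre>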

=== Step 2: Train the data on the stacked autoencoder ===

Run the training set through the first autoencoder to obtain the hidden unit activations, then train the second autoencoder on this data. Since this is just an adapted application of a standard autoencoder, no changes to your code should be required. Training can take 20-30 minutes per layer, so you may wish to save your outputs to a separate file.
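
The feed-forward step might look like this sketch, which assumes the parameter vector uses the <tt>[W1(:); W2(:); b1(:); b2(:)]</tt> layout from the sparse autoencoder exercise:

<pre>
% Sketch: compute the first autoencoder's hidden unit activations.
W1 = reshape(sae1OptTheta(1 : hiddenSizeL1 * inputSize), hiddenSizeL1, inputSize);
b1 = sae1OptTheta(2 * hiddenSizeL1 * inputSize + 1 : ...
                  2 * hiddenSizeL1 * inputSize + hiddenSizeL1);

sigmoid = @(z) 1 ./ (1 + exp(-z));
sae1Features = sigmoid(bsxfun(@plus, W1 * trainData, b1));

% Train the second autoencoder on sae1Features exactly as in Step 1,
% with inputSize replaced by hiddenSizeL1.
</pre>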
=== Step 3: Implement fine-tuning ===
To implement fine-tuning, we need to consider all three layers as a single model. Implement <tt>stackedAECost.m</tt> to return the cost, gradient, and predictions of the model. The cost function is defined as the negative log likelihood plus a weight decay term. The gradient should be computed using back-propagation as discussed earlier. The predictions should consist of the activations of the output layer of the softmax model.
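
The forward pass might be structured as in the sketch below. It assumes the two autoencoder layers are stored in a cell array <tt>stack</tt> with fields <tt>w</tt> and <tt>b</tt>, and that <tt>softmaxTheta</tt> holds the softmax weights; these names are assumptions for illustration, not requirements of the starter code:

<pre>
% Sketch: forward pass through the stack, then softmax predictions.
sigmoid = @(z) 1 ./ (1 + exp(-z));

a = cell(numel(stack) + 1, 1);
a{1} = data;
for l = 1 : numel(stack)       % feed forward through the autoencoder layers
    a{l+1} = sigmoid(bsxfun(@plus, stack{l}.w * a{l}, stack{l}.b));
end

z  = softmaxTheta * a{end};
z  = bsxfun(@minus, z, max(z));        % subtract the max for numerical stability
ez = exp(z);
pred = bsxfun(@rdivide, ez, sum(ez));  % output-layer activations = predictions
</pre>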
To help you check that your implementation is correct, you can use the <tt>stackedAECheck.m</tt> script. The first part of the script runs the same input through your combined-model function and through your separate autoencoder and softmax functions, and checks that they return the same cost and predictions. The second part of the script checks that the numerical gradient of the cost function matches your analytically computed gradient. If these two checks pass, you will have implemented fine-tuning correctly.
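
The second check is the same numerical-vs-analytic comparison you implemented earlier. As a sketch, with <tt>computeNumericalGradient</tt> taken from the gradient-checking exercise and an assumed argument list for <tt>stackedAECost</tt>:

<pre>
% Sketch: compare the analytic gradient against a numerical estimate.
[cost, grad] = stackedAECost(theta, inputSize, hiddenSizeL2, ...
                             numClasses, netconfig, lambda, data, labels);
numGrad = computeNumericalGradient(@(p) stackedAECost(p, inputSize, ...
    hiddenSizeL2, numClasses, netconfig, lambda, data, labels), theta);

disp(norm(numGrad - grad) / norm(numGrad + grad));  % should be very small, e.g. ~1e-9
</pre>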
'''Note:''' Recall that the cost function is given by:

<math>
\begin{align}
J(\theta) &= -\ell(\theta) + w(\theta) \\
w(\theta) &= \frac{\lambda}{2} \sum_{i}{ \sum_{j}{ \theta_{ij}^2 } } \\
\ell(\theta) &= \sum_{i}{ \left( \theta^T_{y^{(i)}} x^{(i)} - \ln \sum_{j=1}^{n}{ e^{ \theta_j^T x^{(i)} } } \right) }
\end{align}
</math>
 
When adding the weight decay term to the cost, only the weights of the topmost (softmax) layer need to be considered. Doing so does not adversely affect the results, but it simplifies the implementation significantly.
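
In code, this means the decay term touches only <tt>softmaxTheta</tt>. A sketch, continuing the names from the earlier outline, where <tt>groundTruth</tt> is assumed to be the <tt>numClasses</tt>-by-<tt>m</tt> indicator matrix of the labels:

<pre>
% Sketch: cost and softmax-layer gradient, matching the formula above,
% with the decay term restricted to the softmax weights.
cost = -sum(sum(groundTruth .* log(pred))) ...      % -l(theta): negative log likelihood
       + (lambda / 2) * sum(softmaxTheta(:) .^ 2);  % w(theta): decay on softmaxTheta only
softmaxThetaGrad = -(groundTruth - pred) * a{end}' + lambda * softmaxTheta;

% The stack weights receive gradients from back-propagation alone,
% with no decay term added.
</pre>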

=== Step 4: Cross-validation ===

After completing these steps, running the entire script in <tt>stackedAETrain.m</tt> will perform layer-wise training of the stacked autoencoder, fine-tune the model, and measure its performance on the MNIST test set. If you have done all the steps correctly, you should get an accuracy of about 97%.
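
Measuring the accuracy might look like the following sketch; the <tt>stackedAEPredict</tt> helper, which runs the feed-forward pass and returns the arg-max class for each test image, is an assumed name for illustration:

<pre>
% Sketch: evaluate the fine-tuned model on the test set.
pred = stackedAEPredict(stackedAEOptTheta, inputSize, hiddenSizeL2, ...
                        numClasses, netconfig, testData);
acc = mean(testLabels(:) == pred(:));
fprintf('Accuracy after fine-tuning: %0.2f%%\n', acc * 100);
</pre>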
