Exercise:Softmax Regression

== Softmax regression ==

In this problem set, you will use [[softmax regression]] to classify MNIST images. The goal of this exercise is to build a softmax classifier that you can reuse in future exercises, as well as on other classification problems you might encounter.
In the file <tt>[http://ufldl.stanford.edu/wiki/resources/softmax_exercise.zip softmax_exercise.zip]</tt>, we have provided some starter code. You should write your code in the places indicated by "YOUR CODE HERE" in the files.

You will need to modify '''<tt>softmaxCost.m</tt>''' and '''<tt>softmaxPredict.m</tt>''' for this exercise.
=== Support Code/Data ===
=== Step 1: Load data ===

The starter code loads the MNIST images and labels into <tt>inputData</tt> and <tt>labels</tt> respectively. The images are pre-processed to scale the pixel values to the range <math>[0, 1]</math>, and the label 0 is remapped to 10 for convenience of implementation. You will not need to change any code in this step for this exercise, but note that your code should be general enough to operate on data of arbitrary size belonging to any number of classes.
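For reference, here is a minimal sketch of what the loading step produces. It assumes the <tt>loadMNISTImages</tt>/<tt>loadMNISTLabels</tt> helpers included with the starter code, with the standard MNIST training file names; you do not need to write this yourself.

<pre>
% Sketch of the data loading done by the starter code.
images = loadMNISTImages('train-images-idx3-ubyte');  % 784 x 60000, scaled to [0, 1]
labels = loadMNISTLabels('train-labels-idx1-ubyte');  % 60000 x 1, values 0-9

labels(labels == 0) = 10;   % remap label 0 to 10, so labels lie in 1..10
inputData = images;         % each column is one training example
</pre>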
=== Step 2: Implement softmaxCost ===

In <tt>softmaxCost.m</tt>, implement code to compute the softmax cost function. Since <tt>minFunc</tt> minimizes this cost, we consider the '''negative''' of the log-likelihood <math>-\ell(\theta)</math>, in order to maximize <math>\ell(\theta)</math>. Remember to include the weight decay term in the cost as well. Your code should also compute the appropriate gradients, as well as the predictions for the input data (which will be used in the cross-validation step later).
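For concreteness, the quantity being minimized is the weight-decayed negative log-likelihood from the [[Softmax Regression]] notes, where <math>m</math> is the number of examples and <math>k</math> the number of classes:

<math>
J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{k} 1\left\{y^{(i)} = j\right\} \log \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}} \right] + \frac{\lambda}{2} \sum_{i=1}^{k} \sum_{j=0}^{n} \theta_{ij}^2
</math>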
It is important to vectorize your code so that it runs quickly. We also provide several implementation tips below:
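As one such tip: exponentiating large scores can overflow, so it is common to subtract the column-wise maximum before calling <tt>exp</tt> (this leaves the softmax probabilities unchanged). Below is a minimal vectorized sketch of <tt>softmaxCost.m</tt>; it assumes the starter code's argument list (<tt>theta</tt>, <tt>numClasses</tt>, <tt>inputSize</tt>, <tt>lambda</tt>, <tt>data</tt>, <tt>labels</tt>), which you should check against the file you were given:

<pre>
function [cost, grad] = softmaxCost(theta, numClasses, inputSize, lambda, data, labels)
% Sketch of a vectorized softmax cost/gradient; verify the argument list
% against the starter code before reusing this.
theta = reshape(theta, numClasses, inputSize);   % unroll parameters
numCases = size(data, 2);

% groundTruth(c, i) = 1 iff example i has label c.
groundTruth = full(sparse(labels, 1:numCases, 1, numClasses, numCases));

% Class scores; subtract the per-example max so exp cannot overflow.
M = theta * data;                                % numClasses x numCases
M = bsxfun(@minus, M, max(M, [], 1));
probs = exp(M);
probs = bsxfun(@rdivide, probs, sum(probs, 1)); % softmax probabilities

% Negative log-likelihood plus the weight decay term.
cost = -sum(sum(groundTruth .* log(probs))) / numCases ...
       + (lambda / 2) * sum(theta(:) .^ 2);

% Gradient, returned unrolled to match theta.
grad = -(groundTruth - probs) * data' / numCases + lambda * theta;
grad = grad(:);
end
</pre>

After implementing the cost, it is worth sanity-checking the gradient numerically on a small random problem (for example with the <tt>computeNumericalGradient</tt> function from the earlier exercises) before running the full optimization.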
