Exercise:Softmax Regression
From Ufldl
== Softmax regression ==
In this problem set, you will use [[softmax regression]] to classify MNIST images. The goal of this exercise is to build a softmax classifier that you will be able to reuse in future exercises, and also on other classification problems you might encounter.

In the file <tt>[http://ufldl.stanford.edu/wiki/resources/softmax_exercise.zip softmax_exercise.zip]</tt>, we have provided some starter code. You should write your code in the places indicated by "YOUR CODE HERE" in the files.

You will need to modify '''<tt>softmaxCost.m</tt>''' and '''<tt>softmaxPredict.m</tt>''' for this exercise.
=== Support Code/Data ===
=== Step 1: Load data ===
The starter code loads the MNIST images and labels into <tt>inputData</tt> and <tt>labels</tt> respectively. The images are pre-processed to scale the pixel values to the range <math>[0, 1]</math>, and the label 0 is remapped to 10 for convenience of implementation. You will not need to change any code in this step for this exercise, but note that your code should be general enough to operate on data of arbitrary size, belonging to any number of classes.
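The starter code performs this pre-processing in MATLAB, but the two transformations it applies can be sketched as follows in Python/NumPy (the variable names here are illustrative, not the actual starter-code variables):

```python
import numpy as np

# Toy stand-ins for MNIST data: raw byte pixels and digit labels 0-9.
raw_pixels = np.array([[0, 128], [255, 64]], dtype=np.uint8)
raw_labels = np.array([0, 3, 7, 0])

# Scale pixel values from [0, 255] into [0, 1].
input_data = raw_pixels.astype(np.float64) / 255.0

# Remap the digit 0 to class 10, so classes are 1..10 (1-based indexing).
labels = raw_labels.copy()
labels[labels == 0] = 10
```

The 0-to-10 remapping exists only because MATLAB arrays are 1-indexed; it lets the label double as a row index into the parameter matrix.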
=== Step 2: Implement softmaxCost ===
In <tt>softmaxCost.m</tt>, implement code to compute the softmax cost function. Since minFunc minimizes this cost, we work with the '''negative''' log-likelihood <math>-\ell(\theta)</math>, so that minimizing the cost maximizes <math>\ell(\theta)</math>. Remember to include the weight decay term in the cost as well. Your code should also compute the appropriate gradients, as well as the predictions for the input data (which will be used in the cross-validation step later).
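The exercise is completed in MATLAB, but as a rough illustration of what a fully vectorized cost-and-gradient computation looks like, here is a sketch in Python/NumPy. The function name and argument order are our own, not the starter code's:

```python
import numpy as np

def softmax_cost(theta, num_classes, input_size, lam, data, labels):
    """Negative log-likelihood with weight decay, plus its gradient.

    theta:  (num_classes * input_size,) flattened parameter vector
    data:   (input_size, m) matrix, one example per column
    labels: (m,) integer class labels in 1..num_classes
    """
    theta = theta.reshape(num_classes, input_size)
    m = data.shape[1]

    # Indicator matrix: groundtruth[k, i] = 1 iff example i has label k+1.
    groundtruth = np.zeros((num_classes, m))
    groundtruth[labels - 1, np.arange(m)] = 1.0

    # Class scores; subtract each column's max so exp() cannot overflow
    # (this leaves the resulting probabilities unchanged).
    scores = theta @ data
    scores -= scores.max(axis=0, keepdims=True)
    probs = np.exp(scores)
    probs /= probs.sum(axis=0, keepdims=True)

    # Average negative log-likelihood plus the weight decay term.
    cost = -np.sum(groundtruth * np.log(probs)) / m \
           + 0.5 * lam * np.sum(theta ** 2)

    # Gradient with respect to theta, flattened like the input.
    grad = -(groundtruth - probs) @ data.T / m + lam * theta
    return cost, grad.ravel()
```

With <math>\theta = 0</math> the predicted distribution is uniform, so the cost reduces to <math>\log k</math> for <math>k</math> classes, which is a convenient sanity check before running the full gradient check.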
It is important to vectorize your code so that it runs quickly. We also provide several implementation tips below: