Exercise:Softmax Regression

From Ufldl

=== Step 0: Initialize constants and parameters ===

We've provided the code for this step in <tt>softmaxExercise.m</tt>. Two constants, <tt>inputSize</tt> and <tt>numClasses</tt>, corresponding to the size of each input vector and the number of class labels, have been defined in the starter code; this will allow you to reuse your code on a different data set in a later exercise. We also initialize <tt>lambda</tt>, the weight decay parameter, here.
<tt>max(M)</tt> yields a row vector with each element giving the maximum value in that column. <tt>bsxfun</tt> (short for binary singleton expansion function) applies minus along each row of <tt>M</tt>, hence subtracting the maximum of each column from every element in the column.
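For readers working outside MATLAB, the same column-wise subtraction can be sketched with NumPy broadcasting. The matrix below is illustrative (not part of the starter code): its rows play the role of classes and its columns the role of examples, with deliberately large entries to show why the subtraction matters.

```python
import numpy as np

# M(r, c) plays the role of theta_r' * x^(c): rows are classes,
# columns are examples. Illustrative values only.
M = np.array([[1000.0, 2.0],
              [1001.0, 1.0],
              [ 999.0, 0.0]])

# np.exp(M) would overflow for the first column. Subtracting each
# column's maximum first (the analogue of bsxfun(@minus, M, max(M)))
# leaves the softmax probabilities unchanged, since the constant
# cancels between numerator and denominator.
M_stable = M - M.max(axis=0)   # row vector broadcast down the rows
print(M_stable.max(axis=0))    # the largest entry in each column is now 0
```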
'''Implementation Tip:''' Computing the predictions - you may also find <tt>bsxfun</tt> useful in computing your predictions. If you have a matrix <tt>M</tt> containing the <math>e^{\theta_j^T x^{(i)}}</math> terms, such that <tt>M(r, c)</tt> contains the <math>e^{\theta_r^T x^{(c)}}</math> term, you can use the following code to compute the hypothesis (by dividing all elements in each column by their column sum):

  % M is the matrix as described in the text
  M = bsxfun(@rdivide, M, sum(M))
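The equivalent operation in NumPy is a plain broadcast division: dividing the exponentiated score matrix by its column sums turns each column into the hypothesis for one example. A minimal sketch, with illustrative values:

```python
import numpy as np

# Exponentiated scores: M(r, c) stands in for exp(theta_r' * x^(c)).
# Illustrative values only.
M = np.exp(np.array([[0.0, 1.0],
                     [1.0, 0.0],
                     [2.0, 0.5]]))

# Divide every element by its column sum (the role of
# bsxfun(@rdivide, M, sum(M)) above). Each column of H is then the
# hypothesis h(x^(c)): a probability distribution over the classes.
H = M / M.sum(axis=0)
print(H.sum(axis=0))   # each column sums to 1
```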
Use the following parameter when training your softmax classifier:

  lambda = 1e-4
=== Step 5: Testing ===
Now that you've trained your model, you will test it against the MNIST test set, which comprises 10000 28x28 images. To do so, you will first need to complete the function <tt>softmaxPredict</tt> in <tt>softmaxPredict.m</tt>, which generates predictions for input data under a trained softmax model.

Once that is done, you will be able to compute the accuracy (the proportion of correctly classified images) of your model using the code provided. Our implementation achieved an accuracy of '''92.6%'''. If your model's accuracy is significantly lower (less than 91%), check your code, ensure that you are using the trained weights, and confirm that you are training your model on the full set of 60000 training images. Conversely, if your accuracy is too high (99-100%), ensure that you have not accidentally trained your model on the test set as well.
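Note that predicting a label only requires finding, for each input column, the class whose score <math>\theta_j^T x</math> is largest; exponentiation and normalization are monotonic, so they cannot change the argmax. A minimal NumPy sketch of this idea (random data; the <tt>theta</tt> and <tt>X</tt> shapes are assumptions following the classes-by-features convention used above, not the starter code itself):

```python
import numpy as np

rng = np.random.default_rng(0)
numClasses, inputSize, numExamples = 10, 784, 5

theta = rng.standard_normal((numClasses, inputSize))   # one row per class
X = rng.standard_normal((inputSize, numExamples))      # one column per example

# Scores: entry (r, c) is theta_r' * x^(c). The predicted label for
# column c is the row with the largest score; softmax normalization
# is monotonic, so skipping it does not change the argmax.
pred = np.argmax(theta @ X, axis=0)
print(pred.shape)   # one predicted label per example
```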
[[Category:Exercises]]
{{Softmax}}

Latest revision as of 11:02, 26 May 2011
