Exercise:Softmax Regression

Revision as of 00:44, 3 May 2011 (view source)

Jngiam (Talk | contribs)

(→Step 5: Testing)

Revision as of 04:32, 4 May 2011 (view source)

Jngiam (Talk | contribs)

(→Step 2: Implement softmaxCost)

Line 17:

In softmaxCost.m, implement code to compute the softmax cost function. Since minFunc minimises this cost, we consider the '''negative''' of the log-likelihood <math>-\ell(\theta)</math>, in order to maximise <math>\ell(\theta)</math>. Remember also to include the weight decay term in the cost as well. Your code should also compute the appropriate gradients, as well as the predictions for the input data (which will be used in the cross-validation step later).
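As a rough illustration of the quantities described above, the following sketch computes a cost, a gradient, and predictions on a toy problem. It is not the starter code: the variable names, the toy sizes, the layout of <tt>theta</tt> (one row per class) and <tt>data</tt> (one example per column), and the choice to average the log-likelihood over the examples are all assumptions made for this example.

```matlab
% Hypothetical toy problem; all names and sizes here are illustrative.
numClasses = 3; inputSize = 4; numCases = 5;
lambda = 1e-4;                                   % weight decay strength (assumed)
theta  = 0.005 * randn(numClasses, inputSize);   % one row of weights per class
data   = randn(inputSize, numCases);             % one example per column
labels = [1; 3; 2; 3; 1];                        % class of each example

% Ground truth indicator matrix: entry (r, c) is 1 if example c has class r.
groundTruth = full(sparse(labels, (1:numCases)', 1, numClasses, numCases));

% Class probabilities, one column per example (shifted to avoid overflow).
scores = theta * data;
scores = bsxfun(@minus, scores, max(scores, [], 1));
probs  = exp(scores);
probs  = bsxfun(@rdivide, probs, sum(probs, 1));

% Negative average log-likelihood plus the weight decay term.
cost = -sum(sum(groundTruth .* log(probs))) / numCases ...
       + (lambda / 2) * sum(theta(:) .^ 2);

% Gradient of the cost with respect to theta; same shape as theta.
thetagrad = -(groundTruth - probs) * data' / numCases + lambda * theta;

% Predicted class for each example: the row with the largest probability.
[~, pred] = max(probs, [], 1);
```

With weights this close to zero, every class receives roughly equal probability, so the cost should come out near <tt>log(numClasses)</tt>; that makes a quick sanity check before running the gradient checker.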

-

'''Implementation tip: computing the ground truth matrix''' - in your code, you may need to compute the ground truth matrix <tt>M</tt>, such that <tt>M(r, c)</tt> is 1 if <math>y^{(c)} = r</math> and 0 otherwise. This can be done quickly, without a loop, using the MATLAB functions <tt>sparse</tt> and <tt>full</tt>. <tt>sparse(r, c, v)</tt> creates a sparse matrix such that <tt>M(r(i), c(i)) = v(i)</tt> for all i. That is, the vectors <tt>r</tt> and <tt>c</tt> give the position of the elements whose values we wish to set, and <tt>v</tt> the corresponding values of the elements. Running <tt>full</tt> on a sparse matrix gives the full representation of the matrix for use. Note that the code for using <tt>sparse</tt> and <tt>full</tt> to compute the ground truth matrix has already been included in softmaxCost.m.

-

'''Implementation tip: preventing overflows''' - in softmax regression, you will have to compute the hypothesis

+

''Implementation Tip'': Computing the ground truth matrix - In your code, you may need to compute the ground truth matrix <tt>M</tt>, such that <tt>M(r, c)</tt> is 1 if <math>y^{(c)} = r</math> and 0 otherwise. This can be done quickly, without a loop, using the MATLAB functions <tt>sparse</tt> and <tt>full</tt>. <tt>sparse(r, c, v)</tt> creates a sparse matrix such that <tt>M(r(i), c(i)) = v(i)</tt> for all i. That is, the vectors <tt>r</tt> and <tt>c</tt> give the position of the elements whose values we wish to set, and <tt>v</tt> the corresponding values of the elements. Running <tt>full</tt> on a sparse matrix gives the full representation of the matrix for use. Note that the code for using <tt>sparse</tt> and <tt>full</tt> to compute the ground truth matrix has already been included in softmaxCost.m.
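A small sketch of the <tt>sparse</tt>/<tt>full</tt> idiom described in this tip, using made-up labels. The variable names and the explicit size arguments are assumptions for this example and may differ from what softmaxCost.m actually contains.

```matlab
% Hypothetical toy data: 5 examples, 3 classes (names are illustrative).
labels = [1; 3; 2; 3; 1];          % labels(c) is the class of example c
numCases = numel(labels);
numClasses = 3;

% Row index = class label, column index = example number, value = 1.
% Passing the dimensions explicitly keeps the matrix numClasses-by-numCases
% even if the highest-numbered class does not occur in this batch.
groundTruth = full(sparse(labels, (1:numCases)', 1, numClasses, numCases));

disp(groundTruth);
```

Each column of <tt>groundTruth</tt> contains a single 1, in the row given by that example's label.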

+

''Implementation Tip:'' Preventing overflows - in softmax regression, you will have to compute the hypothesis

<math>

Line 85:

Line 87:

M = bsxfun(@rdivide, M, sum(M))

-

The operation of <tt>bsxfun</tt> in this case is analogous to the earlier example.

+

The operation of <tt>bsxfun</tt> in this case is analogous to the earlier example.
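The sketch below puts the normalisation line together with one common way of preventing overflow: subtracting each column's maximum before exponentiating. The toy values and the max-subtraction step are assumptions for illustration, since the surrounding text is not fully shown in this revision view.

```matlab
% Hypothetical scores: rows are classes, columns are examples.
M = [1000 1; 1001 2; 1002 3];

% exp(1000) overflows to Inf in double precision, so shift each column by
% its maximum first. The shift cancels in the ratio computed below, so the
% resulting probabilities are unchanged.
M = bsxfun(@minus, M, max(M, [], 1));
M = exp(M);

% Divide every column by its own sum so that each column sums to 1.
M = bsxfun(@rdivide, M, sum(M, 1));
```

Here <tt>bsxfun</tt> expands the 1-by-2 row of column maxima (and later of column sums) down the rows of <tt>M</tt>, so no explicit loop or <tt>repmat</tt> is needed. Both columns end up with the same probabilities, because they differ only by a constant shift.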

=== Step 3: Gradient checking ===

From Ufldl
