Exercise:Softmax Regression

=== Step 2: Implement softmaxCost ===
In <tt>softmaxCost.m</tt>, implement code to compute the softmax cost function <math>J(\theta)</math>. Remember to include the weight decay term in the cost as well. Your code should also compute the appropriate gradients, as well as the predictions for the input data (which will be used in the cross-validation step later).  

It is important to vectorize your code so that it runs quickly. We also provide several implementation tips below:
For a classification problem with <math>k</math> classes, the hypothesis for an example <math>x^{(i)}</math> is the vector of estimated class probabilities:
<math>
\begin{align}
h(x^{(i)}) =
\frac{1}{ \sum_{j=1}^{k}{e^{ \theta_j^T x^{(i)} }} }
\begin{bmatrix}
e^{ \theta_1^T x^{(i)} } \\
e^{ \theta_2^T x^{(i)} } \\
\vdots \\
e^{ \theta_k^T x^{(i)} } \\
\end{bmatrix}
\end{align}
</math>
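
Since the same normalization applies to every class, the hypothesis can be evaluated for all examples at once. Below is a minimal Octave/MATLAB sketch, assuming <tt>theta</tt> holds one row of parameters per class (<tt>k</tt>-by-<tt>n</tt>) and <tt>data</tt> holds one example per column (<tt>n</tt>-by-<tt>m</tt>); the variable names are illustrative, not those of the starter code:

<pre>
% Illustrative sketch, not the official solution.
M = theta * data;                          % k-by-m; entry (j,i) is theta_j' * x^(i)
expM = exp(M);                             % element-wise exponentials
h = bsxfun(@rdivide, expM, sum(expM, 1));  % divide each column by its sum
</pre>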
One practical issue: when the products <math>\theta_j^T x^{(i)}</math> are large, the exponentials <math>e^{\theta_j^T x^{(i)}}</math> may overflow. To prevent this, you can subtract some large constant <math>\alpha</math> (for instance, <math>\alpha = \max_j \theta_j^T x^{(i)}</math>) from each exponent before exponentiating; as the following derivation shows, this does not change the value of the hypothesis:
<math>
\begin{align}
h(x^{(i)}) &=
\frac{1}{ \sum_{j=1}^{k}{e^{ \theta_j^T x^{(i)} }} }
\begin{bmatrix}
e^{ \theta_1^T x^{(i)} } \\
e^{ \theta_2^T x^{(i)} } \\
\vdots \\
e^{ \theta_k^T x^{(i)} } \\
\end{bmatrix} \\
&=
\frac{ e^{-\alpha} }{ e^{-\alpha} \sum_{j=1}^{k}{e^{ \theta_j^T x^{(i)} }} }
\begin{bmatrix}
e^{ \theta_1^T x^{(i)} } \\
e^{ \theta_2^T x^{(i)} } \\
\vdots \\
e^{ \theta_k^T x^{(i)} } \\
\end{bmatrix} \\
&=
\frac{ 1 }{ \sum_{j=1}^{k}{e^{ \theta_j^T x^{(i)} - \alpha }} }
\begin{bmatrix}
e^{ \theta_1^T x^{(i)} - \alpha } \\
e^{ \theta_2^T x^{(i)} - \alpha } \\
\vdots \\
e^{ \theta_k^T x^{(i)} - \alpha } \\
\end{bmatrix}
\end{align}
</math>
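
In code, this trick amounts to subtracting the column-wise maximum before exponentiating. The following hedged sketch then assembles the cost (including weight decay) and gradient from the normalized probabilities; names such as <tt>groundTruth</tt>, <tt>lambda</tt>, and <tt>thetagrad</tt> are our assumptions, following the equations above rather than a prescribed implementation:

<pre>
% Illustrative sketch: labels is a vector of class indices in 1..k,
% data is n-by-m, theta is k-by-n, lambda is the weight decay parameter.
m = size(data, 2);
groundTruth = full(sparse(labels, 1:m, 1));   % k-by-m indicator matrix
M = theta * data;                             % k-by-m
M = bsxfun(@minus, M, max(M, [], 1));         % subtract alpha = column max
expM = exp(M);
h = bsxfun(@rdivide, expM, sum(expM, 1));     % class probabilities, as before
cost = -sum(sum(groundTruth .* log(h))) / m ...
       + (lambda / 2) * sum(theta(:) .^ 2);   % log-likelihood term + weight decay
thetagrad = -(groundTruth - h) * data' / m + lambda * theta;
</pre>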
