Logistic Regression Vectorization Example
Consider training a logistic regression model using batch gradient ascent. Suppose our hypothesis is :<math>\begin{align} h_\theta(x) = \frac{1}{1+\exp(-\theta^Tx)}, \end{align}</math> where (following CS229 notational convention) we let <math>\textstyle x_0=1</math>, so that <math>\textstyle x \in \Re^{n+1}</math> and <math>\textstyle \theta \in \Re^{n+1}</math>, and <math>\textstyle \theta_0</math> is our intercept term. We have a training set <math>\textstyle \{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}</math> of <math>\textstyle m</math> examples, and the batch gradient ascent update rule is <math>\textstyle \theta := \theta + \alpha \nabla_\theta \ell(\theta)</math>, where <math>\textstyle \ell(\theta)</math> is the log likelihood and <math>\textstyle \nabla_\theta \ell(\theta)</math> is its derivative. [Note: Most of the notation below follows that defined in the class CS229: Machine Learning. Please see Lecture notes #1 from http://cs229.stanford.edu/ for details.] We thus need to compute the gradient: :<math>\begin{align} \nabla_\theta \ell(\theta) = \sum_{i=1}^m \left(y^{(i)} - h_\theta(x^{(i)}) \right) x^{(i)}. \end{align}</math> Suppose that the Matlab/Octave variable <tt>x</tt> is the design matrix, so that <tt>x(i,:)'</tt> is the <math>\textstyle i</math>-th training example <math>\textstyle x^{(i)}</math> and <tt>x(i,j)</tt> is <math>\textstyle x^{(i)}_j</math>. Further, suppose the Matlab/Octave variable <tt>y</tt> is a vector of the labels in the training set, so that <tt>y(i)</tt> is <math>\textstyle y^{(i)} \in \{0,1\}</math>. Here's a truly horrible, extremely slow implementation:

<syntaxhighlight lang="matlab">
grad = zeros(n+1,1);
for i=1:m,
  h = sigmoid(theta'*x(i,:)');
  temp = y(i) - h;
  for j=1:n+1,
    grad(j) = grad(j) + temp * x(i,j);
  end;
end;
</syntaxhighlight>

The two nested for-loops make this very slow.
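If it helps to check the gradient formula outside Octave, here is a minimal NumPy sketch of the same double-loop computation. The names <tt>sigmoid</tt>, <tt>X</tt>, <tt>y</tt>, and <tt>theta</tt> are illustrative choices, not part of the original code; rows of <tt>X</tt> play the role of <tt>x(i,:)</tt>:

```python
import numpy as np

def sigmoid(z):
    # Logistic function: h_theta(x) = 1 / (1 + exp(-theta' * x)).
    return 1.0 / (1.0 + np.exp(-z))

def grad_loops(X, y, theta):
    # Accumulate the gradient with two nested loops, one over the m
    # examples and one over the n+1 components, mirroring the slow
    # Octave version above.
    m, n1 = X.shape
    grad = np.zeros(n1)
    for i in range(m):
        h = sigmoid(theta @ X[i])          # h_theta(x^(i))
        for j in range(n1):
            grad[j] += (y[i] - h) * X[i, j]
    return grad
```

The result should agree, up to floating-point error, with the closed-form matrix expression derived later on this page.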
Here's a more typical implementation that partially vectorizes the algorithm and gets better performance:

<syntaxhighlight lang="matlab">
grad = zeros(n+1,1);
for i=1:m,
  grad = grad + (y(i) - sigmoid(theta'*x(i,:)'))* x(i,:)';
end;
</syntaxhighlight>

However, it turns out to be possible to vectorize this even further. In Matlab/Octave, eliminating the remaining for-loop speeds up the algorithm. In particular, the entire gradient can be computed in a single matrix expression:

<syntaxhighlight lang="matlab">
grad = x' * (y - sigmoid(x*theta));
</syntaxhighlight>
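As a sanity check of the algebra, a NumPy sketch (with illustrative names, not from the original Octave code) shows that the single-expression form agrees with the per-example loop:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_partial(X, y, theta):
    # Partially vectorized: one loop over the m examples, with each
    # example's contribution added as a whole vector.
    grad = np.zeros(X.shape[1])
    for i in range(X.shape[0]):
        grad += (y[i] - sigmoid(theta @ X[i])) * X[i]
    return grad

def grad_vec(X, y, theta):
    # Fully vectorized: the NumPy analogue of
    # grad = x' * (y - sigmoid(x*theta)) in Octave.
    return X.T @ (y - sigmoid(X @ theta))
```

The fully vectorized form replaces the loop with one matrix-vector product, which lets the underlying BLAS routines do the work instead of the interpreter.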