Softmax Regression
=== Weight Regularization ===
When using softmax regression in practice, it is important to use weight regularization. In particular, if there exists a linear separator that perfectly classifies all the data points, then the softmax objective is unbounded (given any <math>\theta</math> that separates the data perfectly, one can always scale <math>\theta</math> to be larger and obtain a better objective value). With weight regularization, one penalizes the weights for being large and thus avoids these degenerate situations.
Weight regularization is also important as it often results in models that generalize better. In particular, one can view weight regularization as placing a (Gaussian) prior on <math>\theta</math> so as to prefer <math>\theta</math> with smaller values.
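Concretely, one common way to regularize is to add a weight decay term to the cost function, where <math>\lambda > 0</math> is a weight decay parameter (the symbol <math>\lambda</math> and the index conventions below are illustrative, assuming <math>k</math> classes, <math>m</math> training examples, and <math>n+1</math> input features including the intercept). With this penalty, the regularized cost takes the form

<math>
J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{k} 1\left\{ y^{(i)} = j \right\} \log \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}} \right] + \frac{\lambda}{2} \sum_{i=1}^{k} \sum_{j=0}^{n} \theta_{ij}^2.
</math>

From the Bayesian viewpoint above, minimizing this penalized cost corresponds to MAP estimation under a zero-mean Gaussian prior on <math>\theta</math>.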
Minimizing <math>J(\theta)</math> now performs regularized softmax regression.
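As a rough illustration (not code from this tutorial), the following NumPy sketch computes a regularized softmax cost and its gradient; the function name <code>softmax_cost_and_grad</code>, the array shapes, and the hyperparameter <code>lam</code> are assumptions made for the example.

<pre>
import numpy as np

def softmax_cost_and_grad(theta, X, y, lam):
    # theta: (k, n) weight matrix, one row per class
    # X:     (m, n) design matrix, one example per row (add a bias column if desired)
    # y:     (m,)   integer class labels in {0, ..., k-1}
    # lam:   weight decay strength (the lambda above)
    m = X.shape[0]
    scores = X.dot(theta.T)                       # (m, k) class scores
    scores -= scores.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)     # softmax probabilities

    # Cross-entropy of the true classes plus the weight decay penalty
    cost = -np.mean(np.log(probs[np.arange(m), y])) + 0.5 * lam * np.sum(theta ** 2)

    # Gradient: (probs - indicator) averaged over examples, plus lam * theta
    indicator = np.zeros_like(probs)
    indicator[np.arange(m), y] = 1.0
    grad = (probs - indicator).T.dot(X) / m + lam * theta
    return cost, grad
</pre>

The returned cost and gradient can then be handed to an off-the-shelf optimizer (for example, <code>scipy.optimize.minimize</code>) to fit <math>\theta</math>.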
- | |||
== Parameterization ==