Softmax Regression
== Introduction ==
'''Softmax regression''', also known as '''multinomial logistic regression''', is a generalisation of logistic regression to problems where there are more than 2 class labels. Recall that in logistic regression, our hypothesis was of the form:

<math>\begin{align}
h_\theta(x) = \frac{1}{1+\exp(-\theta^Tx)}.
\end{align}</math>

We trained the logistic regression weights to optimize the log-likelihood of the dataset using <math> p(y|x) = h_\theta(x) </math>. In softmax regression, we are interested in multi-class problems where each example is assigned to one of <tt>K</tt> labels. One example of a multi-class classification problem is classifying digits on the MNIST dataset, where each example is given 1 of 10 possible labels (i.e., the digit 0, 1, ... or 9).

To extend the logistic regression framework, which outputs only a single probability value, we consider a hypothesis that outputs <tt>K</tt> values (summing to 1) that represent the predicted probability distribution over the classes.
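The contrast between the two hypotheses can be sketched in a few lines of NumPy. This is an illustrative sketch, not part of the tutorial's exercise code; the parameter values below are made up for the example, and the max-subtraction in <tt>softmax</tt> is a standard numerical-stability trick rather than something introduced in the text above.

```python
import numpy as np

def sigmoid(z):
    # Logistic hypothesis: h_theta(x) = 1 / (1 + exp(-theta^T x)),
    # a single probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Softmax hypothesis: turns K real-valued scores into K
    # probabilities that sum to 1. Subtracting the max score
    # before exponentiating avoids overflow and leaves the
    # result unchanged.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

# Hypothetical parameters and input for the binary (logistic) case.
theta = np.array([0.5, -1.0])
x = np.array([2.0, 1.0])
print(sigmoid(theta @ x))  # theta^T x = 0, so this prints 0.5

# Hypothetical scores for a K = 3 class problem.
scores = np.array([1.0, 2.0, 0.5])
p = softmax(scores)
print(p, p.sum())  # K probabilities; the sum is 1.0
```

Note that with <tt>K = 2</tt> the softmax hypothesis reduces to the logistic one, which is why softmax regression is called a generalisation of logistic regression.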
== Mathematical form ==