Softmax Regression

By minimizing <math>\textstyle J(\theta)</math>, we obtain a working softmax regression model.
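As a concrete illustration, here is a minimal numpy sketch of minimizing <math>\textstyle J(\theta)</math> by batch gradient descent. This is our own illustration, not the tutorial's starter code: the function name <code>softmax_cost_grad</code> and the synthetic data are assumptions, and the weight-decay term from the preceding section is omitted for brevity.

<pre>
import numpy as np

def softmax_cost_grad(theta, X, y, k):
    """Softmax cost J(theta) and its gradient.
    theta: (k, n) parameter matrix; X: (m, n) inputs; y: (m,) labels in 0..k-1."""
    m = X.shape[0]
    scores = X @ theta.T                         # theta_j^T x^(i), shape (m, k)
    scores -= scores.max(axis=1, keepdims=True)  # shift for numerical stability
    p = np.exp(scores)
    p /= p.sum(axis=1, keepdims=True)            # class probabilities
    onehot = np.eye(k)[y]                        # indicator 1{y^(i) = j}
    cost = -np.sum(onehot * np.log(p)) / m
    grad = -(onehot - p).T @ X / m               # (k, n), same shape as theta
    return cost, grad

# A few steps of batch gradient descent on synthetic data:
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.integers(0, 2, size=100)
theta = np.zeros((2, 3))
for _ in range(200):
    cost, grad = softmax_cost_grad(theta, X, y, k=2)
    theta -= 0.5 * grad
</pre>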
==Relationship to Logistic Regression==

In the special case where <math>\textstyle k = 2</math>, softmax regression reduces to logistic regression. This shows that softmax regression is a generalization of logistic regression. Concretely, when <math>\textstyle k = 2</math>, the softmax regression hypothesis outputs:
<math>
\begin{align}
h_\theta(x) &=
\frac{1}{ e^{\theta_1^T x} + e^{\theta_2^T x} }
\begin{bmatrix}
e^{\theta_1^T x} \\
e^{\theta_2^T x}
\end{bmatrix}
\end{align}
</math>
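A minimal numpy sketch of this two-class hypothesis (our own illustration, not part of the tutorial; the name <code>h_softmax</code> and the max-shift for numerical stability are assumptions):

<pre>
import numpy as np

def h_softmax(theta1, theta2, x):
    """Two-class softmax hypothesis: the pair (P(y=1|x), P(y=2|x))."""
    z = np.array([theta1 @ x, theta2 @ x])  # theta_1^T x and theta_2^T x
    e = np.exp(z - z.max())                 # shift for numerical stability
    return e / e.sum()                      # entries are positive and sum to 1
</pre>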
Taking advantage of the fact that this hypothesis is overparameterized, we set <math>\textstyle \psi = \theta_1</math> and subtract the vector <math>\textstyle \theta_1</math> from each of the two parameter vectors, giving us:
<math>
\begin{align}
h(x) &=
\frac{1}{ e^{\vec{0}^T x} + e^{(\theta_2 - \theta_1)^T x} }
\begin{bmatrix}
e^{\vec{0}^T x} \\
e^{(\theta_2 - \theta_1)^T x}
\end{bmatrix} \\
&=
\begin{bmatrix}
\frac{1}{ 1 + e^{(\theta_2 - \theta_1)^T x} } \\
\frac{e^{(\theta_2 - \theta_1)^T x}}{ 1 + e^{(\theta_2 - \theta_1)^T x} }
\end{bmatrix} \\
&=
\begin{bmatrix}
\frac{1}{ 1 + e^{(\theta_2 - \theta_1)^T x} } \\
1 - \frac{1}{ 1 + e^{(\theta_2 - \theta_1)^T x} }
\end{bmatrix}
\end{align}
</math>
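The first step above relies on the redundancy of the parameterization: subtracting the same vector <math>\textstyle \psi</math> from every <math>\textstyle \theta_j</math> leaves <math>\textstyle h(x)</math> unchanged. A self-contained numpy check of this (an illustration of ours, with made-up parameter values):

<pre>
import numpy as np

def h_softmax(theta1, theta2, x):
    e = np.exp([theta1 @ x, theta2 @ x])
    return e / e.sum()

theta1 = np.array([0.5, -1.0])
theta2 = np.array([2.0, 0.3])
x = np.array([1.0, 2.0])

# Subtracting psi = theta_1 from both parameter vectors leaves h(x) unchanged:
# the new vectors are 0 and theta_2 - theta_1, as in the derivation above.
assert np.allclose(h_softmax(theta1, theta2, x),
                   h_softmax(theta1 - theta1, theta2 - theta1, x))
</pre>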
Thus, replacing <math>\textstyle \theta_2 - \theta_1</math> with a single parameter vector <math>\textstyle \theta'</math>, we find that softmax regression predicts the probability of one of the classes as <math>\textstyle \frac{1}{ 1 + e^{ (\theta')^T x } }</math> and the probability of the other class as <math>\textstyle 1 - \frac{1}{ 1 + e^{ (\theta')^T x } }</math>, the same as logistic regression.
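To make the equivalence concrete, here is a short numpy check (our own sketch, with made-up parameter values) that the two-class softmax probabilities match the logistic sigmoid pair under <math>\textstyle \theta' = \theta_2 - \theta_1</math>:

<pre>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta1 = np.array([0.5, -1.0])
theta2 = np.array([2.0, 0.3])
x = np.array([1.0, 2.0])
theta_p = theta2 - theta1                # theta' = theta_2 - theta_1

e = np.exp([0.0, theta_p @ x])           # [e^{0^T x}, e^{theta'^T x}]
p_softmax = e / e.sum()                  # two-class softmax probabilities

# 1/(1 + e^{theta'^T x}) equals 1 - sigmoid(theta'^T x), and the other entry
# equals sigmoid(theta'^T x): exactly the logistic regression pair.
assert np.allclose(p_softmax,
                   [1.0 - sigmoid(theta_p @ x), sigmoid(theta_p @ x)])
</pre>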
==Softmax Regression vs. k Binary Classifiers==
