Autoencoders and Sparsity

@@ Line 194: / Line 194: @@
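 [First review]
+To achieve this, we add an extra penalty term to our optimization objective that penalizes <math>\textstyle \hat\rho_j</math> for deviating significantly from <math>\textstyle \rho</math>. Many choices of penalty term give reasonable results. Here we choose the following: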
+:<math>\begin{align}
+\sum_{j=1}^{s_2} \rho \log \frac{\rho}{\hat\rho_j} + (1-\rho) \log \frac{1-\rho}{1-\hat\rho_j}.
+\end{align}</math>
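+Here, <math>\textstyle s_2</math> is the number of neurons in the hidden layer, and the index <math>\textstyle j</math> runs over the hidden units of the network. If you are familiar with KL divergence, this penalty term is based on it, and the expression above can also be written as: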
+:<math>\begin{align}
+\sum_{j=1}^{s_2} {\rm KL}(\rho || \hat\rho_j),
+\end{align}</math>
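+where <math>\textstyle {\rm KL}(\rho || \hat\rho_j)
+ = \rho \log \frac{\rho}{\hat\rho_j} + (1-\rho) \log \frac{1-\rho}{1-\hat\rho_j}</math> is the Kullback-Leibler (KL) divergence between a Bernoulli random variable with mean <math>\textstyle \rho</math> and a Bernoulli random variable with mean <math>\textstyle \hat\rho_j</math>. KL divergence is a standard function for measuring how different two distributions are. (If you have not seen KL divergence before, do not worry; everything you need to know about it is contained in this tutorial.)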
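As an illustration of the penalty above, here is a minimal Python sketch that evaluates the KL term for one hidden unit and sums it over all hidden units. The function names `kl_bernoulli` and `sparsity_penalty`, the target value 0.05, and the sample average activations are assumptions made for this example only; they do not come from the tutorial. The sketch also assumes that every value lies strictly between 0 and 1, since the logarithms are undefined otherwise.

```python
import math


def kl_bernoulli(rho, rho_hat):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat).

    Assumes 0 < rho < 1 and 0 < rho_hat < 1.
    """
    return (rho * math.log(rho / rho_hat)
            + (1 - rho) * math.log((1 - rho) / (1 - rho_hat)))


def sparsity_penalty(rho, rho_hats):
    """Sum of KL(rho || rho_hat_j) over the hidden units j = 1..s_2."""
    return sum(kl_bernoulli(rho, rho_hat_j) for rho_hat_j in rho_hats)


# Hypothetical values chosen only for illustration.
rho = 0.05                     # target average activation
rho_hats = [0.05, 0.10, 0.50]  # average activations of three hidden units

penalty = sparsity_penalty(rho, rho_hats)
print(penalty)
```

The first unit matches the target exactly and contributes nothing, while the unit with average activation 0.50 contributes far more than the one at 0.10, so the penalty grows as <math>\textstyle \hat\rho_j</math> moves away from <math>\textstyle \rho</math>.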
 [Second review]

Revision as of 13:26, 12 March 2013