Sparse Coding (稀疏编码)
From Ufldl
:<math>\begin{align}
\mathbf{\phi}^{*'}=\text{argmax}_{\mathbf{\phi}} \left\langle \max_{\mathbf{a}} \log\left(P(\mathbf{x} \mid \mathbf{\phi},\mathbf{a})P(\mathbf{a})\right) \right\rangle
\end{align}</math>

As before, we can increase the estimated probability by scaling down <math>a_i</math> and scaling up <math>\mathbf{\phi}</math> by the same factor: the reconstruction is unchanged, while <math>P(a_i)</math> grows because it peaks at zero. We therefore impose a norm constraint on the features <math>\mathbf{\phi}</math> to prevent this.

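[Initial translation] As stated earlier, the estimated probability is increased by shrinking <math>a_i</math> and enlarging <math>\mathbf{\phi}</math> (because <math>P(a_i)</math> attains its peak at zero); therefore, a norm constraint is added on the features <math>\mathbf{\phi}</math> to avoid this possibility.

[First review] As before, we can increase the estimated probability by decreasing <math>a_i</math> or increasing <math>\mathbf{\phi}</math> (because <math>P(a_i)</math> rises steeply near zero). We therefore need to place a constraint on the feature vectors <math>\mathbf{\phi}</math> to prevent this from happening.
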
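The scaling argument can be checked numerically. The following is a minimal NumPy sketch, not part of the original tutorial: it assumes the sparsity cost <math>S(a_i) = |a_i|</math>, stores the basis vectors <math>\mathbf{\phi}_i</math> as the columns of a hypothetical array `phi`, and uses arbitrary sizes. It illustrates that dividing the coefficients by a factor `c` while multiplying the features by `c` leaves the reconstruction unchanged but shrinks the penalty, and that normalizing each <math>\mathbf{\phi}_i</math> removes this freedom.

```python
import numpy as np

def l1_penalty(a):
    """Sparsity cost sum_i S(a_i) with S(a_i) = |a_i| (an assumed choice of S)."""
    return np.sum(np.abs(a))

def normalize_columns(phi):
    """Rescale each basis vector (column) to unit L2 norm."""
    return phi / np.linalg.norm(phi, axis=0, keepdims=True)

rng = np.random.default_rng(0)
phi = rng.normal(size=(8, 4))   # hypothetical: 4 basis vectors of dimension 8
a = rng.normal(size=4)          # coefficients a_i for a single input

c = 10.0                        # rescaling factor
phi_scaled = phi * c            # scale the features up
a_scaled = a / c                # scale the coefficients down

# The reconstruction sum_i a_i * phi_i is identical in both cases...
same_reconstruction = np.allclose(phi @ a, phi_scaled @ a_scaled)

# ...but the sparsity penalty has shrunk by the factor c.
penalty_ratio = l1_penalty(a_scaled) / l1_penalty(a)

# With unit-norm features, scaling phi up is no longer allowed.
phi_unit = normalize_columns(phi_scaled)
```

Here `same_reconstruction` is true and `penalty_ratio` equals `1 / c`, which is why the features need a norm constraint before the penalty on <math>a_i</math> means anything.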
Finally, we can recover our original cost function by defining the energy function of this linear generative model:

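[Initial translation] Finally, the original cost function is covered by defining the energy function of the linear generative model.

[First review] Finally, we can restate the original cost function by defining the energy function of a linear generative model as:
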
:<math>\begin{array}{rl}
E\left( \mathbf{x} , \mathbf{a} \mid \mathbf{\phi} \right) & := -\log \left( P(\mathbf{x}\mid \mathbf{\phi},\mathbf{a})P(\mathbf{a}) \right) \\
&= \sum_{j=1}^{m} \left|\left| \mathbf{x}^{(j)} - \sum_{i=1}^k a^{(j)}_i \mathbf{\phi}_{i}\right|\right|^{2} + \lambda \sum_{i=1}^{k}S(a^{(j)}_i)
\end{array}</math>

where <math>\lambda = 2\sigma^2\beta</math> (obtained by multiplying the negative log-probability through by <math>2\sigma^2</math>) and constants that do not affect the minimizer have been dropped. Since maximizing the log-likelihood is equivalent to minimizing the energy function, we recover the original optimization problem:

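[Initial translation] Here, <math>\lambda = 2\sigma^2\beta</math> and the irrelevant constants have been hidden. Because maximizing the log-likelihood is equivalent to minimizing the energy function, the original optimization problem is covered:

[First review] Here, <math>\lambda = 2\sigma^2\beta</math>, and constants of little relevance have been hidden. Because maximizing the log-likelihood function is equivalent to minimizing the energy function, we restate the original optimization problem as:
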
:<math>\begin{align}
\mathbf{\phi}^{*},\mathbf{a}^{*}=\text{argmin}_{\mathbf{\phi},\mathbf{a}} \sum_{j=1}^{m} \left|\left| \mathbf{x}^{(j)} - \sum_{i=1}^k a^{(j)}_i \mathbf{\phi}_{i}\right|\right|^{2} + \lambda \sum_{i=1}^{k}S(a^{(j)}_i)
\end{align}</math>
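
To make the objective concrete, here is a rough NumPy sketch of evaluating it and taking descent steps. It rests on several assumptions that the text above does not fix: the sparsity cost is <math>S(a) = |a|</math>, the penalty is summed over all examples <math>j</math>, and the norm constraint is <math>||\mathbf{\phi}_i|| \le 1</math>. The update scheme (a proximal-gradient step on <math>\mathbf{a}</math> followed by a projected-gradient step on <math>\mathbf{\phi}</math>) is one common choice swapped in for illustration, not necessarily the method this tutorial uses, and all function names are hypothetical.

```python
import numpy as np

def objective(X, A, Phi, lam):
    """Squared reconstruction error plus lam * sum |a| (S(a) = |a| is assumed).

    X: (n, m) inputs as columns, A: (k, m) coefficients, Phi: (n, k) basis vectors.
    """
    return np.sum((X - Phi @ A) ** 2) + lam * np.sum(np.abs(A))

def soft_threshold(Z, t):
    """Elementwise proximal operator of t * |.|."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def project_columns(Phi):
    """Shrink any basis vector with norm > 1 back onto the unit ball."""
    norms = np.linalg.norm(Phi, axis=0, keepdims=True)
    return Phi / np.maximum(norms, 1.0)

def alternating_step(X, A, Phi, lam):
    """One proximal-gradient update of A, then one projected-gradient update of Phi."""
    # Lipschitz constant of the gradient of ||X - Phi A||^2 with respect to A.
    L_a = 2.0 * np.linalg.norm(Phi, 2) ** 2 + 1e-12
    grad_A = 2.0 * Phi.T @ (Phi @ A - X)
    A = soft_threshold(A - grad_A / L_a, lam / L_a)

    # Same idea for Phi, holding the updated A fixed.
    L_phi = 2.0 * np.linalg.norm(A, 2) ** 2 + 1e-12
    grad_Phi = 2.0 * (Phi @ A - X) @ A.T
    Phi = project_columns(Phi - grad_Phi / L_phi)
    return A, Phi

# Usage on synthetic data (sizes and lam are arbitrary illustrative values).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 20))
Phi = project_columns(rng.normal(size=(6, 4)))  # start from a feasible Phi
A = np.zeros((4, 20))
lam = 0.1

history = [objective(X, A, Phi, lam)]
for _ in range(50):
    A, Phi = alternating_step(X, A, Phi, lam)
    history.append(objective(X, A, Phi, lam))
```

With step sizes set from the Lipschitz constants and a feasible starting <math>\mathbf{\phi}</math>, each half-step should not increase the objective, so `history` is expected to be non-increasing. The problem is not jointly convex in <math>\mathbf{\phi}</math> and <math>\mathbf{a}</math>, so this only finds a local solution.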