Sparse Coding: Autoencoder Interpretation

From Ufldl

Jump to: navigation, search

@@ Line 37: / Line 37: @@
 (note that the third term, <math>\lVert A \rVert_2^2</math> is simply the sum of squares of the entries of A, or <math>\sum_r{\sum_c{A_{rc}^2}}</math>)
-This objective function presents one last problem - the L1 norm is not differentiable at 0, and hence poses a problem for gradient-based methods. While the problem can be solved using other non-gradient descent-based methods, we will "smooth out" the L1 norm using an approximation which will allow us to use gradient descent. To "smooth out" the L1 norm, we use <math>\sqrt{x + \epsilon}</math> in place of <math>\left| x \right|</math>, where <math>\epsilon</math> is a "smoothing parameter" which can also be interpreted as a sort of "sparsity parameter" (to see this, observe that when <math>\epsilon</math> is large compared to <math>x</math>, the <math>x + \epsilon</math> is dominated by <math>\epsilon</math>, and taking the square root yields approximately <math>\sqrt{\epsilon}</math>). This "smoothing" will come in handy later when considering topographic sparse coding below.
+This objective function presents one last problem - the L1 norm is not differentiable at 0, and hence poses a problem for gradient-based methods. While the problem can be solved using other non-gradient descent-based methods, we will "smooth out" the L1 norm using an approximation which will allow us to use gradient descent. To "smooth out" the L1 norm, we use <math>\sqrt{x^2 + \epsilon}</math> in place of <math>\left| x \right|</math>, where <math>\epsilon</math> is a "smoothing parameter" which can also be interpreted as a sort of "sparsity parameter" (to see this, observe that when <math>\epsilon</math> is large compared to <math>x</math>, the <math>x + \epsilon</math> is dominated by <math>\epsilon</math>, and taking the square root yields approximately <math>\sqrt{\epsilon}</math>). This "smoothing" will come in handy later when considering topographic sparse coding below.
 Our final objective function is hence:
@@ Line 134: / Line 134: @@
 With this method, you should be able to reach a good local optima relatively quickly.
+{{Languages|稀疏编码自编码表达|中文}}

Sparse Coding: Autoencoder Interpretation

From Ufldl

Latest revision as of 02:49, 19 April 2013

Views

Personal tools

ufldl resources

wiki

Search

Toolbox