Sparse Coding: Autoencoder Interpretation
(where <math>\sqrt{s^2 + \epsilon}</math> is shorthand for <math>\sum_k{\sqrt{s_k^2 + \epsilon}}</math>)
This objective function can then be optimized iteratively, using the following procedure (a code sketch is given after the list):
<ol>
<li>Initialize <math>A</math> randomly
<li>Repeat until convergence
<ol>
<li>Find <math>s</math> that minimizes <math>J(A, s)</math> for the <math>A</math> found in the previous step
<li>Find <math>A</math> that minimizes <math>J(A, s)</math> for the <math>s</math> found in the previous step
</ol>
</ol>
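The following is a minimal sketch of this procedure in Python/NumPy, assuming a data matrix <math>X</math> with one example per column; the hyperparameters (lam, gamma, epsilon, eta) and iteration counts are illustrative placeholders, not values from this page. The <math>s</math>-step uses plain gradient descent on the smoothed objective (in practice an off-the-shelf optimizer such as L-BFGS would typically be used instead), and the <math>A</math>-step uses the closed-form solution discussed below.

<pre>
import numpy as np

def sparse_coding(X, k, lam=5e-5, gamma=1e-2, epsilon=1e-5,
                  n_outer=50, n_inner=100, eta=1e-2):
    # X: (n, m) data matrix, one example per column; k: number of basis vectors.
    # lam, gamma, epsilon, eta and the iteration counts are illustrative choices.
    n, m = X.shape
    A = np.random.randn(n, k)       # step 1: initialize A randomly
    S = np.zeros((k, m))            # feature matrix, one column of s per example
    for _ in range(n_outer):        # step 2: repeat until convergence
        # step 2a: minimize J(A, s) over s with A fixed, by gradient descent
        # on ||As - x||^2 + lam * sum_k sqrt(s_k^2 + epsilon)
        for _ in range(n_inner):
            grad_S = 2 * A.T @ (A @ S - X) + lam * S / np.sqrt(S**2 + epsilon)
            S -= eta * grad_S
        # step 2b: minimize J(A, s) over A with s fixed, via the closed-form
        # solution A = X S^T (S S^T + gamma I)^{-1} derived below
        A = X @ S.T @ np.linalg.inv(S @ S.T + gamma * np.eye(k))
    return A, S
</pre>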
+ | |||
+ | Observe that with our modified objective function, the objective function <math>J(A, s)</math> given <math>s</math>, that is <math>J(A; s) = \lVert As - x \rVert_2^2 + \gamma \lVert A \rVert_2^2</math> (the L1 term in <math>s</math> can be omitted since it is not a function of <math>A</math>) is simply a quadratic function of <math>A</math>, and hence has an easily derivable analytic solution in <math>A</math>. A quick way to derive this solution would be to use matrix calculus - some pages about matrix calculus can be found in the [[useful links]] section. | ||
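As a quick sketch of that derivation (writing <math>X</math> for the matrix whose columns are the examples and <math>S</math> for the matrix whose columns are the corresponding feature vectors): setting the gradient <math>\nabla_A J = 2(AS - X)S^T + 2\gamma A</math> to zero and solving for <math>A</math> yields <math>A = XS^T(SS^T + \gamma I)^{-1}</math>, which is the update used in step 2b of the code sketch above.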
+ | |||
+ | Optimizing for this objective function using the iterative method as above, will yield features (the basis vectors of <math>A</math>) similar to those learned using the sparse autoencoder. For more practical tips on implementing sparse coding, you may wish to refer to [[Exercise:Sparse Coding | the sparse coding exercise]]. For assistance with deriving the gradients, you may wish to refer to [[Deriving gradients using the backpropagation idea]]. | ||
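As a hint for those gradient derivations (a sketch in the notation above, with <math>\lambda</math> denoting the sparsity penalty weight): differentiating the <math>s</math>-objective <math>J(s; A) = \lVert As - x \rVert_2^2 + \lambda \sum_k \sqrt{s_k^2 + \epsilon}</math> gives <math>\nabla_s J = 2A^T(As - x) + \lambda \frac{s}{\sqrt{s^2 + \epsilon}}</math>, with the fraction applied element-wise; this is the gradient used in the inner loop of the code sketch above.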
== Topographic sparse coding ==