Deriving gradients using the backpropagation idea
== Introduction ==
In the section on the backpropagation algorithm, you were briefly introduced to backpropagation as a means of deriving gradients...
<li><math>a^{(l)}_i</math> is the activation of the <math>i</math>th unit in the <math>l</math>th layer
<li><math>A \bullet B</math> is the Hadamard or element-wise product, which for <math>r \times c</math> matrices <math>A</math> and <math>B</math> yields the <math>r \times c</math> matrix <math>C = A \bullet B</math> such that <math>C_{i, j} = A_{i, j} \cdot B_{i, j}</math>
<li><math>f^{(l)}</math> is the activation function for units in the <math>l</math>th layer
</ul>
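In NumPy, the Hadamard product defined above is simply the `*` operator on arrays of equal shape. A quick sketch (the matrices here are made-up values for illustration):

```python
import numpy as np

# Two arbitrary 2 x 2 matrices (made-up values for illustration)
A = np.array([[1., 2.],
              [3., 4.]])
B = np.array([[10., 20.],
              [30., 40.]])

# Hadamard (element-wise) product: C[i, j] = A[i, j] * B[i, j]
C = A * B
```

This is the product that shows up in the backpropagation error terms, where an error vector is multiplied element-wise by the derivative of the activation function.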
</ol>
[[File:Backpropagation Method Example 1.png | 400px]]
=== Example 2: Smoothed topographic L1 sparsity penalty in sparse coding ===
We would like to find <math>\nabla_s \sum{ \sqrt{Vss^T + \epsilon} }</math>. As above, let's view this term as an instantiation of a neural network:
[[File:Backpropagation Method Example 2.png | 600px]]
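As a concrete check on this kind of derivation, here is a sketch (my own illustration, not from the text) that implements the penalty for a single column vector <math>s</math>, computes the gradient by backpropagating through the two appearances of <math>s</math> in <math>ss^T</math>, and compares the result against finite differences. The grouping matrix <math>V</math>, the value of <math>\epsilon</math>, and the restriction to a non-negative vector <math>s</math> are all assumptions made for this example:

```python
import numpy as np

EPS = 1e-2  # smoothing constant epsilon (value chosen for illustration)

def penalty(V, s):
    """Smoothed topographic L1 penalty sum(sqrt(V s s^T + eps)),
    with s taken to be a single column vector (an assumption of this sketch)."""
    M = V @ np.outer(s, s)
    return np.sum(np.sqrt(M + EPS))

def penalty_grad(V, s):
    """Gradient w.r.t. s via the chain rule: with M = V s s^T and
    G = 1 / (2 sqrt(M + eps)) the element-wise derivative of the output
    with respect to M, the two appearances of s in s s^T contribute
    the terms V^T G s and G^T V s."""
    M = V @ np.outer(s, s)
    G = 1.0 / (2.0 * np.sqrt(M + EPS))
    return V.T @ G @ s + G.T @ V @ s

# Finite-difference check on a small made-up grouping matrix and feature vector.
V = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])
s = np.array([0.5, 1.0, 0.2])  # kept non-negative so the sqrt stays real

analytic = penalty_grad(V, s)
numeric = np.zeros_like(s)
h = 1e-6
for k in range(s.size):
    e = np.zeros_like(s)
    e[k] = h
    numeric[k] = (penalty(V, s + e) - penalty(V, s - e)) / (2 * h)
```

Agreement between `analytic` and `numeric` is exactly the kind of sanity check worth running whenever a gradient is derived by the backpropagation idea.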