Fine-tuning Stacked AEs

@@ Line 36: / Line 36: @@
 Note: While one could consider the softmax classifier as an additional layer, the derivation above does not. Specifically, we consider the "last layer" of the network to be the features that go into the softmax classifier. Therefore, the derivatives (in Step 2) are computed using <math>\delta^{(n_l)} = - (\nabla_{a^{(n_l)}}J) \bullet f'(z^{(n_l)})</math>, where <math>\nabla J = \theta^T(I-P)</math>.
 }}
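The note's formula for <math>\delta^{(n_l)}</math> can be sketched in NumPy as below. This is only an illustration under stated assumptions, not the wiki's reference implementation: it assumes a sigmoid activation <math>f</math>, takes <math>I</math> to be a one-hot label matrix and <math>P</math> the softmax probabilities (one column per example), and uses an unnormalized cross-entropy cost with no weight decay. The function names (`sigmoid`, `softmax_probs`, `softmax_cost`, `last_layer_delta`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    # Assumed activation f; its derivative is f(z) * (1 - f(z)).
    return 1.0 / (1.0 + np.exp(-z))

def softmax_probs(theta, a):
    # theta: (num_classes, num_features); a: (num_features, num_examples).
    scores = theta @ a
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    e = np.exp(scores)
    return e / e.sum(axis=0, keepdims=True)

def softmax_cost(theta, z, labels_onehot):
    # Assumed cost: unnormalized cross-entropy, no weight decay term.
    p = softmax_probs(theta, sigmoid(z))
    return -np.sum(labels_onehot * np.log(p))

def last_layer_delta(theta, z, labels_onehot):
    # Error term for the last feature layer (the one feeding the softmax).
    a = sigmoid(z)
    p = softmax_probs(theta, a)
    grad_term = theta.T @ (labels_onehot - p)  # the note's theta^T (I - P)
    # delta = -(theta^T (I - P)) * f'(z), elementwise product with f'(z).
    return -grad_term * a * (1.0 - a)
```

Under these assumptions, `last_layer_delta` equals the derivative of `softmax_cost` with respect to `z`, so it can be checked against a finite-difference gradient before being propagated back through the earlier layers.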
+{{CNN}}
+{{Languages|微调多层自编码算法|中文}}

Latest revision as of 04:04, 8 April 2013