Linear Decoders
From Ufldl
While some datasets like MNIST fit well with this scaling of the output, this can sometimes be awkward to satisfy. For example, if one uses PCA whitening, the input is
no longer constrained to <math>[0,1]</math> and it's not clear what the best way is to scale the data to ensure it fits into the constrained range.
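
As a rough illustration of why whitened inputs break the <math>[0,1]</math> assumption, the following NumPy sketch whitens some synthetic data and compares value ranges before and after. The data, the variable names and the regularizer <code>epsilon</code> are assumptions made for this example only, not part of the tutorial's exercise code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 examples (rows) with 4 features, all inside [0, 1].
x = rng.uniform(0.0, 1.0, size=(500, 4))

# PCA whitening: center, rotate onto the eigenvectors of the covariance
# matrix, then rescale every component to roughly unit variance.
x_centered = x - x.mean(axis=0)
cov = x_centered.T @ x_centered / x.shape[0]
eigvals, eigvecs = np.linalg.eigh(cov)
epsilon = 1e-5  # small regularizer (assumed value) to avoid dividing by ~0
x_white = (x_centered @ eigvecs) / np.sqrt(eigvals + epsilon)

print("raw range:     ", x.min(), x.max())
print("whitened range:", x_white.min(), x_white.max())
```

Because the whitened components have zero mean and roughly unit variance, they take negative values and values larger than 1, which a sigmoid output layer cannot reproduce.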

== Linear Decoder ==
One easy fix for this problem is to set <math>a^{(3)} = z^{(3)}</math>. Formally, this is achieved by having the output
Because the hidden layer is using a sigmoid (or tanh) activation <math>f</math>, in the equation above <math>f'(\cdot)</math> should still be the
derivative of the sigmoid (or tanh) function.
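
The following NumPy sketch shows how the two error terms might look in code for a linear decoder. It assumes a plain squared-error reconstruction cost and leaves out any weight decay or sparsity penalty; the function name <code>linear_decoder_cost_and_grads</code> and the column-per-example layout are choices made for this example, not taken from the tutorial's exercise code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def linear_decoder_cost_and_grads(W1, b1, W2, b2, x):
    """Squared-error autoencoder with sigmoid hidden units and a linear output.

    x has shape (n_visible, m): one example per column.
    Simplification (assumption): no weight decay or sparsity penalty.
    """
    m = x.shape[1]

    # Forward pass.
    z2 = W1 @ x + b1[:, None]
    a2 = sigmoid(z2)            # hidden layer keeps the sigmoid
    z3 = W2 @ a2 + b2[:, None]
    a3 = z3                     # linear decoder: a^(3) = z^(3)

    cost = 0.5 * np.sum((a3 - x) ** 2) / m

    # Backward pass.
    delta3 = a3 - x                            # output f'(z^(3)) = 1
    delta2 = (W2.T @ delta3) * a2 * (1 - a2)   # sigmoid derivative here

    grads = {
        "W1": delta2 @ x.T / m,
        "b1": delta2.mean(axis=1),
        "W2": delta3 @ a2.T / m,
        "b2": delta3.mean(axis=1),
    }
    return cost, grads

# Usage with hypothetical sizes and inputs that are not limited to [0, 1].
rng = np.random.default_rng(0)
n_visible, n_hidden, m = 6, 3, 10
x = rng.normal(size=(n_visible, m))
W1 = 0.1 * rng.normal(size=(n_hidden, n_visible))
b1 = np.zeros(n_hidden)
W2 = 0.1 * rng.normal(size=(n_visible, n_hidden))
b2 = np.zeros(n_visible)

cost, grads = linear_decoder_cost_and_grads(W1, b1, W2, b2, x)
print("cost:", cost)
```

Only the output-layer error term changes relative to a sigmoid decoder: the factor <math>f'(z^{(3)})</math> becomes 1, while the hidden-layer term still multiplies by <math>a^{(2)}(1 - a^{(2)})</math>.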

{{Languages|线性解码器|中文}}