神经网络 (Neural Networks)

Revision as of 12:40, 8 March 2013 (view source)

Kandeng (Talk | contribs)


Revision as of 12:53, 8 March 2013 (view source)

Kandeng (Talk | contribs)


Line 55:

【Second review】Thus, the input-output mapping of this single "neuron" is in fact just a logistic regression.

-

【Original】Although these notes will use the sigmoid function, it is worth noting that another common choice for f is the hyperbolic tangent, or tanh, function:

+

【Original】Although these notes will use the sigmoid function, it is worth noting that another common choice for <math>f</math> is the hyperbolic tangent, or tanh, function:

+

:<math>

+

f(z) = \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}},

+

</math>

【First translation】Although we use the S-shaped function here, the hyperbolic tangent function, denoted tanh, can also be used:

+

:<math>

+

f(z) = \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}},

+

</math>

【First review】Although this tutorial series will use the Sigmoid function, another choice is the hyperbolic tangent function:

+

:<math>

+

f(z) = \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}},

+

</math>

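【Second review】Although this tutorial series will use the Sigmoid function, you can also choose the hyperbolic tangent function (tanh):

-

【Original】Here are plots of the sigmoid and tanh functions: (First-review note: here sigmoid and tanh are treated as distinct, so "sigmoid" is not a general term for S-shaped functions)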

+

:<math>

+

f(z) = \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}},

+

</math>

+

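【Original】Here are plots of the sigmoid and <math>\tanh</math> functions:

【First translation】Below are the plots of the S-shaped function and the hyperbolic tangent function:

-

【First review】The following are plots of the Sigmoid function and the hyperbolic tangent function:

+

【First review】The following are plots of the Sigmoid function and the hyperbolic tangent function: (First-review note: here sigmoid and tanh are treated as distinct, so "sigmoid" is not a general term for S-shaped functions)

【Second review】The following are plots of the Sigmoid function and the tanh function: (Second-review note: in the translation, since "Sigmoid" can be used to name one function, "tanh" can likewise be used to name the hyperbolic tangent function; after all, both are very specific, widely used functions)

-

【Original】The tanh(z) function is a rescaled version of the sigmoid, and its output range is [ − 1,1] instead of [0,1].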

+

<div align=center>

+

[[Image:Sigmoid_Function.png|400px|top|Sigmoid activation function.]]

+

[[Image:Tanh_Function.png|400px|top|Tanh activation function.]]

+

</div>

+

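【Original】The <math>\tanh(z)</math> function is a rescaled version of the sigmoid, and its output range is <math>[-1,1]</math> instead of <math>[0,1]</math>.

-

【First translation】tanh(z) is a variant of the S-shaped function; its output range is [ − 1,1] rather than [0,1].

+

【First translation】<math>\tanh(z)</math> is a variant of the S-shaped function; its output range is <math>[-1,1]</math> rather than <math>[0,1]</math>.

-

【First review】The tanh(z) function is a variant of the sigmoid function; its range of values is [－1，1] rather than [0，1].

+

【First review】The <math>\tanh(z)</math> function is a variant of the sigmoid function; its range of values is <math>[-1,1]</math> rather than <math>[0,1]</math>.

-

【Second review】The tanh(z) function is a variant of the sigmoid function; its range of values is [－1，1] rather than the sigmoid function's [0，1].

+

【Second review】The <math>\tanh(z)</math> function is a variant of the sigmoid function; its range of values is <math>[-1,1]</math> rather than the sigmoid function's <math>[0,1]</math>.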

-

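【Original】Note that unlike some other venues (including the OpenClassroom videos, and parts of CS229), we are not using the convention here of x0 = 1. Instead, the intercept term is handled separately by the parameter b.

+

【Original】Note that unlike some other venues (including the OpenClassroom videos, and parts of CS229), we are not using the convention here of <math>x_0=1</math>. Instead, the intercept term is handled separately by the parameter b.

-

【First translation】Unlike in other settings (in the open course videos of CS229), we no longer set x0 = 1. The intercept term is handled separately through the parameter b.

+

【First translation】Unlike in other settings (in the open course videos of CS229), we no longer set <math>x_0=1</math>. The intercept term is handled separately through the parameter <math>b</math>.

-

【First review】Note that, unlike elsewhere (including the open course videos and the CS229 lecture notes), here we do not set x0 = 1; instead, the intercept is represented by a separate parameter b.

+

【First review】Note that, unlike elsewhere (including the open course videos and the CS229 lecture notes), here we do not set <math>x_0=1</math>; instead, the intercept is represented by a separate parameter <math>b</math>.

-

【Second review】Note that, unlike elsewhere (including some open courses and Stanford University's CS229 course), here we no longer set x0 = 1; instead, a separate parameter b is used to represent it.

+

【Second review】Note that, unlike elsewhere (including some open courses and Stanford University's CS229 course), here we no longer set <math>x_0=1</math>; instead, a separate parameter <math>b</math> is used to represent it.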

【Original】Finally, one identity that'll be useful later: If f(z) = 1 / (1 + exp( − z)) is the sigmoid function, then its derivative is given by f'(z) = f(z)(1 − f(z)). (If f is the tanh function, then its derivative is given by f'(z) = 1 − (f(z))^2.) You can derive this yourself using the definition of the sigmoid (or tanh) function.

From Ufldl


@@ Line 55: / Line 55: @@
 【Second review】Thus, the input-output mapping of this single "neuron" is in fact just a logistic regression.
-【Original】Although these notes will use the sigmoid function, it is worth noting that another common choice for f is the hyperbolic tangent, or tanh, function:
+【Original】Although these notes will use the sigmoid function, it is worth noting that another common choice for <math>f</math> is the hyperbolic tangent, or tanh, function:
+:<math>
+f(z) = \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}},
+</math>
 【First translation】Although we use the S-shaped function here, the hyperbolic tangent function, denoted tanh, can also be used:
+:<math>
+f(z) = \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}},
+</math>
 【First review】Although this tutorial series will use the Sigmoid function, another choice is the hyperbolic tangent function:
+:<math>
+f(z) = \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}},
+</math>
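 【Second review】Although this tutorial series will use the Sigmoid function, you can also choose the hyperbolic tangent function (tanh):
-【Original】Here are plots of the sigmoid and tanh functions: (First-review note: here sigmoid and tanh are treated as distinct, so "sigmoid" is not a general term for S-shaped functions)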
+:<math>
+f(z) = \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}},
+</math>
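+【Original】Here are plots of the sigmoid and <math>\tanh</math> functions:
 【First translation】Below are the plots of the S-shaped function and the hyperbolic tangent function:
-【First review】The following are plots of the Sigmoid function and the hyperbolic tangent function:
+【First review】The following are plots of the Sigmoid function and the hyperbolic tangent function: (First-review note: here sigmoid and tanh are treated as distinct, so "sigmoid" is not a general term for S-shaped functions)
 【Second review】The following are plots of the Sigmoid function and the tanh function: (Second-review note: in the translation, since "Sigmoid" can be used to name one function, "tanh" can likewise be used to name the hyperbolic tangent function; after all, both are very specific, widely used functions)
-【Original】The tanh(z) function is a rescaled version of the sigmoid, and its output range is [ − 1,1] instead of [0,1].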
+<div align=center>
+[[Image:Sigmoid_Function.png|400px|top|Sigmoid activation function.]]
+[[Image:Tanh_Function.png|400px|top|Tanh activation function.]]
+</div>
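+【Original】The <math>\tanh(z)</math> function is a rescaled version of the sigmoid, and its output range is <math>[-1,1]</math> instead of <math>[0,1]</math>.
-【First translation】tanh(z) is a variant of the S-shaped function; its output range is [ − 1,1] rather than [0,1].
+【First translation】<math>\tanh(z)</math> is a variant of the S-shaped function; its output range is <math>[-1,1]</math> rather than <math>[0,1]</math>.
-【First review】The tanh(z) function is a variant of the sigmoid function; its range of values is [－1，1] rather than [0，1].
+【First review】The <math>\tanh(z)</math> function is a variant of the sigmoid function; its range of values is <math>[-1,1]</math> rather than <math>[0,1]</math>.
-【Second review】The tanh(z) function is a variant of the sigmoid function; its range of values is [－1，1] rather than the sigmoid function's [0，1].
+【Second review】The <math>\tanh(z)</math> function is a variant of the sigmoid function; its range of values is <math>[-1,1]</math> rather than the sigmoid function's <math>[0,1]</math>.
-【Original】Note that unlike some other venues (including the OpenClassroom videos, and parts of CS229), we are not using the convention here of x0 = 1. Instead, the intercept term is handled separately by the parameter b.
+【Original】Note that unlike some other venues (including the OpenClassroom videos, and parts of CS229), we are not using the convention here of <math>x_0=1</math>. Instead, the intercept term is handled separately by the parameter b.
-【First translation】Unlike in other settings (in the open course videos of CS229), we no longer set x0 = 1. The intercept term is handled separately through the parameter b.
+【First translation】Unlike in other settings (in the open course videos of CS229), we no longer set <math>x_0=1</math>. The intercept term is handled separately through the parameter <math>b</math>.
-【First review】Note that, unlike elsewhere (including the open course videos and the CS229 lecture notes), here we do not set x0 = 1; instead, the intercept is represented by a separate parameter b.
+【First review】Note that, unlike elsewhere (including the open course videos and the CS229 lecture notes), here we do not set <math>x_0=1</math>; instead, the intercept is represented by a separate parameter <math>b</math>.
-【Second review】Note that, unlike elsewhere (including some open courses and Stanford University's CS229 course), here we no longer set x0 = 1; instead, a separate parameter b is used to represent it.
+【Second review】Note that, unlike elsewhere (including some open courses and Stanford University's CS229 course), here we no longer set <math>x_0=1</math>; instead, a separate parameter <math>b</math> is used to represent it.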
 【Original】Finally, one identity that'll be useful later: If f(z) = 1 / (1 + exp( − z)) is the sigmoid function, then its derivative is given by f'(z) = f(z)(1 − f(z)). (If f is the tanh function, then its derivative is given by f'(z) = 1 − (f(z))^2.) You can derive this yourself using the definition of the sigmoid (or tanh) function.
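
Reviewer's aside, not part of the revision: the two derivative identities quoted in the last line, and the statement that tanh is a rescaled sigmoid, can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions. The function names (`sigmoid`, `sigmoid_prime`, `tanh_prime`, `central_difference`, `max_identity_error`) are made up for this example and do not come from the tutorial, and the specific rescaling tanh(z) = 2 f(2z) − 1 (with f the sigmoid) is a standard identity that the page itself does not spell out.

```python
import math


def sigmoid(z):
    """Sigmoid activation: f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z):
    """Derivative via the identity f'(z) = f(z) * (1 - f(z))."""
    fz = sigmoid(z)
    return fz * (1.0 - fz)


def tanh_prime(z):
    """Derivative via the identity f'(z) = 1 - (f(z))^2 for f = tanh."""
    fz = math.tanh(z)
    return 1.0 - fz ** 2


def central_difference(f, z, h=1e-5):
    """Numerical derivative of f at z (hypothetical helper for the check)."""
    return (f(z + h) - f(z - h)) / (2.0 * h)


def max_identity_error(points=(-2.0, -0.5, 0.0, 0.5, 2.0)):
    """Largest absolute mismatch over a few sample points."""
    worst = 0.0
    for z in points:
        # Closed-form derivatives versus finite differences.
        worst = max(worst, abs(sigmoid_prime(z) - central_difference(sigmoid, z)))
        worst = max(worst, abs(tanh_prime(z) - central_difference(math.tanh, z)))
        # Assumed rescaling relation: tanh(z) = 2 * sigmoid(2z) - 1.
        worst = max(worst, abs(math.tanh(z) - (2.0 * sigmoid(2.0 * z) - 1.0)))
    return worst


if __name__ == "__main__":
    print(max_identity_error())
```

The mismatch should come out far below the finite-difference step size, which is consistent with the identities holding. A check like this does not replace the derivation the original text suggests doing by hand.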