Feature extraction using convolution

While in principle one can learn features convolutionally over the entire image, the learning procedure becomes more complicated to implement and often takes longer to execute.

Natural images have the property of being stationary, meaning that the statistics of one part of the image are the same as those of any other part. This suggests that the features we learn from one part of the image can also be applied to other parts of the image, and that we can use the same learned features at all locations.

More precisely, having learned features over small (say 8x8) patches sampled randomly from the larger image, we can then apply this learned 8x8 feature detector anywhere in the image.  Specifically, we can take the learned 8x8 features and '''convolve''' them with the larger image, thus obtaining a different feature activation value at each location in the image.

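As a minimal sketch of this sliding-detector idea (not code from the tutorial), the loop below takes a single learned 8x8 feature, represented here by a hypothetical 8x8 weight matrix and scalar bias from a trained autoencoder, and evaluates it at every location of a larger grayscale image, yielding one activation value per location:

<pre>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Stand-ins for a real image and one learned 8x8 feature (one hidden
# unit's 8x8 weight matrix and scalar bias from a trained autoencoder).
image = np.random.rand(96, 96)
W_feat = np.random.randn(8, 8)
b_feat = 0.0

# Slide the 8x8 detector over every valid position of the image,
# recording one activation per location (an 89x89 map for a 96x96 image).
rows, cols = image.shape
acts = np.zeros((rows - 8 + 1, cols - 8 + 1))
for i in range(rows - 8 + 1):
    for j in range(cols - 8 + 1):
        patch = image[i:i+8, j:j+8]
        acts[i, j] = sigmoid(np.sum(W_feat * patch) + b_feat)
</pre>
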
To give a concrete example, suppose you have learned features on 8x8 patches sampled from a 96x96 image.  Suppose further this was done with an autoencoder that has 100 hidden units.  To get the convolved features, for every 8x8 region of the 96x96 image, that is, the 8x8 regions starting at <math>(1, 1), (1, 2), \ldots (89, 89)</math>, you would extract the 8x8 patch, and run it through your trained sparse autoencoder to get the feature activations.  This would result in 100 sets of 89x89 convolved features.

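The NumPy sketch below mirrors this example under a few assumptions that the text does not spell out: the trained autoencoder's first-layer parameters are taken to be a 100x64 weight matrix W1 (one row per hidden unit, over a flattened 8x8 patch) and a length-100 bias vector b1, and patches are flattened in row-major order. With those assumptions, looping over every 8x8 region yields the 100 sets of 89x89 convolved features:

<pre>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1 = rng.standard_normal((100, 64))  # stand-in for trained weights (100 hidden units x 64 patch pixels)
b1 = rng.standard_normal(100)        # stand-in for trained biases
image = rng.random((96, 96))         # stand-in for a 96x96 input image

patch_dim = 8
out_dim = 96 - patch_dim + 1         # 89 valid positions along each dimension

# For every 8x8 region starting at (i, j), flatten the patch and compute
# the 100 hidden-unit activations sigmoid(W1 x + b1).
convolved = np.zeros((100, out_dim, out_dim))
for i in range(out_dim):
    for j in range(out_dim):
        x = image[i:i+patch_dim, j:j+patch_dim].reshape(-1)  # row-major flattening assumed
        convolved[:, i, j] = sigmoid(W1 @ x + b1)

print(convolved.shape)  # (100, 89, 89)
</pre>
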
Formally, given some large <math>r \times c</math> images <math>x_{large}</math>, we first train a sparse autoencoder on small <math>a \times b</math> patches <math>x_{small}</math> sampled from these images, learning <math>k</math> features <math>f = \sigma(W^{(1)}x_{small} + b^{(1)})</math> (where <math>\sigma</math> is the sigmoid function), given by the weights <math>W^{(1)}</math> and biases <math>b^{(1)}</math> from the visible units to the hidden units. For every <math>a \times b</math> patch <math>x_s</math> in the large image, we compute <math>f_s = \sigma(W^{(1)}x_s + b^{(1)})</math>, giving us <math>f_{convolved}</math>, a <math>k \times (r - a + 1) \times (c - b + 1)</math> array of convolved features.

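Because each hidden unit's pre-activation over a patch is just the dot product of its weight row with the vectorized patch, the same <math>f_{convolved}</math> array can also be computed, one hidden unit at a time, as a "valid"-mode 2D cross-correlation of the image with that unit's <math>a \times b</math> weight matrix. The sketch below is an assumed implementation of this equivalence, not the tutorial's reference code; in particular, the reshape of each weight row must match whatever flattening order was used when the training patches were vectorized (row-major is assumed here):

<pre>
import numpy as np
from scipy.signal import correlate2d

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convolve_features(image, W1, b1, a, b):
    """Return the k x (r - a + 1) x (c - b + 1) array of convolved features
    for one r x c image, given first-layer weights W1 (k x a*b) and biases
    b1 (length k) of a trained sparse autoencoder."""
    k = W1.shape[0]
    r, c = image.shape
    f_convolved = np.zeros((k, r - a + 1, c - b + 1))
    for unit in range(k):
        # Reshape this unit's weight row to a x b (row-major assumed; must
        # match how the training patches were flattened).
        kernel = W1[unit].reshape(a, b)
        # 'valid' cross-correlation takes the dot product of the kernel with
        # every a x b patch, i.e. it computes W1[unit] . x_s at every position s.
        f_convolved[unit] = sigmoid(correlate2d(image, kernel, mode="valid") + b1[unit])
    return f_convolved
</pre>
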
In the next section, we further describe how to "pool" these features together to get even better features for classification.
