Implementing PCA/Whitening

In this section, we summarize the PCA and whitening algorithms, and also describe how you can implement them using efficient linear algebra libraries.
First, we need to ensure that the data has (approximately) zero mean. For natural images, we achieve this (approximately) by subtracting the mean value of each image patch: we compute the mean of each patch and subtract it from that patch. In Matlab, we can do this using
  avg = mean(x, 1);                    % mean pixel value of each patch (one patch per column of x)
  x = x - repmat(avg, size(x, 1), 1);  % subtract each patch's mean from every pixel of that patch
Next, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>. If you're implementing this in Matlab (or in C++, Java, etc., with access to an efficient linear algebra library), computing it as an explicit sum over the examples is inefficient. Instead, we can compute it in one fell swoop, as in the sketch below.
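In Matlab, assuming <math>\textstyle x</math> holds one training example per column (so that <math>\textstyle m</math> is size(x, 2)), a minimal sketch of this computation, together with the SVD that produces the eigenvector matrix U used below, is (the names sigma, S and V here are illustrative):
  sigma = x * x' / size(x, 2);  % Sigma = (1/m) * sum_i x^(i) * (x^(i))'
  [U, S, V] = svd(sigma);       % columns of U are the eigenvectors of sigma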
Finally, you can compute <math>\textstyle x_{\rm rot}</math> and <math>\textstyle \tilde{x}</math> as follows:
  xRot = U' * x;            % rotated version of the data
  xTilde = U(:,1:k)' * x;   % PCA representation of the data; k is the number of
                            % eigenvectors (i.e., dimensions) to keep after reduction;
                            % set k = size(x, 1) to keep all the eigenvectors
This gives your PCA representation of the data in terms of <math>\textstyle \tilde{x} \in \Re^k</math>.
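If you also want an approximation of the original data recovered from this reduced representation, you can map <math>\textstyle \tilde{x}</math> back through the retained eigenvectors; a minimal sketch (the name xHat is introduced here only for illustration):
  xHat = U(:,1:k) * xTilde;  % approximate reconstruction of x from its top k components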
