Implementing PCA/Whitening

From Ufldl

Line 6:
We achieve this by computing the mean of each patch and subtracting it from that patch. In Matlab, we can do this by using
-
  avg = mean(x, 1);
+
  avg = mean(x, 1);     % Compute the mean pixel intensity value separately for each patch.
  x = x - repmat(avg, size(x, 1), 1);
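As a quick check of the mean-removal step, the following is a minimal sketch on a hypothetical 4-by-3 toy matrix (the values and the names `x0` and `xAlt` are illustrative assumptions, not part of the original page). It also shows `bsxfun` as one possible alternative to `repmat`, assuming one patch per column as above.

```matlab
% Hypothetical toy data: n = 4 pixels per patch, m = 3 patches (one per column).
x0 = [1 2 3; 4 5 6; 7 8 9; 10 11 12];

% Mean removal as in the text: one mean per patch (column).
avg = mean(x0, 1);                      % 1-by-m row vector of per-patch means
x = x0 - repmat(avg, size(x0, 1), 1);   % tile the means over the rows, then subtract

% Alternative (an assumption, not from the original page): bsxfun expands the
% row vector implicitly, so the tiled matrix is never built explicitly.
xAlt = bsxfun(@minus, x0, mean(x0, 1));

% After the subtraction, every column should have (numerically) zero mean.
colMeans = mean(x, 1);
```

Both variants should produce the same matrix; which one reads better is largely a matter of taste.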
-
Next, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>.  If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can instead compute this in one fell swoop as  
+
Next, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>.  If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can compute this in one fell swoop as  
  sigma = x * x' / size(x, 2);
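To see that the one-line expression matches the explicit sum in the formula, here is a small sketch that computes both on hypothetical data and compares them. The matrix values and the names `sigmaLoop` and `maxDiff` are assumptions made for illustration only.

```matlab
% Hypothetical data: n = 3 features, m = 4 examples (one example per column).
x = [1 -1 0 2; 0 3 -2 -1; 2 2 -4 0];
[n, m] = size(x);

% Explicit sum, following the formula term by term (the slow way).
sigmaLoop = zeros(n, n);
for i = 1:m
    sigmaLoop = sigmaLoop + x(:, i) * x(:, i)';   % outer product of example i
end
sigmaLoop = sigmaLoop / m;

% Vectorized form from the text.
sigma = x * x' / size(x, 2);

% The two results should agree up to floating-point rounding.
maxDiff = max(abs(sigma(:) - sigmaLoop(:)));
```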
Line 26:
Finally, you can compute <math>\textstyle x_{\rm rot}</math> and <math>\textstyle \tilde{x}</math> as follows:
-
  xRot = U(:,1:k)' * x;     % k is number of eigenvectors to keep
+
  xRot = U' * x;         % rotated version of the data.
-
  xTilde = U(:,1:k) * xRot; % which corresponds to the # dimensions after reduction
+
  xTilde = U(:,1:k)' * x; % reduced dimension representation of the data,
-
                          % set k = size(x, 1) to keep all the eigenvectors
+
                        % where k is the number of eigenvectors to keep
This gives your PCA representation of the data in terms of <math>\textstyle \tilde{x} \in \Re^k</math>.  
Incidentally, if <math>x</math> is an <math>\textstyle n</math>-by-<math>\textstyle m</math> matrix containing all your training data, this is a vectorized
implementation, and the expressions
-
above work too for computing <math>x_{rot}</math> and <math>\tilde{x}</math> for your entire training set
+
above work too for computing <math>x_{\rm rot}</math> and <math>\tilde{x}</math> for your entire training set
all in one go.  The resulting  
-
<math>xrot</math> and <math>\tilde{x}</math> will have one column corresponding to each training example.  
+
<math>x_{\rm rot}</math> and <math>\tilde{x}</math> will have one column corresponding to each training example.  
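The sketch below runs the rotation and reduction on a whole (hypothetical) data matrix at once and checks the resulting shapes. It assumes that `U` comes from `svd(sigma)`, a step this excerpt does not show; the toy values and the choice `k = 2` are likewise illustrative assumptions.

```matlab
% Hypothetical data: n = 3 features, m = 4 examples (one example per column).
x = [2 -1 0 -1; 1 0 -2 1; -3 1 2 0];
sigma = x * x' / size(x, 2);

% Assumption: U is obtained from the SVD of sigma. For a symmetric positive
% semi-definite matrix, the columns of U are eigenvectors and diag(S) holds
% the eigenvalues in decreasing order.
[U, S, V] = svd(sigma);

k = 2;                        % hypothetical number of eigenvectors to keep
xRot = U' * x;                % n-by-m: every example rotated into the eigenbasis
xTilde = U(:, 1:k)' * x;      % k-by-m: reduced representation, one column per example

% xTilde is simply the first k rows of xRot.
maxDiff = max(max(abs(xTilde - xRot(1:k, :))));
```

Note that the number of columns is unchanged: each training example keeps its own column, and only the number of rows drops from n to k.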
To compute the PCA whitened data <math>\textstyle x_{\rm PCAwhite}</math>, use  
Line 49:
  xZCAwhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;
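One way to sanity-check the whitening step is to confirm that the whitened data has a covariance close to the identity. The following is a sketch under stated assumptions: `U` and `S` are taken from `svd(sigma)`, the value of `epsilon` is a hypothetical choice, and the `xPCAwhite` line is written by analogy with the ZCA expression above rather than copied from this excerpt, which does not show it.

```matlab
% Hypothetical data: n = 2 features, m = 4 examples (one example per column).
x = [3 -3 1 -1; 1 -1 -2 2];
sigma = x * x' / size(x, 2);
[U, S, V] = svd(sigma);       % assumption: eigenvectors in U, eigenvalues on diag(S)

epsilon = 1e-5;               % hypothetical regularization constant

% Assumed PCA-whitening form: rotate, then rescale each component.
xPCAwhite = diag(1./sqrt(diag(S) + epsilon)) * U' * x;

% ZCA whitening as in the text: PCA whitening rotated back by U.
xZCAwhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;

% Both covariances should be close to the identity (not exact, because of epsilon).
covPCA = xPCAwhite * xPCAwhite' / size(x, 2);
covZCA = xZCAwhite * xZCAwhite' / size(x, 2);
```

With a larger `epsilon` the diagonal entries fall further below 1, which is the intended smoothing effect of the regularization.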
{{PCA}}
{{Languages|实现主成分分析和白化|中文}}

Latest revision as of 13:22, 7 April 2013
