Implementing PCA/Whitening

From Ufldl

Line 2:
and also describe how you can implement them using efficient linear algebra libraries.
-
First, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>.  If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can compute this in one fell swoop as
+
First, we need to ensure that the data has zero mean, that is, <math>\textstyle \frac{1}{m} \sum_{i=1}^m x^{(i)} = 0</math>. We achieve this by centering the dataset: we compute the mean of each feature (each row of x) over the <math>m</math> examples and subtract it from every example, so that the sample mean is exactly zero. In Matlab, we can do this using
 +
 
 +
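  avg = mean(x, 2);                     % mean of each feature (row of x) over the m examples
 +
  x = x - repmat(avg, 1, size(x, 2));   % subtract that mean from every example (column of x)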
 +
 
 +
Next, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>.  If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can compute this in one fell swoop as
  sigma = x * x' / size(x, 2);
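
As a quick sanity check, the following sketch compares the vectorized expression with the explicit sum of outer products on a small, made-up data matrix. It assumes, as above, that x stores one example per column; the toy values and the names sigma_loop, max_diff and row_means are introduced here purely for illustration.

```matlab
% Hypothetical toy data: n = 3 features, m = 5 examples, one example per column.
x = [2 4 6 8 10;
     1 3 2 5  4;
     0 1 0 1  0];
m = size(x, 2);

% Center each feature (row) so that its mean over the m examples is zero.
avg = mean(x, 2);
x = x - repmat(avg, 1, m);

% Explicit sum of outer products, following the formula for Sigma term by term.
sigma_loop = zeros(size(x, 1));
for i = 1:m
    sigma_loop = sigma_loop + x(:, i) * x(:, i)';
end
sigma_loop = sigma_loop / m;

% Vectorized form: a single matrix product.
sigma = x * x' / m;

% The two results should agree up to floating-point rounding,
% and every row of the centered data should have (numerically) zero mean.
max_diff = max(abs(sigma(:) - sigma_loop(:)));
row_means = mean(x, 2);
```

On a matrix this small the loop costs nothing, but for large m the single matrix product lets the linear algebra library do all of the work, which is the reason for preferring it.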

Revision as of 01:33, 29 April 2011
