PCA
From Ufldl
m (sigma bug - should be 1/m xx^T, not just xx^T) |
Line 201: | Line 201: | ||
approximation to the data. | approximation to the data. | ||
- | To decide how to set <math>\textstyle k</math>, we will usually look at the | + | To decide how to set <math>\textstyle k</math>, we will usually look at the '''percentage of variance |
- | retained | + | retained''' for different values of <math>\textstyle k</math>. Concretely, if <math>\textstyle k=n</math>, then we have |
- | an exact approximation to the data, and we say that 100 | + | an exact approximation to the data, and we say that 100% of the variance is |
retained. I.e., all of the variation of the original data is retained. | retained. I.e., all of the variation of the original data is retained. | ||
Conversely, if <math>\textstyle k=0</math>, then we are approximating all the data with the zero vector, | Conversely, if <math>\textstyle k=0</math>, then we are approximating all the data with the zero vector, | ||
- | and thus 0 | + | and thus 0% of the variance is retained. |
More generally, let <math>\textstyle \lambda_1, \lambda_2, \ldots, \lambda_n</math> be the eigenvalues | More generally, let <math>\textstyle \lambda_1, \lambda_2, \ldots, \lambda_n</math> be the eigenvalues | ||
Line 217: | Line 217: | ||
In our simple 2D example above, <math>\textstyle \lambda_1 = 7.29</math>, and <math>\textstyle \lambda_2 = 0.69</math>. Thus, | In our simple 2D example above, <math>\textstyle \lambda_1 = 7.29</math>, and <math>\textstyle \lambda_2 = 0.69</math>. Thus, | ||
by keeping only <math>\textstyle k=1</math> principal components, we retained <math>\textstyle 7.29/(7.29+0.69) = 0.913</math>, | by keeping only <math>\textstyle k=1</math> principal components, we retained <math>\textstyle 7.29/(7.29+0.69) = 0.913</math>, | ||
- | or 91.3 | + | or 91.3% of the variance. |
A more formal definition of percentage of variance retained is beyond the scope | A more formal definition of percentage of variance retained is beyond the scope | ||
Line 229: | Line 229: | ||
and for which we would incur a greater approximation error if we were to set them to zero. | and for which we would incur a greater approximation error if we were to set them to zero. | ||
- | In the case of images, one common heuristic is to choose <math>\textstyle k</math> so as to retain 99 | + | In the case of images, one common heuristic is to choose <math>\textstyle k</math> so as to retain 99% of |
the variance. In other words, we pick the smallest value of <math>\textstyle k</math> that satisfies | the variance. In other words, we pick the smallest value of <math>\textstyle k</math> that satisfies | ||
:<math>\begin{align} | :<math>\begin{align} | ||
Line 235: | Line 235: | ||
\end{align}</math> | \end{align}</math> | ||
Depending on the application, if you are willing to incur some | Depending on the application, if you are willing to incur some | ||
- | additional error, values in the 90-98 | + | additional error, values in the 90-98% range are also sometimes used. When you |
- | describe to others how you applied PCA, saying that you chose <math>\textstyle k</math> to retain 95 | + | describe to others how you applied PCA, saying that you chose <math>\textstyle k</math> to retain 95% of |
the variance will also be a much more easily interpretable description than saying | the variance will also be a much more easily interpretable description than saying | ||
that you retained 120 (or whatever other number of) components. | that you retained 120 (or whatever other number of) components. |
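
As an illustration of the rule described in this revision, the following is a minimal Python sketch, not code from the tutorial itself. It assumes the fraction of variance retained by the top <math>\textstyle k</math> components is the sum of the <math>\textstyle k</math> largest eigenvalues divided by the sum of all eigenvalues, which matches the 2D example (<math>\textstyle 7.29/(7.29+0.69)</math>). The function names <code>variance_retained</code> and <code>smallest_k</code> are hypothetical, chosen only for this example.

```python
def variance_retained(eigenvalues, k):
    """Fraction of total variance kept by the k largest eigenvalues.

    Assumption: 'variance retained' means (sum of top-k eigenvalues) / (sum of all).
    """
    ordered = sorted(eigenvalues, reverse=True)  # largest eigenvalue first
    total = sum(ordered)
    return sum(ordered[:k]) / total


def smallest_k(eigenvalues, target=0.99):
    """Smallest k whose retained-variance fraction is at least `target`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, lam in enumerate(ordered, start=1):
        running += lam
        if running / total >= target:
            return k
    return len(ordered)  # fall back to keeping every component


# Eigenvalues from the 2D example in the text.
eigenvalues = [7.29, 0.69]
print(round(variance_retained(eigenvalues, 1), 4))  # fraction kept with k=1
print(smallest_k(eigenvalues, target=0.90))         # k needed for 90%
print(smallest_k(eigenvalues, target=0.99))         # k needed for 99%
```

With the example eigenvalues, one component is enough for a 90% target, while a 99% target requires keeping both components.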