You tend to use the covariance matrix when the variable scales are similar and the correlation matrix when variables are on different scales.
Using the correlation matrix is equivalent to standardizing each of the variables (to mean 0 and standard deviation 1). In general, PCA with and without standardizing will give different results, especially when the scales are different.
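The equivalence can be checked numerically. The following is a minimal sketch, assuming the built-in `mtcars` data and an arbitrary choice of four columns (neither is part of the example discussed here); it compares the component variances from `prcomp` with the eigenvalues of the covariance and correlation matrices.

```r
# Illustrative data only: four numeric columns of the built-in mtcars data set
X <- as.matrix(mtcars[, c("mpg", "disp", "hp", "wt")])

# Standardizing first: the covariance of the scaled data is the correlation matrix
cov_of_scaled <- cov(scale(X))
print(all.equal(cov_of_scaled, cor(X)))

# PCA on standardized variables vs. eigen-decomposition of the correlation matrix
pca_cor <- prcomp(X, scale. = TRUE)
eig_cor <- eigen(cor(X))
print(all.equal(pca_cor$sdev^2, eig_cor$values))

# PCA on raw (centered) variables vs. eigen-decomposition of the covariance matrix
pca_cov <- prcomp(X, scale. = FALSE)
eig_cov <- eigen(cov(X))
print(all.equal(pca_cov$sdev^2, eig_cov$values))
```

Only the variances are compared, because the sign of each eigenvector is arbitrary and may differ between `prcomp` and `eigen`.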
As an example, take a look at this R heptathlon data set. Some of the variables have an average value of about 1.8 (the high jump), whereas other variables (run 800m) are around 120.
```r
library(HSAUR)
heptathlon[,-8]      # look at heptathlon data (excluding 'score' variable)
```

This outputs:

```
                    hurdles highjump  shot run200m longjump javelin run800m
Joyner-Kersee (USA)   12.69     1.86 15.80   22.56     7.27   45.66  128.51
John (GDR)            12.85     1.80 16.23   23.65     6.71   42.56  126.12
Behmer (GDR)          13.20     1.83 14.20   23.10     6.68   44.54  124.20
Sablovskaite (URS)    13.61     1.80 15.23   23.92     6.25   42.78  132.24
Choubenkova (URS)     13.51     1.74 14.76   23.93     6.32   47.46  127.90
...
```

Now let's do PCA on covariance and on correlation:

```r
# scale=T bases the PCA on the correlation matrix
hep.PC.cor = prcomp(heptathlon[,-8], scale=TRUE)
hep.PC.cov = prcomp(heptathlon[,-8], scale=FALSE)

biplot(hep.PC.cov)
biplot(hep.PC.cor)
```

Notice that PCA on covariance is dominated by run800m and javelin: PC1 is almost equal to run800m (and explains $82\%$ of the variance) and PC2 is almost equal to javelin (together they explain $97\%$). PCA on correlation is much more informative and reveals some structure in the data and relationships between variables (but note that the explained variances drop to $64\%$ and $71\%$).
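To inspect the explained-variance figures and the dominant variables yourself, something like the following sketch may help. The helper `explained_variance` is a hypothetical name introduced here for illustration, not part of any package, and the heptathlon part only runs if the HSAUR package is installed.

```r
# Hypothetical helper: share of total variance carried by each component
explained_variance <- function(pca) {
  pca$sdev^2 / sum(pca$sdev^2)
}

if (requireNamespace("HSAUR", quietly = TRUE)) {
  data("heptathlon", package = "HSAUR")
  events <- heptathlon[, -8]   # drop the 'score' column

  pc_cov <- prcomp(events, scale. = FALSE)
  pc_cor <- prcomp(events, scale. = TRUE)

  # Cumulative share of variance for the first two components
  print(round(cumsum(explained_variance(pc_cov))[1:2], 2))
  print(round(cumsum(explained_variance(pc_cor))[1:2], 2))

  # Loadings: which events carry PC1 and PC2 on the covariance scale
  print(round(pc_cov$rotation[, 1:2], 2))
}
```

`summary(pc_cov)` reports the same proportions in its "Proportion of Variance" and "Cumulative Proportion" rows.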
Notice also that the outlying individuals (in this data set) are outliers regardless of whether the covariance or correlation matrix is used.
