2
$\begingroup$

I know this can vary, but in the standard setup, when calculating principal components (PCs), we begin by subtracting the means of each feature (dimension), but we do not divide by the standard deviation.

Why is the zero-centering necessary, but setting standard deviations to $1$ not necessary?

A follow-up question: suppose I calculate the the PCs offline, and then use these hard-coded coefficients for noise reduction on streaming (live) data. What is the mean I should use on the live data? The one from the historical data that was used to derive the coefficients in the first place, or the mean of the live (online) data?

$\endgroup$
1

0

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.