Revisions to Calculating Absolute Principal Component Scores from varimax-rotated principal components scores

edited Dec 11, 2017 at 11:17

In many receptor-modeling studies, after running the PCA, authors often "rescale" their varimax-rotated PC scores (which are standardized to mean zero and standard deviation 1) into so-called absolute principal component scores (APCS) before performing the MLR, so that the source contribution of each factor can be estimated in terms of your independent variable.

This is done by:

1. calculate the z-score for absolute zero concentrations (i.e. take a vector of all zeroes, subtract the sample mean and divide by the sample standard deviation);
2. calculate the rotated PC scores for each component for this z-scored absolute zero from step 1;
3. subtract the "zero" PC score (from step 2) from the true scores.
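
Written out, the three steps amount to the following. The symbols are my own notation, not taken from the receptor-modeling papers: $x_{ij}$ is the concentration of variable $j$ in sample $i$, $\bar{x}_j$ and $s_j$ are its sample mean and standard deviation, $w_{jk}$ are the entries of the score-weight matrix (`invLoadings` in the code below), and $f_{ik}$ is the standardized rotated score of sample $i$ on component $k$.

$$
\begin{aligned}
z_{0j} &= \frac{0 - \bar{x}_j}{s_j}, \\
f_{0k} &= \sum_j z_{0j}\, w_{jk}, \\
\mathrm{APCS}_{ik} &= f_{ik} - f_{0k}
  = \sum_j \frac{x_{ij} - \bar{x}_j}{s_j}\, w_{jk} + \sum_j \frac{\bar{x}_j}{s_j}\, w_{jk}
  = \sum_j \frac{x_{ij}}{s_j}\, w_{jk}.
\end{aligned}
$$

So, if I have the algebra right, the APCS is just the scaled but uncentered data multiplied by the same weights that produce the ordinary rotated scores.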

I tried to follow the procedure but still got negative values. Can someone take a look and see where I went wrong? Here is the code using the iris data set (most of the code originated from a previous discussion):

    irisX <- iris[, 1:4]
    ncomp <- 2

    pca_iris <- prcomp(irisX, center = TRUE, scale. = TRUE)
    rawLoadings <- pca_iris$rotation[, 1:ncomp] %*% diag(pca_iris$sdev, ncomp, ncomp)
    rotatedLoadings <- varimax(rawLoadings)$loadings
    invLoadings <- t(pracma::pinv(rotatedLoadings))
    scores <- scale(irisX) %*% invLoadings  # my scores from rotated loadings, which are standardized

    # want to use APCS to do MLR instead of these scores

    # step 1: create artificial sample with zero concentrations for all variables
    z0i <- matrix(-colMeans(irisX) / sqrt(apply(irisX, 2, var)), nrow = 1)

    # step 2: find its rotated PC score by multiplying the transposed rotated loading from the original sample
    scores0 <- as.numeric(z0i %*% invLoadings)  # my absolute zero PC scores (supposedly...)

    # step 3: now to calculate my new "APCS"
    scores0 <- matrix(rep(scores0, each = nrow(scores)), nrow = nrow(scores))
    ACPS <- scores - scores0

This results in

    > head(ACPS)
             [,1]       [,2]
    [1,] 4.274291  -9.339044
    [2,] 3.980231  -8.167430
    [3,] 3.937934  -8.548838
    [4,] 3.886160  -8.284854
    [5,] 4.262470  -9.527271
    [6,] 4.724111 -10.296421

and one can see that there are still negative values in the ACPS data. Why?

r multiple-regression regression pca data-transformation factor-rotation

edited Apr 13, 2017 at 12:44

Community Bot

edited Aug 14, 2015 at 22:26

sor

asked Aug 13, 2015 at 23:12

sor
