Looking at just the first two PCs discards information present in the features that may relate to the outcome. After all, PCA does not consider the outcome variable.
Below, I give a simulation where the outcome depends on the fourth and fifth principal components. A predictive model that uses all five principal components should be able to perform well in such a situation.
library(ggplot2)

set.seed(2025)

# Build five correlated features
#
N <- 1000
p <- 5
X <- matrix(NA, N, p)
X[, 1] <- rnorm(N)
for (i in 2:p){
  X[, i] <- X[, i - 1] + rnorm(N, 0, 1)
}

# Run PCA and extract the transformed variables
#
pca <- princomp(X)
X_pca <- pca$scores

# Simulate an outcome variable (y) that depends on the last two PCs
#
z <- 5*X_pca[, p] - 5*X_pca[, p - 1]
pr <- 1/(1 + exp(-z))
y <- rbinom(N, 1, pr)

# Data frame for plotting PCs, colored by group membership
#
d <- data.frame(
  y = as.factor(y),
  PC_1 = X_pca[, 1],
  PC_2 = X_pca[, 2],
  PC_3 = X_pca[, 3],
  PC_4 = X_pca[, 4],
  PC_5 = X_pca[, 5]
)

# Plot the variances
#
screeplot(pca)

# Plot the first two PCs, colored by group
# Notice how little separation there is between the groups, despite these
# PCs accounting for so much of the total variance in the original features
#
ggplot(d, aes(x = PC_1, y = PC_2, col = y)) +
  geom_point() +
  theme(legend.position = "bottom")

# Plot the last two PCs, colored by group
# Notice how much separation there is between the groups, despite these
# PCs accounting for so little of the total variance in the original features
#
ggplot(d, aes(x = PC_4, y = PC_5, col = y)) +
  geom_point() +
  theme(legend.position = "bottom")
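To make the claim about predictive modeling concrete, here is a self-contained sketch that reruns the simulation and compares a logistic regression on the first two PCs with one on all five. The fit and accuracy variable names (`fit_12`, `acc_all`, etc.) are my own, and the 0.5 classification cutoff is an assumption for illustration.

```r
# Rebuild the simulation, then compare two logistic regressions:
# one using only the first two PCs, one using all five
set.seed(2025)
N <- 1000
p <- 5
X <- matrix(NA, N, p)
X[, 1] <- rnorm(N)
for (i in 2:p) X[, i] <- X[, i - 1] + rnorm(N, 0, 1)
X_pca <- princomp(X)$scores
z <- 5*X_pca[, p] - 5*X_pca[, p - 1]
y <- rbinom(N, 1, 1/(1 + exp(-z)))

fit_12  <- glm(y ~ X_pca[, 1:2], family = binomial)
fit_all <- glm(y ~ X_pca,        family = binomial)

# In-sample accuracy at a 0.5 cutoff
acc_12  <- mean((predict(fit_12,  type = "response") > 0.5) == y)
acc_all <- mean((predict(fit_all, type = "response") > 0.5) == y)
c(acc_12, acc_all)  # the model with all five PCs should classify far better
```

Because the outcome depends only on the last two PCs, the first model should do little better than chance, while the model with all five PCs should classify well.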
As the scree plot shows, the first two PCs account for most of the total variance.

As the lack of separation in the first scatter plot shows, however, the first two PCs have little relationship with the outcome.

By contrast, the two categories separate cleanly on the last two PCs.

Because the PCs are just a rotation of the centered features, such clean separation on the last two PCs means the two groups are equally well separated in the original $5$-dimensional feature space. A model trained on the original features should pick up on that separation and achieve strong predictive performance.
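A self-contained sketch of that last point: rerun the simulation and fit a logistic regression directly on the raw features. Since a logistic regression on `X` spans the same space as one on all of the PCs, it should recover the same separation. The variable names and the 0.5 cutoff are again my own assumptions.

```r
# Rebuild the simulation, then fit a logistic regression on the raw features
set.seed(2025)
N <- 1000
p <- 5
X <- matrix(NA, N, p)
X[, 1] <- rnorm(N)
for (i in 2:p) X[, i] <- X[, i - 1] + rnorm(N, 0, 1)
X_pca <- princomp(X)$scores
z <- 5*X_pca[, p] - 5*X_pca[, p - 1]
y <- rbinom(N, 1, 1/(1 + exp(-z)))

# Logistic regression on the original features, not the PCs
fit_raw <- glm(y ~ X, family = binomial)
acc_raw <- mean((predict(fit_raw, type = "response") > 0.5) == y)
acc_raw  # should be high: the separation lives in the original feature space too
```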