I'm using foreach and have been reading up on it, e.g.:
- https://www.r-bloggers.com/the-wonders-of-foreach/
- https://www.rdocumentation.org/packages/foreach/versions/1.4.3/topics/foreach
My understanding is that you would use %dopar% for parallel processing and %do% for sequential.
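To make that understanding concrete, here is a minimal sketch of what I expect the two operators to do. It is only an illustration and is separate from my actual problem. It assumes no parallel backend has been registered in the session; the name `squares_seq` is made up for the example.

```r
library(foreach)

# %do% evaluates each iteration one after another in the current R process
squares_seq <- foreach(i = 1:4, .combine = c) %do% {
  i^2
}
print(squares_seq)

# %dopar% hands iterations to whatever backend is registered;
# these helpers report whether one is registered and how many workers it has
print(getDoParRegistered())
print(getDoParWorkers())
```

If that is right, nothing inside a `%do%` loop should be spread across cores by foreach itself.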
As it happens, I was having issues with %dopar%, and while trying to debug I changed it to what I thought was a sequential loop using %do%. I happened to have the terminal open and noticed all processors running while the loop ran.
Is this expected?
Reproducible example:
    library(tidyverse)
    library(caret)
    library(foreach)

    # expected to see parallel here because caret and xgb with train()
    xgbFit <- train(Species ~ ., data = iris, method = "xgbTree",
                    trControl = trainControl(method = "cv", classProbs = TRUE))

    iris_big <- do.call(rbind, replicate(1000, iris, simplify = FALSE))

    nr <- nrow(iris_big)
    n <- 1000
    # loop over in chunks of n rows
    pieces <- split(iris_big, rep(1:ceiling(nr/n), each = n, length.out = nr))
    lenp <- length(pieces)

    # did not expect to see parallel processing take place when running the block below
    predictions <- foreach(i = seq_len(lenp)) %do% {
      # get prediction
      preds <- pieces[[i]] %>%
        mutate(xgb_prediction = predict(xgbFit, newdata = .))
      return(preds)
    }

    bah <- do.call(rbind, predictions)