
I'm using foreach and reading up on it e.g.

My understanding is that you would use %dopar% for parallel processing and %do% for sequential.
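As a minimal sketch of that distinction (assuming only that the foreach package is installed; the variable names are made up for illustration): `%do%` evaluates in the current R process, while `%dopar%` hands the iterations to whichever backend is registered. Registering the sequential backend with `registerDoSEQ()` should make `%dopar%` behave like `%do%`.

```r
library(foreach)

# %do% evaluates each iteration in the calling R process, one at a time.
squares_do <- foreach(i = 1:3, .combine = c) %do% {
  i^2
}

# %dopar% delegates to the registered backend. Registering the
# sequential backend keeps the work in the current process.
registerDoSEQ()
squares_dopar <- foreach(i = 1:3, .combine = c) %dopar% {
  i^2
}

# Reports how many workers the registered backend claims to have.
getDoParWorkers()
```

Neither operator controls threads that a function called inside the loop body starts on its own.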

As it happens, I was having issues with %dopar%, and while trying to debug I changed it to what I thought was a sequential loop using %do%. I happened to have the terminal open and noticed that all processors were busy while the loop ran.

Is this expected?

Reproducible example:

library(tidyverse)
library(caret)
library(foreach)

# expected to see parallel here because caret and xgb with train()
xgbFit <- train(Species ~ .,
                data = iris,
                method = "xgbTree",
                trControl = trainControl(method = "cv", classProbs = TRUE))

iris_big <- do.call(rbind, replicate(1000, iris, simplify = FALSE))

nr <- nrow(iris_big)
n <- 1000 # loop over in chunks of n rows
pieces <- split(iris_big, rep(1:ceiling(nr/n), each = n, length.out = nr))
lenp <- length(pieces)

# did not expect to see parallel processing take place when running the block below
predictions <- foreach(i = seq_len(lenp)) %do% {
  # get prediction
  preds <- pieces[[i]] %>%
    mutate(xgb_prediction = predict(xgbFit, newdata = .))
  return(preds)
}

bah <- do.call(rbind, predictions)


1 Answer

My best guess would be that these are processes still running from previous runs.

Is it the same when using foreach::registerDoSEQ()?

My second guess would be that predict runs in parallel.


3 Comments

Looks like it's your second guess, and this has nothing to do with foreach. I ran predict on iris_big outside of a loop and saw parallel processing take place there too, but only with the xgb model. When I switch to, e.g., a knn model, there is no parallel prediction. Odd, since I'd expect parallel prediction to be possible with any model. I'll add the caret tag, or maybe post another question on why predict runs in parallel only with XGB.
@DougFir This is really interesting. I would like to know how to force predict to run sequentially, as it can interfere with some higher-level parallelism that I use in my code.
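
One thing that might be worth trying for the sequential case, as a hedged sketch and not a confirmed fix: xgboost has an `nthread` parameter, and extra arguments given to caret's `train()` are passed on to the underlying fitting function. This assumes the installed caret and xgboost versions still accept `nthread` this way and that `predict()` reuses the setting stored with the model, which I have not verified for every version. `xgbFit_1thread` is a made-up name, and `tuneLength = 1` is only there to keep the example quick.

```r
library(caret)

# Assumption: nthread = 1 reaches xgboost through train()'s `...`
# and limits it to a single thread; this may vary by package version.
xgbFit_1thread <- train(
  Species ~ .,
  data = iris,
  method = "xgbTree",
  tuneLength = 1,  # small tuning grid, only to keep the sketch fast
  trControl = trainControl(method = "cv", classProbs = TRUE),
  nthread = 1
)

# Watch CPU usage while this runs to check whether it stays on one core.
preds <- predict(xgbFit_1thread, newdata = iris)
```

If the threads still show up, another option to test is setting the `OMP_NUM_THREADS` environment variable to 1 before starting R, on the assumption that xgboost's threading goes through OpenMP.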
