I'm using doParallel to do fairly long parallel processing with foreach. Unlike most examples I see, where computationally intensive but input-light code is fed into the loop, I'm using foreach to coordinate the simultaneous processing of a number of large, independent datasets. So inside the loop, I'm using metadata to read a file in from disk, operate on it, and write it back out.

Before I turned this operation into a foreach loop, I was writing out debug messages using message(). However, since I've switched to using foreach and %dopar%, I've noticed that the loop 'goes dark': it's doing what it ought to, but I'm not receiving any output. (I should mention that this loop is written into a script that I'm calling from the shell with Rscript.)

I'm guessing that this has something to do with the fact that doParallel spins off other threads—maybe those threads no longer know where to dump standard output? Thoughts?

  • I'm not a genius of parallel computing, but it's definitely true that socket-type clusters in R don't return outputs (e.g. progress bars, messages, etc.) until the job finishes and returns the output. I've never worked with fork-type clusters, so I don't know if that would circumvent this limitation or not. I've been desperate for a progress bar a few times in the past, and there is a work-around when the number of parallel processes is low: write separate, non-parallelized code for each job and run each job by hand in a separate (simultaneous) instance of R. Commented Jul 13, 2017 at 6:45
  • @JacobSocolar Oof, that is desperate ;) I ran this non-interactively via PBS and found that my logs had error and warning messages from the shell (part of this processing involves using system() to call other tools) but not message() output in R. So it seems like that's probably it. I suppose another desperate answer is to `system("echo My update")`... Commented Jul 13, 2017 at 6:57

1 Answer

If you want output from a parallel foreach loop, use the outfile option of makeCluster: makeCluster(no_cores, outfile = "/path/to/log_file.txt").

Note that the logs of all workers are written to the same file (in the order in which they arrive).
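
For anyone wanting a fuller example, here is a minimal sketch of how this might look with doParallel and foreach. The log path (a temporary file), the worker count of 2, and the toy loop body are all placeholders I made up for illustration; swap in your own path and processing code.

```r
library(doParallel)  # also attaches foreach and parallel

log_file <- tempfile(fileext = ".txt")  # hypothetical log location
no_cores <- 2                           # assumed number of workers

# Redirect each worker's stdout and stderr to the shared log file
cl <- makeCluster(no_cores, outfile = log_file)
registerDoParallel(cl)

results <- foreach(i = 1:4, .combine = c) %dopar% {
  # message() writes to stderr, which now ends up in log_file
  message("Worker ", Sys.getpid(), " starting task ", i)
  i^2  # stand-in for the real read/process/write step
}

stopCluster(cl)

# Inspect what the workers logged
log_lines <- readLines(log_file)
```

You can follow the file with tail -f while the job runs. According to the makeCluster documentation, outfile = "" means no redirection at all, which may be enough when the workers are on the local machine and you just want the messages in the terminal that launched Rscript.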

3 Comments

Will test when I get back from my run, but this sounds like exactly what I need :)
@ F.Prive: Can you please provide an example as to how you would do this? Thanks!
@stats_noob Adjust the outfile option to the place of your log file.
