I have seen other related questions (like this one), but none of them actually answers my question, so here it goes:

I have an obviously embarrassingly parallel task to perform: my own hand-rolled version of GridSearch. Put simply, I have a set of parameter combinations and want to evaluate my model on each of them. There is no dependence between the runs, so the code looks like this:

    pool = multiprocessing.Pool(processes=4)
    scores = pool.map(evaluator, permutations)

where the evaluator is a function that computes a score given a dict of parameters, and permutations is a list of such dictionaries (of length 4 in this case).
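
For reference, a minimal sketch of this setup that also records how long each evaluation takes. The body of evaluator, the n_iter key, and the helpers timed_evaluator and run are hypothetical stand-ins chosen for illustration, not the actual model code:

```python
import multiprocessing
import time


def evaluator(params):
    """Hypothetical stand-in: do a fixed amount of CPU work and return a score."""
    total = 0
    for i in range(params["n_iter"]):
        total += i * i
    return total % 1000


def timed_evaluator(params):
    """Run evaluator and also report the wall-clock time of this one evaluation."""
    start = time.perf_counter()
    score = evaluator(params)
    return score, time.perf_counter() - start


def run(processes, permutations):
    """Return (scores, per-evaluation times, total wall time) for one pool size."""
    start = time.perf_counter()
    with multiprocessing.Pool(processes=processes) as pool:
        results = pool.map(timed_evaluator, permutations)
    total = time.perf_counter() - start
    scores = [score for score, _ in results]
    times = [elapsed for _, elapsed in results]
    return scores, times, total


if __name__ == "__main__":
    # Four identical parameter sets, so the load is balanced by construction.
    permutations = [{"n_iter": 2_000_000} for _ in range(4)]
    for processes in (1, 4):
        scores, times, total = run(processes, permutations)
        print(processes, [round(t, 2) for t in times], round(total, 2))
```

Comparing the per-evaluation times for the two pool sizes should show whether the individual evaluations themselves slow down, as opposed to the pool adding a fixed startup cost.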

Now, my assumption is that using 4 processes (on an 8-core machine) should give me a 4x speedup (note that the evaluator takes the same amount of time regardless of the set of parameters, so the load is perfectly balanced).

Instead, my timing yielded these results:

  1. Using 4 processes, each evaluation takes 82 sec to complete, so the total time is 84 sec.

  2. Using 1 process, each evaluation takes 43 sec to complete, so the total time is 170 sec.

So in the end I get a 2x speedup using 4 cores. Why is each process faster when there are fewer processes?
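
The arithmetic behind these figures can be spelled out as a small sketch. It uses only the four timings reported above; the variable names are chosen here for illustration:

```python
# Timings reported in the question, in seconds.
total_1_proc = 170.0      # total wall time with 1 process
total_4_proc = 84.0       # total wall time with 4 processes
per_eval_1_proc = 43.0    # one evaluation, 1 process
per_eval_4_proc = 82.0    # one evaluation, 4 processes
n_processes = 4

# Overall speedup and how much of the ideal 4x it represents.
speedup = total_1_proc / total_4_proc
efficiency = speedup / n_processes

# How much slower each individual evaluation becomes when 4 run at once.
per_eval_slowdown = per_eval_4_proc / per_eval_1_proc

# Time not spent inside the evaluations in the 4-process run (startup, transfer).
pool_overhead = total_4_proc - per_eval_4_proc

print(round(speedup, 2))            # 2.02
print(round(efficiency, 2))         # 0.51
print(round(per_eval_slowdown, 2))  # 1.91
print(pool_overhead)                # 2.0
```

Almost all of the missing speedup comes from the evaluations themselves running about 1.9x slower, since the time spent outside the evaluations is only about 2 sec.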

  • Try using concurrent.futures, following the pattern described here: stackoverflow.com/questions/48492459/… Commented Jan 29, 2018 at 14:52
  • I am more interested in why my current approach does not work than in how I can improve it. I will check the link anyway, thanks :) Commented Jan 29, 2018 at 14:57
  • Parallel processing is not magic: it incurs some overhead to distribute the load across the processes, and those processes are still scheduled by the OS, so other processes can interfere. Try running more complex or simpler tasks and you will see that this effect shrinks substantially as the complexity increases. Commented Jan 29, 2018 at 14:58
  • It could be a number of factors: your CPU, your server, your disk space, and more. In some cases multiprocessing is not appropriate, and threads are the way to go. Welcome. Commented Jan 29, 2018 at 14:59
  • You can try to see how much of the time the CPU cores are actually loaded. Maybe permutations take a lot of time to pass to the processes. Commented Jan 29, 2018 at 15:03
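
The two checks suggested in the last comment could be sketched as follows. This is an illustration only: profiled and pickling_cost are hypothetical helper names, and the size of the pickled arguments is just a rough proxy for the cost of passing permutations to the workers.

```python
import pickle
import time


def profiled(func, params):
    """Run func(params); return (result, wall seconds, CPU seconds of this process)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(params)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu


def pickling_cost(obj):
    """Return (seconds spent pickling obj, pickled size in bytes)."""
    start = time.perf_counter()
    payload = pickle.dumps(obj)
    return time.perf_counter() - start, len(payload)


def busy(params):
    """Hypothetical CPU-bound stand-in for the real evaluator."""
    return sum(i * i for i in range(params["n_iter"]))


if __name__ == "__main__":
    result, wall, cpu = profiled(busy, {"n_iter": 1_000_000})
    print("wall:", round(wall, 3), "cpu:", round(cpu, 3))
    seconds, size = pickling_cost([{"n_iter": 1_000_000}] * 4)
    print("pickle seconds:", seconds, "bytes:", size)
```

To use this with pool.map, functools.partial(profiled, evaluator) can be passed in place of evaluator, provided evaluator is defined at module level so it can be pickled. CPU time close to wall time suggests the worker was busy computing; CPU time well below wall time suggests it was waiting; CPU time well above wall time suggests the evaluator itself runs several threads, because time.process_time sums CPU time over all threads of the process.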
