
My code has one heavy nested loop that I need to speed up. How can I implement multiprocessing for code like this? (a is typically of size 2 and l up to 10)

for x1 in range(a**l):
    for x2 in range(a**l):
        for x3 in range(a**l):
            output[x1, x2, x3] = HeavyComputationThatIsThreadSafe1(x1, x2, x3)
  • ShadowRanger's comment on your other question still stands - all the threads in the world are not going to make much of a dent if you're committed to calling HeavyComputationThatIsThreadSafe1 over a billion times. How many seconds does a single call to HeavyComputationThatIsThreadSafe1 take? Take that number, multiply it by 1073741824 and divide by the number of cores you have. That gives you the absolute best-case scenario runtime you could achieve with multiprocessing. Commented May 8, 2016 at 1:24
  • I addressed the performance problems with HeavyComputationThatIsThreadSafe1 in the original question you linked to. Even with the data size you mention, it only takes ~8 GB of memory and 45 s to go over all three nested loops, if you take some reasonable optimization steps. Commented May 8, 2016 at 13:58

1 Answer


If the HeavyComputationThatIsThreadSafe1 function only uses arrays and not Python objects, I would use a concurrent.futures ThreadPoolExecutor (or the Python 2 backport) along with Numba (or Cython) with the GIL released. Otherwise use a ProcessPoolExecutor.

See:

http://numba.pydata.org/numba-doc/latest/user/examples.html#multi-threading
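
A minimal sketch of that route, assuming the heavy work can be expressed as a nopython-compatible Numba function so the GIL can be released; fill_slice and its body are placeholders for the real computation, not your actual function:

import concurrent.futures
import numpy as np
from numba import njit

a, l = 2, 10
n = a ** l

@njit(nogil=True)  # releasing the GIL lets the threads run in parallel
def fill_slice(out_slice, x1, n):
    # fill one (n, n) slab of the output for a fixed x1
    for x2 in range(n):
        for x3 in range(n):
            out_slice[x2, x3] = (x1 + x2 + x3) % 7  # placeholder computation

output = np.empty((n, n, n), dtype=np.float64)  # ~8 GB for a=2, l=10

with concurrent.futures.ThreadPoolExecutor() as ex:
    # one task per value of the outermost index; each thread writes its own slab
    futures = [ex.submit(fill_slice, output[x1], x1, n) for x1 in range(n)]
    concurrent.futures.wait(futures)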

You'd want to parallelize the calculation at the level of the outermost loop and then fill output from the chunks returned by each thread/process. This assumes the cost of doing so is much cheaper than the computation itself, which should be the case.
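
A rough sketch of the ProcessPoolExecutor variant, again parallelizing over the outermost loop; heavy_computation and compute_slab are hypothetical stand-ins for HeavyComputationThatIsThreadSafe1 and whatever wrapper you write around it:

import concurrent.futures
import numpy as np

a, l = 2, 10
n = a ** l

def heavy_computation(x1, x2, x3):
    # stand-in for HeavyComputationThatIsThreadSafe1
    return (x1 * x2 + x3) % 11

def compute_slab(x1):
    # compute one (n, n) slab for a fixed x1 inside a worker process
    slab = np.empty((n, n), dtype=np.float64)
    for x2 in range(n):
        for x3 in range(n):
            slab[x2, x3] = heavy_computation(x1, x2, x3)
    return x1, slab

if __name__ == "__main__":
    output = np.empty((n, n, n), dtype=np.float64)
    with concurrent.futures.ProcessPoolExecutor() as ex:
        # fill output from the chunk each worker sends back
        for x1, slab in ex.map(compute_slab, range(n)):
            output[x1] = slab

Each task returns a whole slab, so the per-task pickling overhead is amortized over n*n calls to the heavy function rather than paid per element.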
