
I attempted to speed up my Python program using the multiprocessing module, but I found it was quite slow. A toy example is as follows:

```python
import time
from multiprocessing import Pool, Manager

class A:
    def __init__(self, i):
        self.i = i

    def score(self, x):
        return self.i - x

class B:
    def __init__(self):
        self.i_list = list(range(1000))
        self.A_list = []

    def run_1(self):
        for i in self.i_list:
            self.x = i
            map(self.compute, self.A_list)  # map version
            self.A_list.append(A(i))

    def run_2(self):
        p = Pool()
        for i in self.i_list:
            self.x = i
            p.map(self.compute, self.A_list)  # multicore version
            self.A_list.append(A(i))

    def compute(self, some_A):
        return some_A.score(self.x)

if __name__ == "__main__":
    st = time.time()
    foo = B()
    foo.run_1()
    print("Map: ", time.time() - st)
    st = time.time()
    foo = B()
    foo.run_2()
    print("MultiCore: ", time.time() - st)
```

The outcome on my computer (Windows 10, Python 3.5) is:

Map: 0.0009996891021728516

MultiCore: 19.34994912147522

Similar results can be observed on a Linux machine (CentOS 7, Python 3.6).
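One detail worth noting about the near-zero Map timing: in Python 3, map returns a lazy iterator, so run_1 never actually calls compute at all. A small sketch (using a hypothetical slow_double stand-in for real work) shows the difference forcing evaluation makes:

```python
import time

def slow_double(v):
    time.sleep(0.001)  # stand-in for real work
    return v * 2

items = list(range(20))

st = time.time()
m = map(slow_double, items)             # lazy in Python 3: nothing runs yet
lazy_elapsed = time.time() - st

st = time.time()
result = list(map(slow_double, items))  # list() forces evaluation
forced_elapsed = time.time() - st

print("lazy:  ", lazy_elapsed)
print("forced:", forced_elapsed)
```

So the run_1 timing measures only the cost of building iterators and appending to a list, not the cost of the score calls.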

I guess it was caused by the pickling/unpickling of objects between processes? I tried to use the Manager class but failed to get it to work.

Any help will be appreciated.

  • It's not map that is slow; it seems to be the self.A_list.append(A(i)). Also, you seem to use map incorrectly: it returns a value, and you are not using it at all. Do you know what map is doing? Commented Jul 19, 2018 at 9:58
  • Thanks for commenting. Both map and append are fast, as the first timing result shows. By contrast, p.map (the multiprocessing version) is slow. I did not use the return value of map because this is just an example to show the poor performance of p.map, and I did not need the return value. Commented Jul 19, 2018 at 10:30

2 Answers


Wow that's impressive (and slow!).

Yes, this is because the objects must be shipped to the worker processes on every call, which is costly.

So I played with it a little and managed to gain a lot of performance by making the compute function static: that way, you no longer need to share the whole B instance with the workers. Still very slow, but better.

```python
import time
from multiprocessing import Pool

class A:
    def __init__(self, i):
        self.i = i

    def score(self, x):
        return self.i - x

x = 0

def static_compute(some_A):
    res = some_A.score(x)
    return res

class B:
    def __init__(self):
        self.i_list = list(range(1000))
        self.A_list = []

    def run_1(self):
        global x
        for i in self.i_list:
            x = i
            map(static_compute, self.A_list)  # map version
            self.A_list.append(A(i))

    def run_2(self):
        global x
        p = Pool(4)
        for i in self.i_list:
            x = i
            # note: each worker process has its own copy of the
            # module-level x, so updates here are not seen by workers
            p.map(static_compute, self.A_list)  # multicore version
            self.A_list.append(A(i))
```

The other reason it is slow, to me, is the fixed cost of using Pool. You're actually launching Pool.map 1000 times, and if there is a fixed cost associated with each of those calls, that alone makes the overall strategy slow. You could test that with a longer A_list (longer than the i_list, which requires a different algorithm).


1 Comment

Thanks. The static function is much faster. Though the gain is not compelling in this toy example, I think it will be helpful in my real application.

The reasoning behind this is:

  1. When foo.run_1() is called, the map call is performed by the main process itself: it does all the work directly, much like telling yourself what to do.

  2. When foo.run_2() is called, the main process distributes the work across a pool of worker processes, by default one per CPU. If your machine supports 6 processes, the main process has to coordinate 6 workers, much like organizing 6 people to report back to you, and that coordination has overhead.

Side Note: if you use:

```python
p.imap(self.compute, self.A_list)
```

the results will be yielded lazily, in the same order as A_list.

