4

I wrote the following code:

```python
from multiprocessing import Pool


class A:
    def __init__(self) -> None:
        self.a = 1

    def incr(self, b):
        self.a += 1
        print(self.a)

    def execute(self):
        with Pool(2) as p:
            p.starmap(self.incr, [[1], [1], [1], [1]])


a = A()
a.execute()
print(a.a)
```

The output is 2 2 2 2 1. I want to understand what exactly happens in this scenario. Does the pool create four copies of self? If so, how is this copying done?

  • multiprocessing creates more than just a copy of an object; it effectively copies everything in your program, but how exactly it does that depends on the way the process is started and the operating system you're running on. See this question for example. Commented Jan 4, 2022 at 3:35
  • It creates multiple Python processes. Imagine you ran python myscript.py multiple times (the exact details can vary based on whether the processes use "spawn" or "fork"). Commented Jan 4, 2022 at 5:05
  • Why with Pool(2) as p? What happens if you use 4? Commented Jan 4, 2022 at 9:49
  • here: medium.datadriveninvestor.com/… Commented Jan 4, 2022 at 11:10

2 Answers

3

In this scenario, the file runs linearly until a.execute() is called and the Pool is created, just before the call to starmap. At that point multiprocessing creates two more processes. How they are created is constrained by the OS and can, in some cases, be selected using multiprocessing.set_start_method. The main difference between the methods is that with 'fork' the new processes are created by copying the current process (it is more complicated than that, but that's the idea), while with 'spawn' a new Python interpreter is started, the file is re-executed, and then the call is performed. In this latter case, because the file is re-executed, it is very important to use the if __name__ == "__main__": guard.
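To see which start methods are on offer on a given machine, a minimal sketch along these lines may help. It only queries the multiprocessing module and starts no processes; the variable names `available` and `current` are illustrative choices, not part of the original code.

```python
import multiprocessing as mp

# Start methods supported on this platform, e.g. 'fork', 'spawn', 'forkserver'.
available = mp.get_all_start_methods()

# allow_none=True reports the current choice without fixing a default;
# it is None if nothing has been selected yet.
current = mp.get_start_method(allow_none=True)

print("Available start methods:", available)
print("Currently selected:", current)

# A context object pins a start method for the pools and processes created
# from it, without changing the global setting.
for name in available:
    ctx = mp.get_context(name)
    print(name, "->", type(ctx).__name__)
```

A context obtained this way can be used as `ctx.Pool(2)` in place of `Pool(2)`, which avoids calling set_start_method globally.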

Because each process has its own copy of a, what happens in one process does not affect the other copies: even though every copy ends up with a self.a value of 2, the original a is unchanged. You can test the different methods for starting processes with this piece of code (delays added for clarity):

```python
from multiprocessing import Pool, set_start_method, get_start_method
import os
from time import sleep

globalflag = "Flag"


class A:
    def __init__(self) -> None:
        self.a = 1

    def incr(self, b):
        sleep(b / 10)
        print("Process id:", os.getpid(), " incrementing")
        print("Flag = ", globalflag)
        self.a += 1
        print(self.a)

    def execute(self):
        with Pool() as p:
            p.starmap(self.incr, [[1], [2], [3], [4]])


print("Running module:", __name__)

if __name__ == "__main__":
    method = 'fork'  # or 'spawn' / 'forkserver'
    if get_start_method(allow_none=True) is None:
        print("Setting start method to ", method)
        set_start_method(method)
    else:
        print('Start method already set: ', get_start_method())

    a = A()
    globalflag = "Flog"
    a.execute()
    print(a.a)
```

Output with method='fork':

```
Running module: __main__
Start method already set: fork
Process id: 6687 incrementing
Flag = Flog
2
Process id: 6688 incrementing
Flag = Flog
2
Process id: 6689 incrementing
Flag = Flog
2
Process id: 6690 incrementing
Flag = Flog
2
1
```

Output with 'spawn':

```
Running module: __main__
Setting start method to spawn
Running module: __mp_main__
Running module: __mp_main__
Running module: __mp_main__
Running module: __mp_main__
Running module: __mp_main__
Process id: 7096 incrementing
Flag = Flag
2
Running module: __mp_main__
Process id: 7097 incrementing
Flag = Flag
2
Running module: __mp_main__
Process id: 7098 incrementing
Flag = Flag
2
Running module: __mp_main__
Process id: 7100 incrementing
Flag = Flag
2
1
```
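As for how the copy of a reaches a worker: Pool sends each task to the worker as a pickled message, and pickling the bound method self.incr also pickles the instance it is bound to. The sketch below is only an in-process analogy of that round trip, with no pool involved; `payload`, `results`, and `worker_side_incr` are hypothetical names, and incr returns the value instead of printing it so that it can be collected. It mirrors the four small tasks from the question, where each task carries its own copy.

```python
import pickle


class A:
    def __init__(self) -> None:
        self.a = 1

    def incr(self, b):
        self.a += 1
        return self.a


original = A()

# Pickling a bound method also pickles the instance it is bound to.
payload = pickle.dumps(original.incr)

# Each load rebuilds a fresh, independent A, much as a worker would
# when it receives the task.
results = []
for _ in range(4):
    worker_side_incr = pickle.loads(payload)
    results.append(worker_side_incr(1))

print(results)     # [2, 2, 2, 2]: every copy started from a == 1
print(original.a)  # 1: the original instance was never touched
```

If the updated value is needed in the parent, one option is to have the method return it, as above, and read the list that starmap returns.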

-2

Try running the code below, replacing with Pool(2) as p by each of:

with Pool(1) as p, with Pool(2) as p, with Pool(4) as p, with Pool(8) as p

and compare the outputs.

```python
from multiprocessing import Pool
import os

print('PID ;', os.getpid())  # get the current process id or PID


class A:
    def __init__(self) -> None:
        self.a = 1

    def incr(self, b):
        self.a += 1
        print(self.a)
        print('PID ;', os.getpid())

    def execute(self):
        with Pool(2) as p:
            p.starmap(self.incr, [[1], [1], [1], [1]])


a = A()
a.execute()
print(a.a)
print('PID ;', os.getpid())
```

Remember: os.getpid() returns the current process id.
