The code:
    import multiprocessing
    print(f'num cpus {multiprocessing.cpu_count():d}')
    import sys; print(f'Python {sys.version} on {sys.platform}')


    def _process(m):
        print(m) #; return m
        raise ValueError(m)


    args_list = [[i] for i in range(1, 20)]

    if __name__ == '__main__':
        with multiprocessing.Pool(2) as p:
            print([r for r in p.starmap(_process, args_list)])

prints:
    num cpus 8
    Python 3.7.1 (v3.7.1:260ec2c36a, Oct 20 2018, 03:13:28)
    [Clang 6.0 (clang-600.0.57)] on darwin
    1
    7
    4
    10
    13
    16
    19
    multiprocessing.pool.RemoteTraceback:
    """
    Traceback (most recent call last):
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 121, in worker
        result = (True, func(*args, **kwds))
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 47, in starmapstar
        return list(itertools.starmap(args[0], args[1]))
      File "/Users/ubik-mac13/Library/Preferences/PyCharm2018.3/scratches/multiprocess_error.py", line 8, in _process
        raise ValueError(m)
    ValueError: 1
    """

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "/Users/ubik-mac13/Library/Preferences/PyCharm2018.3/scratches/multiprocess_error.py", line 18, in <module>
        print([r for r in p.starmap(_process, args_list)])
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 298, in starmap
        return self._map_async(func, iterable, starmapstar, chunksize).get()
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 683, in get
        raise self._value
    ValueError: 1

    Process finished with exit code 1

Increasing the number of processes in the pool to 3 or 4 prints all the odd numbers (possibly out of order):
    1
    3
    5
    9
    11
    7
    13
    15
    17
    19

With 5 or more processes, it prints every number in the range 1-19. So what happens here? Do the processes crash after a number of failures?
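One way to check whether the processes are really crashing is to compare the printed numbers with how the pool might batch the inputs. The sketch below is not taken from the code above. It assumes that `starmap` without an explicit `chunksize` splits the argument list into chunks of roughly `len(args) / (4 * processes)` items, which is what CPython's `pool.py` appears to do, and that an exception abandons the rest of its chunk rather than killing the worker. The helper names `default_chunksize` and `first_of_each_chunk` are made up for illustration.

```python
def default_chunksize(n_tasks, n_workers):
    # Assumption: mirrors the default heuristic CPython's Pool uses for
    # map/starmap when no chunksize is passed (ceil of n_tasks / (4 * n_workers)).
    chunksize, extra = divmod(n_tasks, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


def first_of_each_chunk(args, n_workers):
    # If every call raises, only the first item of each chunk ever runs,
    # because the exception aborts the remainder of that chunk.
    size = default_chunksize(len(args), n_workers)
    if size == 0:
        return []
    return [args[i] for i in range(0, len(args), size)]


numbers = list(range(1, 20))
for n_workers in (2, 3, 4, 5):
    print(n_workers, first_of_each_chunk(numbers, n_workers))
```

Under these assumptions, 2 processes give chunks of 3 (first items 1, 4, 7, ...), 3 or 4 processes give chunks of 2 (the odd numbers), and 5 processes give chunks of 1 (everything), which lines up with the outputs above. If that is the cause, passing `chunksize=1` to `starmap` should make every number print regardless of pool size.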
This is a toy example, of course, but it comes from a real issue I had: after leaving a multiprocessing pool running for several days, the CPU use steadily went down, as if some processes had been killed (CPU utilization went downhill on 03/04 and 03/06 while there were still lots of tasks to run).
When the code terminated, it presented me with one multiprocessing.pool.RemoteTraceback (one only, as here, although there were many more processes). Bonus question: is this traceback random? In this toy example it is usually ValueError: 1, but sometimes other numbers appear. Does multiprocessing keep the first traceback from the first process that crashes?
