So I've started trying my hand at Python's multiprocessing library. My goal was to speed up a slow function that compared a string against a large database of other strings and returned the most similar match. To do so, I attempted to write a function that split the task among different Process objects and set them running, using a shared variable to capture the results:

    cores = cpu_count() # Number of cores in this computer, i.e. 4
    # Split tasks into subprocessing arrays of evenly-sized chunks, the number
    # of chunks equal to how many cores we have to process them
    sublistList = chunks(tasks,cores)

    # Create a multiprocessing function, since this is a large function that
    # will take time and splitting it across cores will ease the load
    if __name__ == '__main__':
        freeze_support() # Make sure multiple applications don't spawn, this is necessary for Windows
        jobs = [] # Array of processes
        manager = Manager() # Create a manager
        returns = manager.list() # Shared list variable we use to get return results
        for i in range(0,cores): # For number of cores...
            p = Process(target=workerFunction,args=(w,sublistList[i],returns))
            jobs.append(p) # Add to array of processes to run
            p.start()
        for p in jobs:
            p.join()
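The chunks helper is not shown in the question, so its real behavior is unknown. A minimal sketch of what such a helper might look like, assuming tasks is a list and that chunk sizes may differ by at most one (the parameter names and the splitting strategy are guesses, not the asker's code):

```python
def chunks(tasks, n):
    """Split tasks into n contiguous chunks whose sizes differ by at most one.

    Hypothetical stand-in for the helper the question does not show.
    """
    size, extra = divmod(len(tasks), n)
    result = []
    start = 0
    for i in range(n):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        result.append(tasks[start:end])
        start = end
    return result
```

With fewer tasks than chunks, the trailing chunks are simply empty lists, so indexing sublistList[i] for every core would still work.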

However, when I run this code, it creates a new application window and then hangs indefinitely, which is completely bizarre behavior and not at all what I want. What could be causing this in my code? Is my worker function silently crashing and not alerting me? I have looked at a variety of other answers but none of the suggested answers seemed to fix this issue.

(If it's relevant to the question, I am an entry-level software engineer with a few years of programming experience in other languages, but am relatively new to Python. This is for a small indie game side project of mine.)

  • There isn't enough code in your question for anyone to answer it. Please edit it and provide an MCVE. See How to create a Minimal, Complete, and Verifiable Example. Commented Dec 6, 2018 at 6:58
  • The join method waits for the subprocess to finish one way or another. It seems your worker is running indefinitely. Commented Dec 6, 2018 at 7:19
  • It should not be the worker function silently crashing: if the subprocess crashes and exits, p.join() will return. Try debugging the worker function for an infinite loop or a hang. Commented Dec 6, 2018 at 7:21
  • Any task using multiprocessing must never depend on functionality that does not time out. You should also use try: except: to capture errors. Depending on your requirements, you might need to use asynchronous processing. Commented Dec 6, 2018 at 8:31
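The points in the comments above can be checked with a short experiment. The sketch below is not the asker's code: crashing_worker, hanging_worker, and describe_exit are hypothetical names made up for illustration. It relies only on Process.join(timeout=...), Process.exitcode, Process.is_alive(), and Process.terminate() from the standard library, and shows that a crashed worker lets join() return while a hung worker does not finish within the timeout.

```python
import multiprocessing as mp
import time


def describe_exit(exitcode):
    """Translate Process.exitcode into a human-readable status."""
    if exitcode is None:
        return "still running (possible hang)"
    if exitcode == 0:
        return "finished normally"
    if exitcode < 0:
        # A negative value -N means the child was terminated by signal N.
        return "killed by signal {}".format(-exitcode)
    return "exited with error code {}".format(exitcode)


def crashing_worker():
    # Hypothetical worker that fails right away with an uncaught exception.
    raise ValueError("simulated crash")


def hanging_worker():
    # Hypothetical worker that never finishes on its own.
    while True:
        time.sleep(0.1)


if __name__ == "__main__":
    mp.freeze_support()
    jobs = [mp.Process(target=crashing_worker),
            mp.Process(target=hanging_worker)]
    for p in jobs:
        p.start()
    for p in jobs:
        p.join(timeout=2)  # Wait at most 2 seconds instead of forever.
        print(p.name, "->", describe_exit(p.exitcode))
        if p.is_alive():
            p.terminate()  # Stop a worker that is still running.
            p.join()
```

If every worker in the real program reports "still running" after a generous timeout, the hang is probably inside workerFunction itself rather than in the process setup.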

1 Answer

This isn't an answer (yet), but I'm posting it to show you what a runnable Minimal, Complete, and Verifiable Example looks like.

The code is based on what's currently in your question, plus everything else that's missing to make it runnable. Not surprisingly, since all of those things are merely guesses, it doesn't reproduce the problem you say you're having—but that's likely due to one or more of my guesses being different in some important aspect...which is why you really should be the one providing all the code.

One observation: The p.join() calls at the end will make the main process wait for each subprocess to complete. This will cause the main process to appear to "hang" while waiting upon each one.

    from multiprocessing import *
    from time import sleep

    tasks = None

    def chunks(tasks, cores):
        return [[i for _ in range(8)] for i in range(cores)]

    def workerFunction(w, sublist, returns):
        print('starting workerFunction:', w)
        result = [value+100 for value in sublist]
        returns.append(result)
        sleep(3)
        print('exiting workerFunction:', w)

    if __name__ == '__main__':

        # Only do in main process.
        freeze_support()
        cores = cpu_count()
        sublistList = chunks(tasks, cores)
        manager = Manager()
        returns = manager.list()
        jobs = []

        for i in range(cores):
            w = i
            p = Process(target=workerFunction, args=(w, sublistList[i], returns))
            jobs.append(p)
            p.start()

        for i, p in enumerate(jobs, 1):
            print('joining job[{}]'.format(i))
            p.join()

        # Display results.
        for sublist in returns:
            print(sublist)

        print('done')

Output:

    joining job[1]
    starting workerFunction: 2
    starting workerFunction: 1
    starting workerFunction: 0
    starting workerFunction: 5
    starting workerFunction: 7
    starting workerFunction: 3
    starting workerFunction: 4
    starting workerFunction: 6
    exiting workerFunction: 2
    exiting workerFunction: 0
    exiting workerFunction: 1
    joining job[2]
    exiting workerFunction: 5
    joining job[3]
    joining job[4]
    exiting workerFunction: 7
    exiting workerFunction: 3
    exiting workerFunction: 4
    joining job[5]
    exiting workerFunction: 6
    joining job[6]
    joining job[7]
    joining job[8]
    [102, 102, 102, 102, 102, 102, 102, 102]
    [101, 101, 101, 101, 101, 101, 101, 101]
    [100, 100, 100, 100, 100, 100, 100, 100]
    [105, 105, 105, 105, 105, 105, 105, 105]
    [107, 107, 107, 107, 107, 107, 107, 107]
    [103, 103, 103, 103, 103, 103, 103, 103]
    [104, 104, 104, 104, 104, 104, 104, 104]
    [106, 106, 106, 106, 106, 106, 106, 106]
    done
    Press any key to continue . . .

1 Comment

It's been a few years and I've learned how to ask better questions on other parts of SE, so I'm circling back to tie up loose ends on old questions; thanks for the poke here about how to ask better code questions, it did help. Checkmarking this as helpful, even if it's been too long for me to verify that it solved the problem. Thanks :)
