Original Question
I am trying to use multiprocessing Pool in Python. This is my code:
    def f(x):
        return x

    def foo():
        p = multiprocessing.Pool()
        mapper = p.imap_unordered
        for x in xrange(1, 11):
            res = list(mapper(f, bar(x)))

This code makes use of all CPUs (I have 8 CPUs) when the range is small, like xrange(1, 6). However, when I increase the range to xrange(1, 10), I observe that only 1 CPU is running at 100% while the rest are idling. What could be the reason? Is it because, when I increase the range, the OS shuts down the CPUs due to overheating?
How can I resolve this problem?
Minimal, Complete, Verifiable Example
To replicate my problem, I have created this example. It's a simple problem: generating n-grams from a string.
    #!/usr/bin/python
    import time
    import itertools
    import threading
    import multiprocessing
    import random


    def f(x):
        return x


    def ngrams(input_tmp, n):
        input = input_tmp.split()
        if n > len(input):
            n = len(input)
        output = []
        for i in range(len(input)-n+1):
            output.append(input[i:i+n])
        return output


    def foo():
        p = multiprocessing.Pool()
        mapper = p.imap_unordered
        num = 100000000  #100
        rand_list = random.sample(xrange(100000000), num)
        rand_str = ' '.join(str(i) for i in rand_list)
        for n in xrange(1, 100):
            res = list(mapper(f, ngrams(rand_str, n)))


    if __name__ == '__main__':
        start = time.time()
        foo()
        print 'Total time taken: '+str(time.time() - start)

When num is small (e.g., num = 10000), I find that all 8 CPUs are utilised. However, when num is substantially larger (e.g., num = 100000000), only 2 CPUs are used and the rest are idling. This is my problem.
Caution: When num is too large it may crash your system/VM.
[...] bar return?
bar returns an igraph object and x determines the depth of the graph. Will that have any effect on this?
[...] igraph is big enough, it may be spending more time pickling the parameters and results and pushing them over the pipes than doing actual work (especially if your actual work is just return x!), which would definitely serialize everything to one CPU. And it's certainly plausible that a graph of depth 5 wouldn't have this problem, but a graph of depth 9 would.
[...] multiprocessing overhead, and 0% of your time doing actual work. In 2.7, the overhead is serialized, so everything happens on one core. In 3.4, the overhead is parallelized, so your other cores have a bit of work to do, but it's still just overhead (useless work).
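One rough way to check the pickling explanation from the comments is to time a pickle round trip for each task against the time spent calling f on it. The sketch below is only an approximation of what Pool does internally, not its actual code path. It is written for Python 3 (the question uses Python 2), and measure_overhead is a hypothetical helper name introduced here for illustration. f and ngrams follow the question's definitions.

```python
import pickle
import time


def f(x):
    # Same no-op worker as in the question.
    return x


def ngrams(text, n):
    # Same idea as the question's ngrams(): sliding windows over the tokens.
    tokens = text.split()
    n = min(n, len(tokens))
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)]


def measure_overhead(tasks):
    """Return (pickle_seconds, work_seconds) for a list of tasks.

    Hypothetical helper: the pickle round trip only approximates the cost
    of shipping each argument to a worker process and the result back.
    """
    start = time.perf_counter()
    for task in tasks:
        pickle.loads(pickle.dumps(task, pickle.HIGHEST_PROTOCOL))
    pickle_seconds = time.perf_counter() - start

    start = time.perf_counter()
    for task in tasks:
        f(task)
    work_seconds = time.perf_counter() - start
    return pickle_seconds, work_seconds


if __name__ == '__main__':
    # A much smaller input than in the question, so this runs quickly.
    text = ' '.join(str(i) for i in range(100000))
    tasks = ngrams(text, 3)
    pickle_seconds, work_seconds = measure_overhead(tasks)
    print('pickle round trip: %.3f s' % pickle_seconds)
    print('calling f:         %.3f s' % work_seconds)
```

If the first number is far larger than the second, the workers have almost nothing to do compared with the cost of moving data between processes, which would be consistent with the comments above.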