
I'm trying out a code snippet from the standard Python documentation to learn how to use the multiprocessing module. The code is pasted at the end of this message. I'm using Python 2.7.1 on Ubuntu 11.04 on a quad-core machine (which, according to the system monitor, gives me eight cores due to hyper-threading).

Problem: All of the workload seems to be scheduled to just one core, which gets close to 100% utilization, even though several processes are started. Occasionally the entire workload migrates to another core, but it is never distributed among them.

Any ideas why this is so?

Best regards,

Paul

```python
#
# Simple example which uses a pool of workers to carry out some tasks.
#
# Notice that the results will probably not come out of the output
# queue in the same order as the corresponding tasks were
# put on the input queue. If it is important to get the results back
# in the original order then consider using `Pool.map()` or
# `Pool.imap()` (which will save on the amount of code needed anyway).
#
# Copyright (c) 2006-2008, R Oudkerk
# All rights reserved.
#

import time
import random

from multiprocessing import Process, Queue, current_process, freeze_support

#
# Function run by worker processes
#

def worker(input, output):
    for func, args in iter(input.get, 'STOP'):
        result = calculate(func, args)
        output.put(result)

#
# Function used to calculate result
#

def calculate(func, args):
    result = func(*args)
    return '%s says that %s%s = %s' % \
        (current_process().name, func.__name__, args, result)

#
# Functions referenced by tasks
#

def mul(a, b):
    time.sleep(0.5*random.random())
    return a * b

def plus(a, b):
    time.sleep(0.5*random.random())
    return a + b

def test():
    NUMBER_OF_PROCESSES = 4
    TASKS1 = [(mul, (i, 7)) for i in range(500)]
    TASKS2 = [(plus, (i, 8)) for i in range(250)]

    # Create queues
    task_queue = Queue()
    done_queue = Queue()

    # Submit tasks
    for task in TASKS1:
        task_queue.put(task)

    # Start worker processes
    for i in range(NUMBER_OF_PROCESSES):
        Process(target=worker, args=(task_queue, done_queue)).start()

    # Get and print results
    print 'Unordered results:'
    for i in range(len(TASKS1)):
        print '\t', done_queue.get()

    # Add more tasks using `put()`
    for task in TASKS2:
        task_queue.put(task)

    # Get and print some more results
    for i in range(len(TASKS2)):
        print '\t', done_queue.get()

    # Tell child processes to stop
    for i in range(NUMBER_OF_PROCESSES):
        task_queue.put('STOP')

test()
```
  • This post might be of assistance to you: stackoverflow.com/questions/5784389/… Commented Aug 1, 2011 at 22:34
  • Copied and pasted your code; maxed out an Intel Pentium D 3.4 GHz in two of two processor screens. Commented Aug 1, 2011 at 22:41
  • With those sleep()s in there, this will not generate much CPU load at all. Commented Aug 1, 2011 at 23:06
  • True. I've removed the sleeps and increased the number of jobs to 10000, but the workload is still never distributed among the cores. If I start 4 processes, I get three sleeping processes and one fully utilized one. Thanks for helping out, guys, but I couldn't get much out of Devraj's link. I understand that starting 4 processes is no guarantee that they will be divided among the cores, but the cause of this polarized behavior, where all processes except one are sleeping, is not clear to me. Commented Aug 1, 2011 at 23:23
  • Still, with 10000 jobs, isn't the work done here gone pretty quickly? When I do some measuring on this program (be sure to remove all the loops that print something; you do not want to measure printing to the screen), it spends an awful lot of time doing the list comprehension and stuffing things onto the task_queue. That takes 100% CPU of one processor for a while. When you actually start the workers, things start to use the other processors, but with only 10000 items that'll be done in just a second or two. Try 200000. Commented Aug 2, 2011 at 8:56
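To make that measuring advice concrete, here is a minimal sketch (the names `busy` and `compare` are mine, not from the thread) that times the same CPU-bound task list run serially and then through a `multiprocessing.Pool`:

```python
import time
from multiprocessing import Pool

def busy(n):
    # CPU-bound stand-in for real work: no sleeping, so each
    # worker actually keeps a core occupied
    total = 0
    for i in range(n):
        total += i * i
    return total

def compare(n_tasks, size, workers):
    # Run the same task list serially and through a pool,
    # returning both result lists and the two wall-clock times.
    args = [size] * n_tasks
    t0 = time.time()
    serial = [busy(a) for a in args]
    t1 = time.time()
    pool = Pool(workers)
    try:
        parallel = pool.map(busy, args)
    finally:
        pool.close()
        pool.join()
    t2 = time.time()
    return serial, parallel, t1 - t0, t2 - t1

if __name__ == '__main__':
    serial, parallel, t_serial, t_pool = compare(200, 50000, 4)
    print('serial: %.2fs  pool: %.2fs' % (t_serial, t_pool))
```

With the sleeps gone and enough work per task, the pool time should drop well below the serial time on a multi-core machine; with small task counts, most of the wall time goes to building and queueing the task list, as the comment above points out.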

4 Answers


Try replacing the time.sleep with something that actually requires CPU, and you will see that multiprocessing works just fine! For example:

```python
def mul(a, b):
    for i in xrange(100000):
        j = i**2
    return a * b

def plus(a, b):
    for i in xrange(100000):
        j = i**2
    return a + b
```
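The comments in the original snippet already suggest `Pool.map()` as the shorter route. Here is a hedged sketch of the same CPU-bound version using a pool (function and argument names are illustrative; `range` is used so it runs under Python 3 as well):

```python
from multiprocessing import Pool

def mul(args):
    # Pool.map passes a single argument, so the (a, b) pair
    # is unpacked here; the busy loop replaces time.sleep
    a, b = args
    for i in range(100000):
        j = i ** 2
    return a * b

if __name__ == '__main__':
    pool = Pool(4)
    try:
        results = pool.map(mul, [(i, 7) for i in range(500)])
    finally:
        pool.close()
        pool.join()
    print(results[:5])  # first five products: [0, 7, 14, 21, 28]
```

Because every task burns CPU instead of sleeping, the pool's workers should all show up busy in the system monitor.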



Somehow the CPU affinity has been changed. I had this problem with numpy before. I found the solution here: http://bugs.python.org/issue17038#msg180663
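If affinity really is the culprit, this is a sketch of resetting it; the helper name is mine. `os.sched_setaffinity` is Linux-only and exists from Python 3.3; the workaround cited in the linked report for older Pythons shells out to taskset instead, e.g. `os.system("taskset -p 0xff %d" % os.getpid())`:

```python
import os

def allow_all_cpus():
    # Re-enable every CPU for the current process (pid 0 = self).
    # os.sched_setaffinity exists only on Linux, Python 3.3+;
    # elsewhere this falls back to just reporting the CPU count.
    cpus = os.cpu_count() or 1
    if hasattr(os, 'sched_setaffinity'):
        os.sched_setaffinity(0, range(cpus))
        return len(os.sched_getaffinity(0))
    return cpus
```

Calling this before starting the worker processes widens the affinity mask, so children forked afterwards inherit access to all cores.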



multiprocessing does not mean you'll use all the cores of a processor; you just get multiple processes, not multi-core processes. Placement on cores is handled by the OS and is uncertain. The question @Devraj posted in the comments has answers to accomplish what you desire.
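That said, it is easy to verify that distinct worker processes really are started. A small sketch (names are mine) that records which PID handled each task:

```python
import os
from multiprocessing import Pool

def worker_pid(_):
    # each pool worker is a separate OS process with its own pid
    return os.getpid()

if __name__ == '__main__':
    pool = Pool(4)
    try:
        pids = set(pool.map(worker_pid, range(100)))
    finally:
        pool.close()
        pool.join()
    print('distinct worker processes: %d' % len(pids))
```

How those processes land on cores is then up to the OS scheduler; seeing one core pegged usually means the tasks are too short or the program is bottlenecked on queueing, not that only one process exists.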



I have found a workaround using Parallel Python. I know this is not a solution using only the basic Python libraries, but the code is simple and it works like a charm.


I found some good examples of how to set up an SMP machine here: parallelpython.com/content/view/17/31
