1

My question is related to parallelizing a python code and I want to know how we can run a function for different instances of a class to decrease the runtime.

What I have: I have multiple instances of a class A (stored in a list called instances). This class has a function add. Now, we have multiple independent tasks, one for each instance of class A where the input to all these tasks is one thing (number n in my example). Each instance needs to apply function add to n and return a number. We want to store the returned numbers of all instances in a list (list results in my example).

What I want: As you can see, in this example, the tasks can be parallelized as there is no need for one to wait for the other one to gets done. How can we parallelize the simple code below? Since nothing is shared between the different instances, I guess we can even use multithreading, right? Or the only way is to use multiprocessing?

class A(object): def __init__(self, q): self.p = q def add(self, num): return self.p + num instances = [] for i in xrange(5): instances.append(A(i)) n = 20 results = [] for inst in instances: results.append(inst.add(n)) print(results) 

Output: [20, 21, 22, 23, 24]

4
  • For problems this simple it doesn't make any sense at all to use multiple threads. If you're having a specific problem, please describe it. This rather broad question with toy code is not really useful. Commented Jul 17, 2018 at 20:34
  • @moooeeeep I have a very complicated task (for each instance of the class to accomplish) but if I know how to use multithreading for this simple example, I should be able to use it for my task. Commented Jul 17, 2018 at 20:37
  • Related: stackoverflow.com/q/2846653/1025391 and stackoverflow.com/q/1294382/1025391 and stackoverflow.com/q/9038711/1025391 Commented Jul 17, 2018 at 20:37
  • @moooeeeep I have already seen the first link, thank though Commented Jul 17, 2018 at 20:40

1 Answer 1

2

The pattern that your toy code seems to follow would suggest to map a wrapper function to the list using a thread pool / process pool. The number of instances and the basic arithmetic operation that you want to apply for each instance however suggests that the overhead for parallelizing this would outweigh any potential benefit.

Whether it makes sense to do this, depends on the number of instances and the time it takes to run each of those member functions. So make sure to do at least some basic profiling of your code before you try to parallelize this. Find out whether the tasks you attempt to parallelize is CPU-bound or IO-bound.

Here's an example that should demonstrate the basic pattern:

# use multiprocessing.Pool for a processes-based worker pool # use multiprocessing.dummy.Pool for a thread-based worker pool from multiprocessing.dummy import Pool # make up list of instances l = [list() for i in range(5)] # function that calls the method on each instance def foo(x): x.append(20) return x # actually call functions and retrieve list of results p = Pool(3) results = p.map(foo, l) print(results) 

Obviously you need to fill the blanks to adapt this to your real code.

For further reading:

Also maybe have a look at futures:

If you really want to have this parallel, also consider to port your calculations to a GPU (you might need to move away from Python then).

Sign up to request clarification or add additional context in comments.

11 Comments

Can you write your answer based on my example? This example is not helping me as I don't have one function that I can apply to different tasks, instead I have one similar function inside different instances of a class.
@user491626 Of course, you need to come up with a function that does what you want specifically. Just replace my .append(20) with .add(20) to fit your example. This is the pattern, which I think you need. You just need to fill in the blanks.
It is not just that. You function takes l as input. I mean you have modified the function. I don't wan to change my class (if it is possible)
@user491626 Then it seems I don't know what you are actually asking about.
Can you take a look the updated question? I guess I can use multithreading. Can you also add how we can use multithreading in your answer?
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.