Python: Why is threaded function slower than non thread

Question

Hello I'm trying to calculate the first 10000 prime numbers.

I'm doing this first non threaded and then splitting the calculation in 1 to 5000 and 5001 to 10000. I expected that the use of threads makes it significant faster but the output is like this:

 --------Results-------- Non threaded Duration: 0.012244000000000005 seconds Threaded Duration: 0.012839000000000017 seconds

There is in fact no big difference except that the threaded function is even a bit slower.

What is wrong?

This is my code:

import math from threading import Thread def nonThreaded(): primeNtoM(1,10000) def threaded(): t1 = Thread(target=primeNtoM, args=(1,5000)) t2 = Thread(target=primeNtoM, args=(5001,10000)) t1.start() t2.start() t1.join() t2.join() def is_prime(n): if n % 2 == 0 and n > 2: return False for i in range(3, int(math.sqrt(n)) + 1, 2): if n % i == 0: return False return True def primeNtoM(n,m): L = list() if (n > m): print("n should be smaller than m") return for i in range(n,m): if(is_prime(i)): L.append(i) if __name__ == '__main__': import time print("--------Nonthreaded calculation--------") nTstart_time = time.clock() nonThreaded() nonThreadedTime = time.clock() - nTstart_time print("--------Threaded calculation--------") Tstart_time = time.clock() threaded() threadedTime = time.clock() - Tstart_time print("--------Results--------") print ("Non threaded Duration: ",nonThreadedTime, "seconds") print ("Threaded Duration: ",threadedTime, "seconds")

Daniel Roseman · Accepted Answer · 2014-04-28 15:33:48Z

from: https://wiki.python.org/moin/GlobalInterpreterLock

In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython's memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)

This means: since this is CPU-intensive, and python is not threadsafe, it does not allow you to run multiple bytecodes at once in the same process. So, your threads alternate each other, and the switching overhead is what you get as extra time.

Honestly I don't know, but I read there's a multiprocessing module. It produces forks and, in unix environments, process forks are just a bit slower than thread forks (context switchs are quicker and lighter). However this would be an issue again in Windows

bgporter · Accepted Answer · 2014-04-28 15:55:54Z

You can use the multiprocessing module, which gives results like below:

('Non threaded Duration: ', 0.016599999999999997, 'seconds') ('Threaded Duration: ', 0.007172000000000005, 'seconds')

...after making just these changes to your code (changing 'Thread' to 'Process'):

import math #from threading import Thread from multiprocessing import Process def nonThreaded(): primeNtoM(1,10000) def threaded(): #t1 = Thread(target=primeNtoM, args=(1,5000)) #t2 = Thread(target=primeNtoM, args=(5001,10000)) t1 = Process(target=primeNtoM, args=(1,5000)) t2 = Process(target=primeNtoM, args=(5001,10000)) t1.start() t2.start() t1.join() t2.join()

By spawning actual OS processes instead of using in-process threading, you eliminate the GIL issues discussed in @Luis Masuelli's answer.

multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.

But then I'm creating a new process instead of a thread or am I wrong?
Exactly -- by creating multiple processes you can e.g. run your code on multiple cores and not be affected by the global interpreter lock.
remember that, in unix environments, doing a context-switch between processes is not prohibitively harder than doing a thread-switch in the same process. PCBs are light-weight in unix (but not in windows, so it would be again an issue in windows)
Ok but in fact that's not the solution I try to achieve. So which kind of code would be suitable for a thread?
@AzzUrr1 You simply won't be able to get a satisfactory solution using threads in Python. You have to use multiple processes.

Collectives™ on Stack Overflow

Python: Why is threaded function slower than non thread

2 Answers 2

2 Comments

6 Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

2 Comments

6 Comments

Linked

Related