
I am designing a multithreaded routine in Python where one function adds items to a queue and several helper threads concurrently pop items off the queue until it is empty. In theory this should be much faster than a single-threaded implementation, but in both the real application and the toy example I designed for this question it is not. My guess is some synchronization issue with the Queue object (which is thread-safe according to the Python documentation), but that is only a guess. Any help with general optimizations is appreciated!

And the code for the toy example:

    from Queue import Queue
    import time

    queue = Queue(maxsize=0)
    counter = []

    #called by an external main method; adds and removes from global queue in a single thread
    def single_thread():
        fillit()
        time_origin = "%.10f" % time.time()
        while not queue.empty():
            queue.get()
        time_end = "%.10f" % time.time()
        time_t = float(time_end) - float(time_origin)
        print "time of single threaded implementation: " + str(time_t) + "\n"

    #called by an external main method; adds to queue and removes from queue in multiple threads
    def multi_thread():
        fillit()
        time_origin = "%.10f" % time.time()
        spawn_threads(4)
        time_end = "%.10f" % time.time()
        time_t = float(time_end) - float(time_origin)
        print "time of multi threaded implementation: " + str(time_t) + "\n"

    #Fills up the queue with 2^19 elements
    def fillit():
        for i in range(2 ** 19):
            queue.put(i)

    #Spawns n helper threads to help empty the queue
    def spawn_threads(num_threads):
        for i in range(num_threads):
            counter.append(0)
            thread = myThread(i, "Thread-" + str(i))
            thread.setDaemon(True)
            thread.start()
        while not queue.empty():
            continue
        print "done with threads " + str(counter) + " elements removed!"

    #THREADING SUPPORT CODE
    import threading

    class myThread(threading.Thread):
        def __init__(self, threadID, name):
            threading.Thread.__init__(self)
            self.threadID = threadID
            self.name = name

        def run(self):
            #each thread continues to empty the queue while it still has elements
            while not queue.empty():
                queue.get()
                global counter
                counter[self.threadID] += 1

Results below:

    time of single threaded implementation: 1.51300001144

    done with threads [131077, 131070, 131071, 131070] elements removed!
    time of multi threaded implementation: 7.77100014687
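As an aside, the `while not queue.empty(): queue.get()` pattern races: a thread can see the queue as non-empty, lose the last item to another thread, and then block forever inside `get()` (the daemon flag merely hides this at exit). A minimal sketch of a safer drain, written in Python 3 (where the module is `queue` rather than `Queue`), using `get_nowait()` plus `task_done()`/`join()`:

```python
import queue
import threading

q = queue.Queue()
for i in range(1000):
    q.put(i)

def worker():
    while True:
        try:
            q.get_nowait()  # never blocks, so no thread can hang on a drained queue
        except queue.Empty:
            return
        q.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
q.join()  # returns once every item put on the queue has been marked task_done()
for t in threads:
    t.join()
```

`q.join()` replaces the busy-wait spin loop in `spawn_threads`, which otherwise burns a full CPU core just polling `empty()`.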
  • Why are you casting the start and end time to strings, just to cast them back to floats to calculate the time elapsed? Commented Nov 23, 2016 at 17:00
  • @BrendanAbel What you say is valid for casting in the start times. That's certainly not related to the slowdown I'm experiencing though. Commented Nov 23, 2016 at 17:06
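As the comment notes, the string round-trip is unnecessary: `time.time()` already returns a float, and on Python 3 `time.perf_counter()` is the usual choice for benchmarking. A minimal sketch (the summed range is just a stand-in for the work being timed):

```python
import time

start = time.perf_counter()       # already a float; no "%.10f" formatting needed
total = sum(range(10 ** 6))       # stand-in for the work being timed
elapsed = time.perf_counter() - start
print("elapsed: %.6f s" % elapsed)
```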

1 Answer


What you are seeing here is the Python GIL in action. The GIL ensures that only one thread per process executes Python bytecode at a time, which makes multithreaded tasks that are not I/O-bound slower instead of faster (see this link). If you want real concurrency, look at multiprocessing instead of multithreading. Be aware, though, that multiprocessing has more overhead: there is no shared state, and spinning up processes is more costly than spinning up threads.


2 Comments

Pycharm (the ide I'm using) compiles with the --multiproc flag. Does this affect threading across multiple processes or is the GIL still a limiting factor? What module/platform would you suggest for getting the behavior I want?
CPython threads cannot run Python bytecode in parallel because of the GIL, so no IDE flag changes that; for CPU-bound parallelism you need the multiprocessing module
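Worth noting that threads do still help for I/O-bound work, because CPython releases the GIL during blocking calls. A minimal sketch with `time.sleep` standing in for blocking I/O:

```python
import threading
import time

def io_task():
    time.sleep(0.2)  # stands in for blocking I/O; the GIL is released while sleeping

start = time.time()
threads = [threading.Thread(target=io_task) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
# the four sleeps overlap, so elapsed is roughly 0.2 s rather than 0.8 s
```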
