I was reading an article on Python multithreading using queues and have a basic question.
Based on the print statements, 5 threads are started as expected. So how does the queue work?
1. Each thread is started initially. When the queue is populated with an item, does the thread get restarted and start processing that item?
2. If we use a queue and the threads process the items in the queue one at a time, how is there an improvement in performance? Is that not the same as serial processing, i.e. one by one?
import Queue
import threading
import urllib2
import datetime
import time

hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
         "http://ibm.com", "http://apple.com"]

queue = Queue.Queue()

class ThreadUrl(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        print 'threads are created'
        self.queue = queue

    def run(self):
        while True:
            #grabs host from queue
            print 'thread starting to run'
            now = datetime.datetime.now()
            host = self.queue.get()

            #grabs urls of hosts and prints first 20 bytes of page
            url = urllib2.urlopen(host)
            print 'host=%s ,threadname=%s' % (host, self.getName())
            print url.read(20)

            #signals to queue job is done
            self.queue.task_done()

start = time.time()

if __name__ == '__main__':
    #spawn a pool of threads, and pass them queue instance
    print 'program start'
    for i in range(5):
        t = ThreadUrl(queue)
        t.setDaemon(True)
        t.start()

    #populate queue with data
    for host in hosts:
        queue.put(host)

    #wait on the queue until everything has been processed
    queue.join()
    print "Elapsed Time: %s" % (time.time() - start)
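
To make the two questions concrete, here is a minimal sketch of the same pattern with the network call taken out. It is not the article's code: `fake_fetch`, `DELAY`, `NUM_WORKERS`, `run_threaded` and `run_serial` are hypothetical names, and `time.sleep` is only an assumed stand-in for a slow download. The import falls back to the Python 2 module name `Queue` if `queue` is not available. It illustrates that a worker is not restarted: it stays blocked inside `get()` until an item arrives, and since each of the 5 workers can be waiting on its own item at the same time, the waits overlap instead of adding up.

```python
import threading
import time

try:
    import queue            # Python 3 module name
except ImportError:
    import Queue as queue   # Python 2 module name

NUM_WORKERS = 5   # same pool size as in the question
DELAY = 0.2       # assumed stand-in for one slow download, in seconds


def fake_fetch(host):
    """Hypothetical replacement for urlopen(): just waits, then returns."""
    time.sleep(DELAY)
    return host


def worker(q, results):
    while True:
        # The thread is NOT restarted for each item. It blocks here,
        # idle, until some other thread puts an item on the queue.
        host = q.get()
        fake_fetch(host)
        # Record which thread handled which host (list.append is
        # thread-safe in CPython).
        results.append((host, threading.current_thread().name))
        q.task_done()   # tells q.join() that this item is finished


def run_threaded(hosts):
    q = queue.Queue()
    results = []
    for _ in range(NUM_WORKERS):
        t = threading.Thread(target=worker, args=(q, results))
        t.daemon = True   # let the program exit while workers block in get()
        t.start()
    start = time.time()
    for host in hosts:
        q.put(host)       # wakes up one blocked worker per item
    q.join()              # returns once task_done() was called for every item
    return time.time() - start, results


def run_serial(hosts):
    start = time.time()
    for host in hosts:
        fake_fetch(host)  # one after another, the waits add up
    return time.time() - start


if __name__ == '__main__':
    hosts = ["http://yahoo.com", "http://google.com", "http://amazon.com",
             "http://ibm.com", "http://apple.com"]
    threaded_time, results = run_threaded(hosts)
    serial_time = run_serial(hosts)
    print("threaded: %.2fs, serial: %.2fs" % (threaded_time, serial_time))
    for host, name in results:
        print("host=%s, threadname=%s" % (host, name))
```

With 5 items and 5 workers, the threaded run should take roughly one `DELAY` while the serial loop takes roughly five. The gain comes from overlapping the time spent waiting, so it applies to I/O-bound work like the URL fetches in the question rather than to CPU-bound work.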