I have a tkinter GUI that downloads data from multiple websites at once. I run a separate thread for each download (about 28). Is that too many threads for one GUI process? It's really slow: each individual page should take about 1 to 2 seconds, but when all are run at once it takes over 40 seconds. Is there any way I can shorten the time it takes to download all the pages? Any help is appreciated, thanks.
3 Answers
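It's probably the GIL (global interpreter lock) that gets in your way. CPython lets only one thread execute Python bytecode at a time, so many threads can perform poorly, although the lock is released while a thread is blocked on network I/O.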
You could try twisted.web.client.getPage (see http://twistedmatrix.com/projects/core/documentation/howto/async.html a bit down the page). I don't have benchmarks for that, but taking the example on that page and adding 28 deferreds will give you a comparable result pretty quickly. Keep in mind, though, that you'd have to run Twisted's reactor alongside your GUI's event loop (for tkinter that is twisted.internet.tksupport rather than the gtk reactor) and get into Twisted's programming style.
A process can have hundreds of threads on any modern OS without any problem.
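A quick way to convince yourself of this is a minimal sketch that starts a couple of hundred threads, each blocking for a fixed time. The names `fake_download` and `run_parallel` are made up for illustration, and `time.sleep` is only a stand-in for a blocking network call, so this says nothing about your real downloads:

```python
import threading
import time


def fake_download(delay):
    # Hypothetical stand-in for a blocking network request.
    # time.sleep releases the GIL, as blocking socket I/O does.
    time.sleep(delay)


def run_parallel(n_threads, delay):
    """Start n_threads blocking workers and return the total wall time."""
    threads = [
        threading.Thread(target=fake_download, args=(delay,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_parallel(200, 0.5)
    print(f"200 threads, 0.5 s each: {elapsed:.2f} s total")
```

If the total stays close to the per-thread delay rather than the sum of all delays, the thread count itself is not the bottleneck.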
If you're bandwidth-limited, 1 to 2 seconds times 28 means 40 seconds is about right (28 × 1.5 s ≈ 42 s). If you're latency-limited, it should be faster, but with no information, all I can suggest is:
- add logging to your code to make sure it's actually running in parallel, and that you're not accidentally serializing your threads somehow;
- use a network monitor to make sure that network requests are actually going out in parallel.
It's hard to give anything better without more information.
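As a rough sketch of the first check above, you could record a start and end timestamp around each download and then count how many were in flight at once. `run_and_record` and `max_concurrency` are hypothetical helper names, and each job is assumed to be a zero-argument callable that wraps one of your downloads:

```python
import threading
import time


def run_and_record(jobs):
    """Run each (name, callable) job in its own thread.

    Returns {name: (start, end)} using time.perf_counter timestamps.
    """
    records = {}
    lock = threading.Lock()

    def worker(name, job):
        start = time.perf_counter()
        job()
        end = time.perf_counter()
        with lock:
            records[name] = (start, end)

    threads = [threading.Thread(target=worker, args=(n, j)) for n, j in jobs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return records


def max_concurrency(records):
    """Largest number of recorded intervals that overlap at any moment."""
    events = []
    for start, end in records.values():
        events.append((start, 1))
        events.append((end, -1))
    # At equal timestamps, -1 sorts before +1, so an interval that ends
    # exactly when another starts does not count as overlapping.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```

A peak of 1 with 28 jobs would mean the downloads are effectively running one after another. Note that the timestamps only help if they sit tightly around the blocking call itself; if the job waits on a shared lock before making the request, the recorded intervals will overlap even though the requests are serialized.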