31

I've written a working program in Python that parses a batch of binary files, extracting data from each into a data structure. Each file takes around a second to parse, which translates to hours for thousands of files. I've implemented a threaded version of the batch parsing method with an adjustable number of threads. I tested the method on 100 files with varying numbers of threads, timing each run. Here are the results (0 threads refers to my original, pre-threading code; 1 thread refers to the new version run with a single spawned thread).

0 threads: 83.842 seconds
1 thread: 78.777 seconds
2 threads: 105.032 seconds
3 threads: 109.965 seconds
4 threads: 108.956 seconds
5 threads: 109.646 seconds
6 threads: 109.520 seconds
7 threads: 110.457 seconds
8 threads: 111.658 seconds

Though spawning a thread confers a small performance increase over having the main thread do all the work, increasing the number of threads actually decreases performance. I would have expected to see performance increases, at least up to four threads (one for each of my machine's cores). I know threads have associated overhead, but I didn't think this would matter so much with single-digit numbers of threads.

I've heard of the "global interpreter lock", but as I move up to four threads I do see the corresponding number of cores at work--with two threads two cores show activity during parsing, and so on.

I also tested some different versions of the parsing code to see whether my program is I/O bound. It doesn't seem to be: just reading in the file takes a relatively small proportion of the time; processing the file is almost all of it. If I skip the I/O and process an already-read copy of a file, adding a second thread damages performance and a third improves it slightly. I'm just wondering why I can't take advantage of my computer's multiple cores to speed things up. Please post any questions or ways I could clarify.
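For reference, a threaded batch parser along the lines described above might look like the following sketch. The names `parse_file` and `parse_batch`, and the dummy parsing work, are assumptions for illustration, not the asker's actual code:

```python
import threading
import queue

def parse_file(path):
    # Stand-in for the CPU-bound parsing work described in the question.
    return sum(ord(c) for c in path)

def parse_batch(paths, num_threads):
    tasks = queue.Queue()
    for p in paths:
        tasks.put(p)
    results = []
    lock = threading.Lock()

    def worker():
        # Each thread pulls paths until the queue is drained.
        while True:
            try:
                path = tasks.get_nowait()
            except queue.Empty:
                return
            parsed = parse_file(path)
            with lock:
                results.append(parsed)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because the GIL lets only one thread execute Python bytecode at a time, the CPU-bound `parse_file` calls in a structure like this never actually run in parallel, no matter how large `num_threads` is.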

9
  • 4
    The GIL is probably at fault here. You may look into the multiprocessing module as an alternative to the threading module, since it achieves true parallelism where the GIL prevents it for threads. Commented Jul 25, 2011 at 19:50
  • 2
    Have a look at this. You've encountered the only thing I hate about Python (well, CPython anyways). Commented Jul 25, 2011 at 19:50
  • 2
    Multiple cores will show activity, but it's just switching between them - only one Python thread can run at a time. You need multiprocessing: docs.python.org/dev/library/multiprocessing Commented Jul 25, 2011 at 19:51
  • 1
    Your program could actually show an improvement in speed if it were I/O-bound, as I/O is one time when CPython lets other threads run. Commented Jul 25, 2011 at 19:53
  • 2
    I'll look into using multiprocessing; I'm running Python 2.4, so I'll need to upgrade first, which was why threading interested me. I thought multiprocessing was just a higher-level shell around threading/thread. What's the point of threading, then? And I'm still not sure I understand why multiple threads would slow down my program--is that just the thread overhead? Commented Jul 25, 2011 at 20:27

2 Answers

46

This is sadly how things are in CPython, mainly due to the Global Interpreter Lock (GIL). Python code that's CPU-bound simply doesn't scale across threads (I/O-bound code, on the other hand, might scale to some extent).

There is a highly informative presentation by David Beazley in which he discusses some of the issues surrounding the GIL. The video can be found here (thanks @Ikke!).

My recommendation would be to use the multiprocessing module instead of multiple threads.
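A minimal sketch of that recommendation using `multiprocessing.Pool` (the `parse_file` worker here is a stand-in for the asker's parser, not code from the question):

```python
import multiprocessing

def parse_file(path):
    # Stand-in for the CPU-bound parsing; it must be a module-level
    # function so it can be pickled and sent to the worker processes.
    return len(path) * 2

def parse_batch(paths, workers=4):
    # Each worker is a separate process with its own interpreter and
    # its own GIL, so CPU-bound work runs truly in parallel.
    with multiprocessing.Pool(processes=workers) as pool:
        return pool.map(parse_file, paths)

if __name__ == "__main__":
    print(parse_batch(["a.bin", "bb.bin", "ccc.bin"], workers=2))
    # → [10, 12, 14]
```

`pool.map` preserves input order in its results, and the `__main__` guard is required on platforms that spawn (rather than fork) worker processes.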


3 Comments

Here is the video of that presentation.
The multiprocessing module worked perfectly. Once I got it working I saw exactly the kind of speedup I'd been expecting. Thanks.
This does not apply when Python shares CPU-bound code with C++ code. Code and explanation: github.com/PaddlePaddle/Paddle/pull/1364#discussion_r101898833
8

The threading library does not actually utilize multiple cores simultaneously for computation. You should use the multiprocessing library instead for CPU-bound parallelism.
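Since multiprocessing deliberately mirrors threading's API, switching is often close to a drop-in change. A sketch comparing the two (the `cpu_heavy` function and the surrounding harness are illustrative assumptions):

```python
import threading
import multiprocessing

def cpu_heavy(n, out, idx):
    # Pure-Python CPU work: under the GIL, threads running this take
    # turns on one core; separate processes run it in parallel.
    total = 0
    for i in range(n):
        total += i * i
    out[idx] = total

def run_with_threads(n, count):
    out = [0] * count
    workers = [threading.Thread(target=cpu_heavy, args=(n, out, i))
               for i in range(count)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return out

def run_with_processes(n, count):
    # A managed list is shared across process boundaries; a plain
    # Python list would not be visible to the child processes.
    with multiprocessing.Manager() as mgr:
        out = mgr.list([0] * count)
        workers = [multiprocessing.Process(target=cpu_heavy, args=(n, out, i))
                   for i in range(count)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        return list(out)
```

Both functions compute the same results; only the process version can use more than one core for the loop in `cpu_heavy`.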

5 Comments

That first statement is incorrect. It does use multiple cores. Only one at the time can get the GIL.
Ah, I was missing a word. Fixed.
You miss the point. It is not the threading library itself that prevents it. It uses the pthread library, which can use all cores. That would imply that one could fix the threading library and the problem would be solved. But the problem is much deeper than that.
His statement however is correct -- he doesn't say it couldn't use multiple cores, he said it doesn't.
@Ikke: He refers to the Python threading library, not the underlying implementation (about which we need not know anything, and which is not necessarily using the POSIX threading API -- it certainly doesn't use it on Windows!).

