Skip to main content
2 of 8
edited body
Ciro Santilli OurBigBook.com
  • 392.6k
  • 120
  • 1.3k
  • 1.1k

Python documentation quotes

I've highlighted the key Python documentation quotes about Process vs Threads and the GIL at: What is the global interpreter lock (GIL) in CPython?

Process vs thread experiments

I did a bit of benchmarking in order to show the difference more concretely.

In the benchmark, I timed CPU and IO bound work for various numbers of threads on an 8 hyperthread CPU. The work supplied is the same for each number of threads (thus more threads means more total work supplied).

The results were:

enter image description here

Conclusions:

  • for CPU bound work, multiprocessing is always faster, presumably due to the GIL

  • for IO bound work. both are exactly the same speed

  • threads only scale up to about 4x instead of the expected 8x since I'm on an 8 hyperthread machine.

    Contrast that with a C POSIX CPU-bound work which reaches the expected 8x speedup: What do 'real', 'user' and 'sys' mean in the output of time(1)?

    TODO: I don't know the reason for this, there must be other Python inefficiencies coming into play.

Test code:

#!/usr/bin/env python3 import multiprocessing import threading import time import sys def cpu_func(result, niters): ''' A useless CPU bound function. ''' for i in range(niters): result = (result * result * i + 2 * result * i * i + 3) % 10000000 return result class CpuThread(threading.Thread): def __init__(self, niters): super().__init__() self.niters = niters self.result = 1 def run(self): self.result = cpu_func(self.result, self.niters) class CpuProcess(multiprocessing.Process): def __init__(self, niters): super().__init__() self.niters = niters self.result = 1 def run(self): self.result = cpu_func(self.result, self.niters) class IoThread(threading.Thread): def __init__(self, sleep): super().__init__() self.sleep = sleep self.result = self.sleep def run(self): time.sleep(self.sleep) class IoProcess(multiprocessing.Process): def __init__(self, sleep): super().__init__() self.sleep = sleep self.result = self.sleep def run(self): time.sleep(self.sleep) if __name__ == '__main__': cpu_n_iters = int(sys.argv[1]) sleep = 1 cpu_count = multiprocessing.cpu_count() input_params = [ (CpuThread, cpu_n_iters), (CpuProcess, cpu_n_iters), (IoThread, sleep), (IoProcess, sleep), ] header = ['nthreads'] for thread_class, _ in input_params: header.append(thread_class.__name__) print(' '.join(header)) for nthreads in range(1, 2 * cpu_count): results = [nthreads] for thread_class, work_size in input_params: start_time = time.time() threads = [] for i in range(nthreads): thread = thread_class(work_size) threads.append(thread) thread.start() for i, thread in enumerate(threads): thread.join() results.append(time.time() - start_time) print(' '.join('{:.6e}'.format(result) for result in results)) 

GitHub upstream + plotting code on same directory.

Tested on Ubuntu 18.10, Python 3.6.7, in a Lenovo ThinkPad P51 laptop with CPU: Intel Core i7-7820HQ CPU (4 cores / 8 threads), RAM: 2x Samsung M471A2K43BB1-CRC (2x 16GiB), SSD: Samsung MZVLB512HAJQ-000L7 (3,000 MB/s).

Ciro Santilli OurBigBook.com
  • 392.6k
  • 120
  • 1.3k
  • 1.1k