Roman
  • 2.4k
  • 5
  • 32
  • 55

I/O slowdown with multithreading in Python

I have a Python script that works as follows: it reads a large file (e.g., a movie), writes selected information from it into a number of small temporary files, spawns a C++ application in subprocesses to process each file separately, and then reads the application's output. To speed up the script I used multiprocessing. However, this has a major drawback: each process has to keep its own copy of the whole large input file in RAM, so I can run only a few processes before I run out of memory. I therefore decided to try multithreading instead (or some combination of multiprocessing and multithreading), since threads share the address space. Because the Python part spends most of its time on file I/O or waiting for the C++ application to finish, I assumed the GIL would not be an issue here. Nevertheless, instead of a performance gain I observe a drastic slowdown, mainly in the I/O part.
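For concreteness, the threaded variant of this scheme looks roughly like the sketch below. It is a simplified illustration, not my real script: the names `process_chunk`, `run_all` and `WORKER_CMD` are made up for this post, and the Python interpreter is used as a stand-in for the C++ application so that the sketch runs anywhere.

```python
import os
import subprocess
import sys
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the C++ application (assumption): a one-line Python program
# that prints the number of lines in the file passed as its first argument.
WORKER_CMD = [sys.executable, "-c",
              "import sys; print(sum(1 for _ in open(sys.argv[1])))"]


def process_chunk(lines):
    """Write one chunk to a temp file, run the external tool on it, return its output."""
    # delete=False so the file can be closed before the subprocess opens it.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.writelines(lines)
        path = tmp.name
    try:
        result = subprocess.run(WORKER_CMD + [path],
                                capture_output=True, text=True, check=True)
        return result.stdout.strip()
    finally:
        os.remove(path)


def run_all(chunks, nthreads):
    """Process all chunks in a pool of threads; results keep the input order."""
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        return list(pool.map(process_chunk, chunks))


if __name__ == "__main__":
    data = [str(i) + "\n" for i in range(1000)]
    chunks = [data[i:i + 100] for i in range(0, len(data), 100)]
    print(run_all(chunks, nthreads=4))
```

All threads read from the same in-memory `data`, which is the memory saving I was hoping for compared with multiprocessing.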

I illustrate the problem with the following code (saved as test.py):

    import sys, threading, tempfile, time

    nthreads = int(sys.argv[1])

    class IOThread (threading.Thread):
        def __init__(self, thread_id, obj):
            threading.Thread.__init__(self)
            self.thread_id = thread_id
            self.obj = obj
        def run(self):
            run_io(self.thread_id, self.obj)

    def gen_object(nlines):
        obj = []
        for i in range(nlines):
            obj.append(str(i) + '\n')
        return obj

    def run_io(thread_id, obj):
        ntasks = 100 // nthreads + (1 if thread_id < 100 % nthreads else 0)
        for i in range(ntasks):
            tmpfile = tempfile.NamedTemporaryFile('w+')
            with open(tmpfile.name, 'w') as ofile:
                for elem in obj:
                    ofile.write(elem)
            with open(tmpfile.name, 'r') as ifile:
                content = ifile.readlines()
            tmpfile.close()

    obj = gen_object(100000)
    starttime = time.time()
    threads = []
    for thread_id in range(nthreads):
        threads.append(IOThread(thread_id, obj))
        threads[thread_id].start()
    for thread in threads:
        thread.join()
    runtime = time.time() - starttime
    print('Runtime: {:.2f} s'.format(runtime))

When I run it with different numbers of threads, I get this:

    $ python3 test.py 1
    Runtime: 2.84 s
    $ python3 test.py 1
    Runtime: 2.77 s
    $ python3 test.py 1
    Runtime: 3.34 s
    $ python3 test.py 2
    Runtime: 6.54 s
    $ python3 test.py 2
    Runtime: 6.76 s
    $ python3 test.py 2
    Runtime: 6.33 s
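To see which phase the time goes into, the write and read steps can be timed separately. The following is only a sketch of how such a measurement might look: the names `timed_io` and `measure` are hypothetical, and unlike test.py it reuses the temporary file's own handle instead of reopening the file by name, so the numbers are not directly comparable with the runtimes above.

```python
import tempfile
import threading
import time


def timed_io(obj, ntasks, totals, lock):
    """Write and re-read a temp file ntasks times, accumulating time per phase."""
    write_t = read_t = 0.0
    for _ in range(ntasks):
        with tempfile.NamedTemporaryFile("w+") as tmp:
            t0 = time.perf_counter()
            for elem in obj:
                tmp.write(elem)
            tmp.flush()
            t1 = time.perf_counter()
            tmp.seek(0)
            tmp.readlines()
            t2 = time.perf_counter()
        write_t += t1 - t0
        read_t += t2 - t1
    # Merge this thread's timings into the shared totals.
    with lock:
        totals["write"] += write_t
        totals["read"] += read_t


def measure(nthreads, nlines=100000, total_tasks=100):
    """Split total_tasks across nthreads (same split as test.py) and time each phase."""
    obj = [str(i) + "\n" for i in range(nlines)]
    totals = {"write": 0.0, "read": 0.0}
    lock = threading.Lock()
    threads = []
    for tid in range(nthreads):
        ntasks = total_tasks // nthreads + (1 if tid < total_tasks % nthreads else 0)
        threads.append(threading.Thread(target=timed_io,
                                        args=(obj, ntasks, totals, lock)))
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    totals["wall"] = time.perf_counter() - start
    return totals


if __name__ == "__main__":
    for n in (1, 2):
        print(n, measure(n))
```

The `write` and `read` entries are summed over all threads, so with more than one thread they can exceed the `wall` time.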

Can someone explain this result to me, and give some advice on how to parallelize I/O effectively using multithreading?
