I wrote this program to learn how to use multi-threading properly. I want to implement something similar in my own program:
```python
import math
import random
import time

import numpy as np
from threading import Thread


def powExp(x, r):
    for c in range(x.shape[1]):
        x[r][c] = math.pow(100, x[r][c])


def main():
    print()
    rows = 100
    cols = 100
    x = np.random.random((rows, cols))
    y = x.copy()

    start = time.time()
    threads = []
    for r in range(x.shape[0]):
        t = Thread(target=powExp, args=(x, r))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    end = time.time()
    print("Multithreaded calculation took {n} seconds!".format(n=end - start))

    start = time.time()
    for r in range(y.shape[0]):
        for c in range(y.shape[1]):
            y[r][c] = math.pow(100, y[r][c])
    end = time.time()
    print("Singlethreaded calculation took {n} seconds!".format(n=end - start))
    print()

    randRow = random.randint(0, rows - 1)
    randCol = random.randint(0, cols - 1)
    print("Checking random indices in x and y:")
    print("x[{rR}][{rC}]: = {n}".format(rR=randRow, rC=randCol, n=x[randRow][randCol]))
    print("y[{rR}][{rC}]: = {n}".format(rR=randRow, rC=randCol, n=y[randRow][randCol]))
    print()

    for r in range(x.shape[0]):
        for c in range(x.shape[1]):
            if x[r][c] != y[r][c]:
                print("ERROR NO WORK WAS DONE")
                print("x[{r}][{c}]: {n} == y[{r}][{c}]: {ny}".format(
                    r=r, c=c, n=x[r][c], ny=y[r][c]))
                quit()
    assert np.array_equal(x, y)


# fixed: the original had "if __name__ == main():", which calls main()
# and compares __name__ against its return value
if __name__ == "__main__":
    main()
```

As you can see from the code, the goal here is to parallelize the operation `math.pow(100, x[r][c])` by creating a thread for every row. However, this code is extremely slow, a lot slower than the single-threaded version.
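For what it's worth, the per-element Python loop is likely the dominant cost here: each `math.pow` call runs as interpreted bytecode. A minimal sketch of a vectorized alternative using NumPy's `np.power` ufunc, which runs the whole loop in C:

```python
import math

import numpy as np

x = np.random.random((100, 100))
y = x.copy()

# one vectorized call transforms the entire 100x100 array in C
x = np.power(100.0, x)

# element-wise reference version, as in the question
for r in range(y.shape[0]):
    for c in range(y.shape[1]):
        y[r][c] = math.pow(100, y[r][c])

assert np.allclose(x, y)
```

This removes the per-element interpreter overhead entirely, which typically matters far more than thread count for this kind of operation.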
Output:
```
Multithreaded calculation took 0.026447772979736328 seconds!
Singlethreaded calculation took 0.006798267364501953 seconds!

Checking random indices in x and y:
x[58][58]: = 9.792315687115973
y[58][58]: = 9.792315687115973
```

I searched through Stack Overflow and found some information about the GIL forcing Python bytecode to be executed on a single core only. However, I'm not sure that this is in fact what is limiting my parallelization. I tried rearranging the parallelized for-loop to use pools instead of threads. Nothing seems to be working.
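Since pools were mentioned: a process pool sidesteps the GIL because each worker is a separate interpreter with its own GIL. A sketch, chunking by row blocks rather than one task per row to keep the inter-process overhead down (the helper name `pow_rows` and the worker count are mine):

```python
import numpy as np
from multiprocessing import Pool


def pow_rows(chunk):
    # runs in a separate worker process, so the parent's GIL does not apply
    return np.power(100.0, chunk)


if __name__ == "__main__":
    x = np.random.random((100, 100))
    with Pool(processes=4) as pool:
        # split into 4 row blocks; each block is transformed in parallel
        result = np.vstack(pool.map(pow_rows, np.array_split(x, 4)))
    assert np.allclose(result, np.power(100.0, x))
```

Note that for an array this small the cost of pickling data between processes can easily exceed the computation itself; process pools tend to pay off only when each task does substantial work.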
Python code performance decreases with threading
EDIT: This thread discusses the same issue. Is it completely impossible to increase performance using multi-threading in Python because of the GIL? Is the GIL causing my slowdowns?
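It is not completely impossible: threads can help when the heavy work releases the GIL, which NumPy's C ufunc loops do for large arrays. A sketch under that assumption, mapping a vectorized worker over row chunks with a thread pool (array size and worker count are illustrative):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

x = np.random.random((200, 200))
expected = np.power(100.0, x)


def work(chunk):
    # NumPy's ufunc inner loop runs in C and releases the GIL,
    # so these calls can genuinely overlap across threads
    return np.power(100.0, chunk)


with ThreadPoolExecutor(max_workers=4) as ex:
    result = np.vstack(list(ex.map(work, np.array_split(x, 4))))

assert np.allclose(result, expected)
```

The key difference from the question's version is that each task hands a whole block to C code instead of doing per-element work in Python bytecode, which is serialized by the GIL.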
EDIT 2 (2017-01-18): So from what I can gather after searching online for quite a bit, it seems like Python is really bad at this kind of parallelism. What I'm trying to do is parallelize a Python function used in a neural network implemented in TensorFlow... it seems like adding a custom op is the way to go.