I'm new to Python threading and I'm running into this: when I run something in threads (judging by the output I print), it never seems to run in parallel. Also, my functions take the same time as they did before I used concurrent.futures (ThreadPoolExecutor). I have to calculate the gains of some attributes over a dataset (I cannot use libraries). Since I have about 1024 attributes and the function was taking about a minute to execute (and I have to call it inside a for loop), I decided to split the array of attributes into 10 (just as an example) and run the function gain(attribute) separately for each sub-array. So I did the following (omitting some unnecessary code):

    def calculate_gains(self):
        splited_attributes = np.array_split(self.attributes, 10)
        result = {}
        for atts in splited_attributes:
            with concurrent.futures.ThreadPoolExecutor() as executor:
                future = executor.submit(self.calculate_gains_helper, atts)
                return_value = future.result()
                self.gains = {**self.gains, **return_value}

Here's the calculate_gains_helper:

    def calculate_gains_helper(self, attributes):
        inter_result = {}
        for attribute in attributes:
            inter_result[attribute] = self.gain(attribute)
        return inter_result

Am I doing something wrong? I read some other older posts but I couldn't get any info. Thanks a lot for any help!
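
Inside `for atts in splited_attributes:` you are creating a thread executor, submitting a single work item, and then waiting for it to complete, once for each `atts` in the loop. Because `future.result()` blocks until that one job is done, only one chunk is ever running at a time, and the repeated executor setup and teardown makes this more expensive than just doing the calculation single-threaded. You should create the executor once and throw all of the jobs at it.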
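
A minimal sketch of that restructuring, written as plain functions so it runs on its own. The `gain` body here is a hypothetical placeholder for your real `self.gain`, and `split_into_chunks` is a stand-in I wrote for `np.array_split`; neither comes from your code. The point is only the shape: one executor, all chunks submitted up front, results collected afterwards.

```python
import concurrent.futures


def gain(attribute):
    # Hypothetical placeholder for the real self.gain(attribute).
    return attribute * 2


def calculate_gains_helper(attributes):
    # Same idea as the helper in the question: one dict per chunk.
    return {attribute: gain(attribute) for attribute in attributes}


def split_into_chunks(items, n_chunks):
    # Stand-in for np.array_split: consecutive slices of roughly equal size.
    size = -(-len(items) // n_chunks)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def calculate_gains(attributes, n_chunks=10):
    gains = {}
    # Create the executor ONCE, outside any loop over the chunks.
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Submit every chunk before waiting on any result.
        futures = [
            executor.submit(calculate_gains_helper, chunk)
            for chunk in split_into_chunks(attributes, n_chunks)
        ]
        # Merge results as each chunk finishes.
        for future in concurrent.futures.as_completed(futures):
            gains.update(future.result())
    return gains


gains = calculate_gains(list(range(1024)))
```

In the class version you would submit `self.calculate_gains_helper` and merge into `self.gains` in the same way. One caveat: if `gain` is CPU-bound pure Python, threads may still not give a speedup because of the global interpreter lock; `concurrent.futures.ProcessPoolExecutor` offers the same `submit` interface and may be worth trying in that case.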