I am trying to find a way to parallelise certain operations on dataframes, especially those that cannot be vectorised. I have tested the code below, taken from http://www.racketracer.com/2016/07/06/pandas-in-parallel/ , but it doesn't work: there is no error message, and nothing happens. When I debug it, the code seems to get stuck at df = pd.concat(pool.map(func, df_split)).

What am I doing wrong?

    import timeit
    import pandas as pd
    import numpy as np
    import seaborn as sns
    import multiprocessing
    from multiprocessing import Pool

    def parallelize_dataframe(df, func):
        df_split = np.array_split(df, num_partitions)
        pool = Pool(num_cores)
        df = pd.concat(pool.map(func, df_split))
        pool.close()
        pool.join()
        return df

    def multiply_columns(data):
        data['length_of_word'] = data['species'].apply(lambda x: len(x))
        return data

    num_partitions = 2  # number of partitions to split dataframe
    num_cores = 2  # multiprocessing.cpu_count()  # number of cores on your machine
    iris = pd.DataFrame(sns.load_dataset('iris'))
    iris = parallelize_dataframe(iris, multiply_columns)

  • Is there a reason why you are not using e.g. dask? Commented Mar 20, 2019 at 10:58

1 Answer

I needed to add

    if __name__ == "__main__":

so that parallelize_dataframe is only called inside that block.
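For context, here is a minimal sketch of how the guarded script might look. Some details are assumptions rather than part of the original post: a small hand-built dataframe stands in for the seaborn iris dataset so nothing has to be downloaded, a hypothetical helper split_rows replaces np.array_split, and multiply_columns works on a copy of its chunk. The guard matters because, on platforms where multiprocessing starts workers by spawning a fresh interpreter (Windows, for example), each worker re-imports the main module; unguarded top-level code would then run again in every worker.

```python
from multiprocessing import Pool

import pandas as pd

NUM_PARTITIONS = 2  # number of chunks to split the dataframe into
NUM_WORKERS = 2     # number of worker processes


def split_rows(df, n):
    """Hypothetical helper: split df row-wise into at most n chunks."""
    size = max(1, -(-len(df) // n))  # ceiling division, at least 1
    return [df.iloc[i:i + size] for i in range(0, len(df), size)]


def multiply_columns(data):
    """Add a column holding the length of each species name."""
    data = data.copy()  # work on a copy so the caller's chunk is untouched
    data["length_of_word"] = data["species"].apply(len)
    return data


def parallelize_dataframe(df, func):
    """Apply func to each chunk in a worker pool and stitch the results."""
    chunks = split_rows(df, NUM_PARTITIONS)
    with Pool(NUM_WORKERS) as pool:
        parts = pool.map(func, chunks)
    return pd.concat(parts)


if __name__ == "__main__":
    # Only the parent process runs this block; workers that re-import
    # the module skip it, so they do not try to create pools themselves.
    df = pd.DataFrame({"species": ["setosa", "versicolor", "virginica"] * 2})
    result = parallelize_dataframe(df, multiply_columns)
    print(result)
```

The function definitions stay at module level so that workers can find them when they import the module; only the code that actually starts the pool goes under the guard.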
4 Comments

Please use the edit link on your question to add additional information. The Post Answer button should be used only for complete answers to the question. - From Review
I am not following. The complete answer to the question is that parallelize_dataframe must be run only if __name__ == "__main__", which is what I have written. I could have made it more explicit, but it seemed pretty obvious to me.
Please include the lines before and after the if-statement you want to add. (Always think of your posts here as entries in a knowledge base, not just a chat.)
This looked more like a comment saying that you forgot to add something to your question. Anyways cheers!