It seems like some operations can be done in place on Pandas DataFrames but some cannot.

    import pandas as pd

    def add_col(df):
        df['c'] = 5

    def test_concat(df):
        df = pd.concat([df, df], ignore_index=True)

If I run these functions on a DataFrame, add_col adds a column called 'c' to it, but test_concat does not leave the original DataFrame concatenated with itself.

Of course, I could just return the new DataFrame, but I found that doing so hurt performance. I'm not saying that this behavior is necessarily wrong, but I'm wondering how you guys refactor a large function into smaller subfunctions without increasing memory usage and processing time.

1 Answer

You ask an excellent question ... I was wondering whether using df = df.append(df) would reduce the performance impact?

2 Comments

Good point, but defining the function as def test_function_no_return(df): df = df.append(df) doesn't seem to change the DataFrame in place either. Do you mean returning df.append(df)?
Yes, append() doesn't operate in place. I meant returning df.append(df)
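The "return it" pattern from these comments might look like the sketch below. It uses `pd.concat` rather than `append`, since `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0. The helper name `concat_with_self` and the sample data are hypothetical. Returning the result hands back a reference to the frame that `pd.concat` already allocated, so the `return` itself should not add another copy.

```python
import pandas as pd

def concat_with_self(df):
    # Hypothetical helper: builds and returns a new DataFrame,
    # leaving the input object untouched.
    return pd.concat([df, df], ignore_index=True)

# Hypothetical sample data for illustration.
df = pd.DataFrame({'a': [1, 2]})

# Rebind the name at the call site to pick up the new frame.
df = concat_with_self(df)
```

After rebinding, the old two-row frame has no remaining references in this snippet and can be garbage collected.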
