It seems like some operations can be done in place on Pandas DataFrames but some cannot.
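
    import pandas as pd

    def add_col(df):
        # Adds a column to the DataFrame that was passed in
        df['c'] = 5

    def test_concat(df):
        # Assigns the concatenated result to the local name df only
        df = pd.concat([df, df], ignore_index=True)

If I run these functions on a DataFrame, the first one adds a column called 'c' to it, but the second one does not leave the original DataFrame concatenated with itself.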
Of course, I could just return the new DataFrame, but I found that doing so hurt performance. I'm not saying that this behavior is necessarily wrong, but I'm wondering how you guys refactor a large function into smaller subfunctions without increasing memory usage and processing time.