You don't need to return anything, because your operations are done in place. You can do the in-place changes in your function:

    def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
        data.rename(index=str, columns={col_in: col_out}, inplace=True)
        # use col_out rather than a hard-coded name, so a custom col_out still works
        data.set_index(col_out, inplace=True)

and any other references to the dataframe you pass into the function will see the changes made:

    >>> import pandas as pd
    >>> df = pd.DataFrame({'interval_time': pd.to_datetime(['2010-11-01 00:00:00', '2010-11-01 00:05:00', '2010-11-01 00:10:00', '2010-11-01 00:15:00']),
    ...                    'A': [0.0] * 4}, index=range(4))
    >>> df
         A       interval_time
    0  0.0 2010-11-01 00:00:00
    1  0.0 2010-11-01 00:05:00
    2  0.0 2010-11-01 00:10:00
    3  0.0 2010-11-01 00:15:00
    >>> def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
    ...     data.rename(index=str, columns={col_in: col_out}, inplace=True)
    ...     data.set_index(col_out, inplace=True)
    ...
    >>> rename_n_index(df, 'interval_time')
    >>> df
                           A
    Timestamp
    2010-11-01 00:00:00  0.0
    2010-11-01 00:05:00  0.0
    2010-11-01 00:10:00  0.0
    2010-11-01 00:15:00  0.0

In the above example, the df reference to the dataframe shows the changes made by the function.
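To make the aliasing explicit, here is a minimal sketch of the same idea. The names `alias` and `result` are made up for illustration, and the `index=str` argument is left out because `set_index()` replaces the index anyway. It shows that a second name bound to the same dataframe sees the in-place changes, and that the function itself returns `None`:

```python
import pandas as pd

def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
    # Both calls mutate the dataframe object that was passed in.
    data.rename(columns={col_in: col_out}, inplace=True)
    data.set_index(col_out, inplace=True)

df = pd.DataFrame({
    'interval_time': pd.to_datetime(['2010-11-01 00:00:00', '2010-11-01 00:05:00']),
    'A': [0.0, 0.0],
})
alias = df  # a second name for the same object, not a copy

result = rename_n_index(df, 'interval_time')

print(result)            # None: the function has no return statement
print(alias is df)       # True
print(alias.index.name)  # Timestamp
```

So writing `df = rename_n_index(df, ...)` with this version would be a mistake: it would rebind `df` to `None`.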
If you remove the inplace=True arguments, the method calls return a new dataframe object. You can store an intermediate result as a local variable, then apply the second method to the dataframe referenced in that local variable:

    def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
        renamed = data.rename(index=str, columns={col_in: col_out})
        return renamed.set_index(col_out)

or you can chain the method calls directly to the returned object:

    def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
        return data.rename(index=str, columns={col_in: col_out})\
                   .set_index(col_out)

In the version that uses a local variable, renamed is already a new dataframe, so you can also apply the set_index() call to that object in place and then return renamed:

    def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
        renamed = data.rename(index=str, columns={col_in: col_out})
        # renamed is a new object, so this does not touch the caller's dataframe
        renamed.set_index(col_out, inplace=True)
        return renamed

Either way, this returns a new dataframe object, leaving the original dataframe unchanged:

    >>> def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
    ...     renamed = data.rename(index=str, columns={col_in: col_out})
    ...     return renamed.set_index(col_out)
    ...
    >>> df = pd.DataFrame({'interval_time': pd.to_datetime(['2010-11-01 00:00:00', '2010-11-01 00:05:00', '2010-11-01 00:10:00', '2010-11-01 00:15:00']),
    ...                    'A': [0.0] * 4}, index=range(4))
    >>> rename_n_index(df, 'interval_time')
                           A
    Timestamp
    2010-11-01 00:00:00  0.0
    2010-11-01 00:05:00  0.0
    2010-11-01 00:10:00  0.0
    2010-11-01 00:15:00  0.0
    >>> df
         A       interval_time
    0  0.0 2010-11-01 00:00:00
    1  0.0 2010-11-01 00:05:00
    2  0.0 2010-11-01 00:10:00
    3  0.0 2010-11-01 00:15:00

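As a script rather than an interpreter session, the non-mutating variant might be used like this. This is only a sketch: the name `indexed` is made up for illustration, and `index=str` is omitted because `set_index()` replaces the index anyway.

```python
import pandas as pd

def rename_n_index(data, col_in='tempus_interval_time', col_out='Timestamp'):
    # No inplace=True: each method call returns a new dataframe,
    # so the chained result is independent of the caller's object.
    return data.rename(columns={col_in: col_out}).set_index(col_out)

df = pd.DataFrame({
    'interval_time': pd.to_datetime(['2010-11-01 00:00:00', '2010-11-01 00:05:00']),
    'A': [0.0, 0.0],
})

indexed = rename_n_index(df, 'interval_time')

print(indexed is df)                  # False: a different object
print('interval_time' in df.columns)  # True: the original is untouched
print(indexed.index.name)             # Timestamp
```

If you only need the transformed dataframe, rebind the name to the return value: `df = rename_n_index(df, 'interval_time')`.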