Pandas dataframe.set_index() deletes previous index and column

Question

I just came across a strange phenomenon with Pandas DataFrames: when setting the index using DataFrame.set_index('some_index'), the old column that was also an index is deleted! Here is an example:

    import pandas as pd
    df = pd.DataFrame({'month': [1, 4, 7, 10], 'year': [2012, 2014, 2013, 2014], 'sale': [55, 40, 84, 31]})
    df_mn = df.set_index('month')
    >>> df_mn
           sale  year
    month
    1        55  2012
    4        40  2014
    7        84  2013
    10       31  2014

Now I change the index to year:

    df_mn.set_index('year')
          sale
    year
    2012    55
    2014    40
    2013    84
    2014    31

.. and the month column was removed with the index. This is very irritating because I just wanted to swap the DataFrame index.

Is there a way to keep the column that was previously the index from being deleted? Maybe through something like: DataFrame.set_index('new_index', delete_previous_index=False)

Thanks for any advice

Use df_mn.reset_index() first, which will restore the month column in the dataframe. Then use set_index.
– Kay Wittig, Commented Jul 17, 2018 at 6:33
Answers under the question "python - Pandas - Dataframe.set_index - how to keep the old index column" suggest using the argument append=True to set_index().
– Paul Rougieux, Commented Feb 20, 2019 at 10:14
@PaulRougieux Do you have any idea why df.set_index sometimes removes the existing index, but not always?
– saintsfan342000, Commented Oct 31, 2019 at 16:30
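
The append=True suggestion from the comment above can be sketched as follows. This is a minimal illustration using the question's data; the variable name df_both is made up for the example. Note that it keeps month as an index level next to year (a two-level index) rather than turning it back into a column.

```python
import pandas as pd

df = pd.DataFrame({
    'month': [1, 4, 7, 10],
    'year': [2012, 2014, 2013, 2014],
    'sale': [55, 40, 84, 31],
})
df_mn = df.set_index('month')

# append=True adds 'year' as a second index level instead of replacing 'month'.
# df_both is an illustrative name, not something from the original post.
df_both = df_mn.set_index('year', append=True)

print(list(df_both.index.names))  # ['month', 'year']
print(list(df_both.columns))      # ['sale']
```

If month should end up as an ordinary column again, reset_index() is the more direct route.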

Kay Wittig · Accepted Answer · 2018-07-17 06:36:27Z

7

You can do the following:

    >>> df_mn.reset_index().set_index('year')
          month  sale
    year
    2012      1    55
    2014      4    40
    2013      7    84
    2014     10    31


1 Comment

Python Bang Over a year ago

This serves the purpose, but the year is misplaced in the row. Any fix for that?

Tyler · Accepted Answer · 2024-09-24 00:50:29Z

Both of the current top answers solve part of the problem.

The complete solution is:

df_mn.reset_index().set_index('year', drop=False)

Calling .reset_index() first will convert the initial index into a column, so you will not lose it when calling set_index().

Adding drop=False to the set_index call means that year is not dropped when set as the index.
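
As a quick check of the two steps described above, here is a small sketch using the question's data. The name result is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'month': [1, 4, 7, 10],
    'year': [2012, 2014, 2013, 2014],
    'sale': [55, 40, 84, 31],
})
df_mn = df.set_index('month')

# Step 1: reset_index() moves 'month' from the index back into the columns.
# Step 2: set_index('year', drop=False) makes 'year' the index and keeps the column.
result = df_mn.reset_index().set_index('year', drop=False)

print(result.index.name)     # year
print(list(result.columns))  # ['month', 'year', 'sale']
```

Note that year then exists both as the index name and as a column; some pandas operations that look labels up by name may treat that as ambiguous, so whether drop=False is desirable depends on what happens to the frame afterwards.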

icypy · Accepted Answer · 2018-07-17 04:29:37Z

0

The solution I found to retain a previous column is to set drop=False: dataframe.set_index('some_column', drop=False). This is not the perfect answer, but it works!


2 Comments

Marc Maxmeister Over a year ago

I think this prevents the year column from disappearing in the DF when you set_index(year). It doesn't restore month if it is already used as the index. Kay Wittig's answer is best - reset_index()

rbatt Over a year ago

Actually, this would help the user a lot, if applied when you initially do df.set_index("month", drop=False). This would prevent the month column from being lost, which would save 1 line in aspiring1's answer.
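
A minimal sketch of the idea in the comment above, assuming the question's data: passing drop=False on the first set_index call keeps month as a column, so a later swap to year does not lose it. The name df_yr is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'month': [1, 4, 7, 10],
    'year': [2012, 2014, 2013, 2014],
    'sale': [55, 40, 84, 31],
})

# Keep 'month' as a regular column while also using it as the index.
df_mn = df.set_index('month', drop=False)

# Swapping the index to 'year' replaces the old index,
# but the 'month' column is still there.
df_yr = df_mn.set_index('year')

print(df_yr.index.name)     # year
print(list(df_yr.columns))  # ['month', 'sale']
```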

aspiring1 · Accepted Answer · 2018-07-17 10:10:23Z

No, in such cases you have to save your previous column, as shown below:

    import pandas as pd
    df = pd.DataFrame({'month': [1, 4, 7, 10], 'year': [2012, 2014, 2013, 2014], 'sale': [55, 40, 84, 31]})
    df_mn = df.set_index('month')
    # Save the index as another column, and then run set_index with the year column as value.
    df_mn['month'] = df_mn.index
    df_mn.set_index('year')

Besides, df_mn is a separate dataframe, so the original dataframe df remains unchanged and you can use it again. Also, unless you set the inplace argument of set_index to True, df_mn won't have changed even after you call set_index() on it.

Also, like the other answer you can always use reset_index().
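
The point about set_index returning a new object can be checked with a short sketch. It assumes the question's data; returned is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'month': [1, 4, 7, 10],
    'year': [2012, 2014, 2013, 2014],
    'sale': [55, 40, 84, 31],
})
df_mn = df.set_index('month')

# Without inplace=True, set_index returns a new DataFrame
# and leaves the object it was called on untouched.
returned = df_mn.set_index('year')

print(returned.index.name)    # year
print(df_mn.index.name)       # month (df_mn is unchanged)
print('month' in df.columns)  # True (the original df still has the column)
```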

Apart from Kay's answer, I am not sure what you are suggesting. Can you explain?
I just wanted to suggest that you can save your index column after doing set_index with the inplace parameter set to True. But the best method is using reset_index. Besides, as I suggested, your df column isn't affected, so the month data isn't lost after all.
