I just came across a strange phenomenon with Pandas DataFrames: when setting the index using DataFrame.set_index('some_index'), the old column that was also an index is deleted! Here is an example:

    import pandas as pd

    df = pd.DataFrame({'month': [1, 4, 7, 10],
                       'year': [2012, 2014, 2013, 2014],
                       'sale': [55, 40, 84, 31]})
    df_mn = df.set_index('month')

    >>> df_mn
           sale  year
    month
    1        55  2012
    4        40  2014
    7        84  2013
    10       31  2014

Now I change the index to year:

    >>> df_mn.set_index('year')
          sale
    year
    2012    55
    2014    40
    2013    84
    2014    31

... and the month column was removed along with the index. This is very irritating because I just wanted to swap the DataFrame index.

Is there a way to keep the column that was previously the index from being deleted? Maybe through something like: DataFrame.set_index('new_index', delete_previous_index=False)

Thanks for any advice

3
  • Use df_mn.reset_index() first, which will restore the month column in the DataFrame. Then use set_index. Commented Jul 17, 2018 at 6:33
  • Answers under this question: python - Pandas - Dataframe.set_index - how to keep the old index column suggest using the argument append=True to set_index(). Commented Feb 20, 2019 at 10:14
  • @PaulRougieux Do you have any idea why df.set_index sometimes removes the existing index, but not always? Commented Oct 31, 2019 at 16:30
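
The append=True suggestion from the comments above can be sketched as follows. This is a minimal illustration rather than a recommendation: it keeps month by turning the index into a two-level MultiIndex instead of swapping it. The name `stacked` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'month': [1, 4, 7, 10],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]})
df_mn = df.set_index('month')

# append=True adds 'year' as a second index level next to 'month'
# instead of replacing the existing index.
stacked = df_mn.set_index('year', append=True)

print(list(stacked.index.names))  # ['month', 'year']
print(list(stacked.columns))      # ['sale']
```

Note that the result is indexed by (month, year) pairs, so this keeps the old index but does not make year the sole index.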

4 Answers

7

You can do the following:

    >>> df_mn.reset_index().set_index('year')
          month  sale
    year
    2012      1    55
    2014      4    40
    2013      7    84
    2014     10    31

1 Comment

This serves the purpose, but the year is misplaced in the row. Is there any fix for that?
1

Both of the current top answers solve part of the problem.

The complete solution is:

df_mn.reset_index().set_index('year', drop=False) 

Calling .reset_index() first will convert the initial index into a column, so you will not lose it when calling set_index().

Adding drop=False to the set_index call means that year is not dropped when set as the index.
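
As a minimal, self-contained sketch of the two steps described above (using the question's data; the name `swapped` is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'month': [1, 4, 7, 10],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]})
df_mn = df.set_index('month')

# reset_index() moves 'month' back into the columns;
# drop=False keeps 'year' as a column in addition to making it the index.
swapped = df_mn.reset_index().set_index('year', drop=False)

print(swapped.index.name)       # year
print(sorted(swapped.columns))  # ['month', 'sale', 'year']
```

If you only want month restored and do not need year duplicated as a column, leaving out drop=False should be enough.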

0

The solution I found to retain the previous column is to set drop=False: dataframe.set_index('some_column', drop=False). This is not the perfect answer, but it works!

2 Comments

I think this prevents the year column from disappearing in the DF when you set_index(year). It doesn't restore month if it is already used as the index. Kay Wittig's answer is best - reset_index()
Actually, this would help the user a lot, if applied when you initially do df.set_index("month", drop=False). This would prevent the month column from being lost, which would save 1 line in aspiring1's answer.
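
A short sketch of that last suggestion, assuming you control the first set_index call (the name `df_yr` is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'month': [1, 4, 7, 10],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]})

# Keep 'month' as a regular column when it first becomes the index.
df_mn = df.set_index('month', drop=False)

# Swapping the index now only discards the index copy of 'month';
# the column copy survives.
df_yr = df_mn.set_index('year')

print(df_yr.index.name)     # year
print(list(df_yr.columns))  # ['month', 'sale']
```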
0

No, in such cases you have to save your previous index as a column, as shown below:

    import pandas as pd

    df = pd.DataFrame({'month': [1, 4, 7, 10],
                       'year': [2012, 2014, 2013, 2014],
                       'sale': [55, 40, 84, 31]})
    df_mn = df.set_index('month')

    # Save the index as another column, then run set_index with the year column.
    df_mn['month'] = df_mn.index
    df_mn.set_index('year')

Besides, df_mn is a separate DataFrame, so the original df remains unchanged and you can use it again. Also, unless you set the inplace argument of set_index to True, df_mn won't have changed even after you call set_index() on it.

Also, like the other answer you can always use reset_index().
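
To illustrate the point about unchanged DataFrames, here is a small sketch showing that set_index returns a new object by default (the name `result` is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'month': [1, 4, 7, 10],
                   'year': [2012, 2014, 2013, 2014],
                   'sale': [55, 40, 84, 31]})
df_mn = df.set_index('month')

# Without inplace=True, set_index returns a new DataFrame
# and leaves df_mn as it was.
result = df_mn.set_index('year')

print(df_mn.index.name)       # month
print(result.index.name)      # year
print('month' in df.columns)  # True
```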

2 Comments

Apart from Kay's answer, I am not sure what you are suggesting. Can you explain?
I just wanted to suggest that you can save your index as a column after calling set_index with the inplace parameter set to True. But the best method is reset_index. Besides, as I suggested, your df column isn't affected, so the month data isn't lost after all.
