
I have a pandas df. Say I have a column "activity" which can be "fun" or "work" and I want to convert it to an integer. What I do is:

df["activity_id"] = 1*(df["activity"]=="fun") + 2*(df["activity"]=="work") 

This works, but I only do it this way because I do not know how to put an if/else in there (and with 10 activities it gets complicated).

However, say I now have the opposite problem and want to convert from an id back to a string. I cannot use this trick anymore because I cannot multiply a string by a Boolean. How do I do it? Is there a way to use if/else?

3 Answers

6

You can create a dictionary with the id as the key and the string as the value, then use the Series.map method to convert the integer to a string.

my_map = {1:'fun', 2:'work'}
df['activity'] = df.activity_id.map(my_map)
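
One detail worth knowing about this approach is how it treats ids that are missing from the dictionary. The following is a minimal sketch, assuming a hypothetical id 3 with no entry in the mapping and a made-up placeholder label "unknown": map leaves such rows as NaN rather than raising an error, and fillna can supply a default.

```python
import pandas as pd

# Hypothetical data: id 3 has no entry in the mapping.
df = pd.DataFrame({"activity_id": [1, 2, 1, 3]})
my_map = {1: "fun", 2: "work"}

# Ids missing from the dictionary become NaN instead of raising.
df["activity"] = df["activity_id"].map(my_map)

# Optionally replace the NaN rows with a placeholder label.
df["activity_filled"] = df["activity"].fillna("unknown")

print(df)
```

Whether to keep the NaN or fill it depends on whether an unknown id should be treated as an error in your data.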

1 Comment

This is the best and easiest answer.
2

You could instead convert your activity column to categorical dtype:

df['activity'] = pd.Categorical(df['activity']) 

Then you would automatically have access to integer labels for the values via df['activity'].cat.codes.


import pandas as pd
df = pd.DataFrame({'activity':['fun','work','fun']})
df['activity'] = pd.Categorical(df['activity'])
print(df['activity'].cat.codes)
0    0
1    1
2    0
dtype: int8

Meanwhile the string values can still be accessed as before while saving memory:

print(df) 

still yields

  activity
0      fun
1     work
2      fun
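
For the reverse direction asked about in the question (id back to string), the codes can be turned into labels again as long as the category list is known. This is only a sketch, assuming the category order is fixed explicitly as ["fun", "work"]; note that the codes then start at 0 rather than at 1 as in the question.

```python
import pandas as pd

# Fix the category order explicitly so that the codes are predictable.
categories = ["fun", "work"]
activity = pd.Categorical(["fun", "work", "fun"], categories=categories)

# Forward: strings -> integer codes (0 for "fun", 1 for "work").
codes = activity.codes

# Reverse: integer codes -> strings, reusing the same category list.
restored = pd.Categorical.from_codes(codes, categories=categories)
labels = list(restored)

print(codes.tolist(), labels)
```

Passing the categories explicitly avoids relying on the default sorted order, which matters once the codes are stored somewhere and need to stay stable.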


1

You could also use a dictionary and list comprehension to recalculate values for an entire column. This makes it easy to define the reverse mapping as well:

>>> import pandas as pd
>>> forward_map = {'fun': 1, 'work': 2}
>>> # dict.items() replaces the Python 2-only iteritems()
>>> reverse_map = {v: k for k, v in forward_map.items()}
>>> df = pd.DataFrame(
...     {'activity': ['work', 'work', 'fun', 'fun', 'work'],
...      'detail': ['reports', 'coding', 'hiking', 'games', 'secret games']})
>>> df
  activity        detail
0     work       reports
1     work        coding
2      fun        hiking
3      fun         games
4     work  secret games
>>> df['activity'] = [forward_map[i] for i in df['activity']]
>>> df
   activity        detail
0         2       reports
1         2        coding
2         1        hiking
3         1         games
4         2  secret games
>>> df['activity'] = [reverse_map[i] for i in df['activity']]
>>> df
  activity        detail
0     work       reports
1     work        coding
2      fun        hiking
3      fun         games
4     work  secret games
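
If a literal if/else over the column is what you are after, as the question asks, numpy.select is one way to express it. This is a sketch rather than a recommendation over the dictionary approach; the id 3 and the "other" label are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: id 3 matches none of the conditions below.
df = pd.DataFrame({"activity_id": [1, 2, 1, 3]})

# Each condition pairs with the choice at the same position,
# like a chain of if / elif branches.
conditions = [df["activity_id"] == 1, df["activity_id"] == 2]
choices = ["fun", "work"]

# Rows matching no condition get the default, like a final else.
df["activity"] = np.select(conditions, choices, default="other")

print(df)
```

With many activities the condition list grows quickly, which is why a dictionary is usually easier to maintain.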
