Pandas Groupby and create new column with custom values

Question

Folks,

I've searched StackOverflow for my use-case but haven't been able to find anything useful. If you feel this problem is already solved, please point to the appropriate question.

Use-case.

I have the following data-frame.

    Maturity,Periods
    0.5,2
    0.5,2
    1.0,3
    1.0,3
    1.0,3

As you can see, each Maturity value is repeated as many times as the number in the Periods column. What I want is a new column that is 0 everywhere except for a single 1 in the last row of each Maturity group. So the expected dataframe is something like this:

    Maturity,Periods,CP
    0.5,2,0
    0.5,2,1
    1.0,3,0
    1.0,3,0
    1.0,3,1

As you can see in the expected dataframe, the number of 0s in the CP column is 1 less than the value in the Periods column and the remaining value is 1.

I tried the below pandas groupby operation but it fails.

    new_df['CP'] = new_df.groupby(['Maturity'])['Periods'].apply(lambda x: np.zeros((x-1, 1)) + np.array([1.0])).reset_index()

Can somebody point out where I am going wrong?
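A likely reason the attempt fails: inside `apply`, `x` is the whole `Periods` Series for one group, not a single number, so `x - 1` cannot be used as an array shape by `np.zeros`. Even with a scalar, the result would be one array per group rather than one value per row. The following is a minimal sketch of one groupby-based alternative, assuming the goal is simply to flag the last row of each `Maturity` group; the frame is rebuilt from the sample data above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Maturity": [0.5, 0.5, 1.0, 1.0, 1.0],
    "Periods": [2, 2, 3, 3, 3],
})

# cumcount(ascending=False) numbers the rows of each group from the end,
# so the last row of every Maturity group gets 0.
rows_from_end = df.groupby("Maturity").cumcount(ascending=False)

# Flag that last row with 1 and every other row with 0.
df["CP"] = rows_from_end.eq(0).astype(int)
print(df)
```

Because `cumcount` returns one value per original row, the result aligns with `df` directly and needs no `reset_index()`.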

UPDATED EDIT:

As a follow-up to the above question, how could the following be done with pandas operations?

Starting from the dataframe above, I want to create another new column, so that the expected output is something like this:

    Maturity,Periods,CP,TimeCF
    0.5,2,0,0.5
    0.5,2,1,0.5
    1.0,3,0,0.5
    1.0,3,0,1.0
    1.0,3,1,1.0
    1.5,4,0,0.5
    1.5,4,0,1.0
    1.5,4,0,1.5
    1.5,4,1,1.5

The new TimeCF column will hold the times of the cash flows (assuming the bond pays semi-annual cash flows).
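The thread never settles how TimeCF should be computed, so the following is only one reading of the expected output: within each Maturity group the time advances in steps of 0.5 (the assumed semi-annual interval) and is capped at the group's Maturity. The `step` name and the capping rule are assumptions inferred from the sample rows, not something the question states.

```python
import numpy as np
import pandas as pd

# Sample data matching the expected output in the question.
df = pd.DataFrame({
    "Maturity": [0.5] * 2 + [1.0] * 3 + [1.5] * 4,
    "Periods": [2] * 2 + [3] * 3 + [4] * 4,
})

step = 0.5  # assumed semi-annual interval, in years

# Position of each row within its Maturity group (1-based), scaled to years.
elapsed = (df.groupby("Maturity").cumcount() + 1) * step

# Cap at the group's maturity so the final row repeats the maturity date.
df["TimeCF"] = np.minimum(elapsed, df["Maturity"])
print(df)
```

On the sample data this reproduces the TimeCF values listed in the question; whether the rule holds for other bonds depends on what the extra final row is meant to represent.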

cs95 · Accepted Answer · 2018-12-04 06:02:38Z


Doesn't seem like you need a groupby here... try this:

    df['CP'] = 0
    df.loc[df['Maturity'].ne(df['Maturity'].shift(-1)), 'CP'] = 1
    print(df)

       Maturity  Periods  CP
    0       0.5        2   0
    1       0.5        2   1
    2       1.0        3   0
    3       1.0        3   0
    4       1.0        3   1
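The `shift` comparison marks a row whenever the next row has a different Maturity, so it relies on equal Maturity values sitting next to each other. If that cannot be guaranteed, a different technique, `Series.duplicated(keep='last')`, flags the last occurrence of each value regardless of row order. This is a sketch on deliberately shuffled sample data, not part of the original answer.

```python
import pandas as pd

# Same values as the question, but with the rows interleaved (assumed example).
df = pd.DataFrame({
    "Maturity": [0.5, 1.0, 0.5, 1.0, 1.0],
    "Periods": [2, 3, 2, 3, 3],
})

# duplicated(keep="last") is True for every occurrence except the last one,
# so negating it leaves True only on the last row of each Maturity value.
is_last = ~df["Maturity"].duplicated(keep="last")
df["CP"] = is_last.astype(int)
print(df)
```

On contiguous data such as the frame in the question, both approaches give the same CP column.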

If groupby is unavoidable, you can use it in a similar fashion as before:

    df['CP'] = 0
    df.loc[df.groupby('Maturity').apply(lambda x: x.index[-1]), 'CP'] = 1
    print(df)

       Maturity  Periods  CP
    0       0.5        2   0
    1       0.5        2   1
    2       1.0        3   0
    3       1.0        3   0
    4       1.0        3   1

edited Dec 4, 2018 at 6:02

answered Dec 4, 2018 at 6:00


6 Comments

Saurabh Gokhale Over a year ago

Yes, it worked. Didn't know about this ne operation before. Thanks.

Saurabh Gokhale Over a year ago

How would you solve the updated question edit? I tried using some combination of groupby and reset_index() but couldn't get as expected.

cs95 Over a year ago

@sgokhales I'm not at my desk now, so if you could wait a few hours, I'll take a look at it. Otherwise, if you're in a hurry, I'd suggest opening a new question as an extension to this one

Saurabh Gokhale Over a year ago

np. you can check later.

cs95 Over a year ago

@sgokhales Okay, can you please explain this: "The new column of TimeCF will have values of time of the cash flows (considering semi-annual cash flows of the bond)"? You just said a lot of things with little context (what are cash flows? bond? semi-annual?)
