Folks,

I've searched StackOverflow for my use-case but haven't been able to find anything useful. If you feel this problem is already solved, please point me to the appropriate question.

Use-case.

I have the following data-frame.

    Maturity,Periods
    0.5,2
    0.5,2
    1.0,3
    1.0,3
    1.0,3

As you can see, each maturity is repeated as many times as the value in the Periods column. What I want to accomplish is to create a new column that is 0 everywhere except for a single 1 in the last row of each maturity group. So the expected dataframe is something like this:

    Maturity,Periods,CP
    0.5,2,0
    0.5,2,1
    1.0,3,0
    1.0,3,0
    1.0,3,1

As you can see in the expected dataframe, the number of 0s in the CP column is 1 less than the value in the Periods column and the remaining value is 1.

I tried the below pandas groupby operation but it fails.

new_df['CP'] = new_df.groupby(['Maturity'])['Periods'].apply(lambda x: np.zeros((x-1, 1)) + np.array([1.0])).reset_index() 

Can somebody point out where I am going wrong?

UPDATED EDIT:

As a follow-up to the above question, how would the following be solved using pandas operations?

Starting from the dataframe above (extended with a 1.5 maturity), I want to create another new column, so that the expected output is something like this:

    Maturity,Periods,CP,TimeCF
    0.5,2,0,0.5
    0.5,2,1,0.5
    1.0,3,0,0.5
    1.0,3,0,1.0
    1.0,3,1,1.0
    1.5,4,0,0.5
    1.5,4,0,1.0
    1.5,4,0,1.5
    1.5,4,1,1.5

The new column of TimeCF will have values of time of the cash flows (considering semi-annual cash flows of the bond)

1 Answer

Doesn't seem like you need a groupby here... try this:

    df['CP'] = 0
    df.loc[df['Maturity'].ne(df['Maturity'].shift(-1)), 'CP'] = 1
    print(df)

       Maturity  Periods  CP
    0       0.5        2   0
    1       0.5        2   1
    2       1.0        3   0
    3       1.0        3   0
    4       1.0        3   1

If groupby is unavoidable, you can use it in a similar fashion as before:

    df['CP'] = 0
    df.loc[df.groupby('Maturity').apply(lambda x: x.index[-1]), 'CP'] = 1
    print(df)

       Maturity  Periods  CP
    0       0.5        2   0
    1       0.5        2   1
    2       1.0        3   0
    3       1.0        3   0
    4       1.0        3   1
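As a further variant, here is a minimal sketch that marks the last row of each maturity without `apply`. It assumes, as in the question's sample, that each maturity group has exactly `Periods` rows. The frame is rebuilt from the sample data, and the names `position` and `cp_alt` are only illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Maturity": [0.5, 0.5, 1.0, 1.0, 1.0],
    "Periods": [2, 2, 3, 3, 3],
})

# Position of each row within its maturity group: 0, 1, 0, 1, 2
position = df.groupby("Maturity").cumcount()

# Assumption: a group has exactly `Periods` rows, so its last row
# sits at position Periods - 1
df["CP"] = (position == df["Periods"] - 1).astype(int)

# Alternative that ignores Periods: a row is the last of its maturity
# if that maturity does not occur again further down the frame
cp_alt = (~df["Maturity"].duplicated(keep="last")).astype(int)

print(df)
```

The `cumcount` version ties the flag to the Periods column, which is closer to what the original attempt was reaching for; the `duplicated` version only looks at the Maturity column.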

6 Comments

Yes, it worked. Didn't know about this ne operation before. Thanks.
How would you solve the updated question edit? I tried using some combination of groupby and reset_index() but couldn't get as expected.
@sgokhales I'm not at my desk now, so if you could wait a few hours, I'll take a look at it. Otherwise, if you're in a hurry, I'd suggest opening a new question as an extension to this one
np. you can check later.
@sgokhales Okay, can you please explain this: "The new column of TimeCF will have values of time of the cash flows (considering semi-annual cash flows of the bond)"? You just said a lot of things with little context (what are cash flows? bond? semi-annual?).
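The follow-up was not resolved in this thread, but the expected output in the question does suggest a rule: within each maturity, TimeCF steps up by 0.5 per row and is capped at the maturity itself, so the last two rows of every group share the same time. Presumably those are the final coupon and the principal. The sketch below is one way to reproduce that table under this assumption; the 0.5 step and the capping rule are inferred from the sample output, not confirmed by the asker.

```python
import pandas as pd

# Sample data from the updated question (without the TimeCF column)
df = pd.DataFrame({
    "Maturity": [0.5, 0.5, 1.0, 1.0, 1.0, 1.5, 1.5, 1.5, 1.5],
    "Periods": [2, 2, 3, 3, 3, 4, 4, 4, 4],
    "CP": [0, 1, 0, 0, 1, 0, 0, 0, 1],
})

step = 0.5  # assumed: semi-annual cash flows, i.e. one every half year

# Time of the n-th row in each group if cash flows kept arriving: 0.5, 1.0, ...
elapsed = (df.groupby("Maturity").cumcount() + 1) * step

# Assumed rule: no cash flow is later than the maturity, so cap each value there
df["TimeCF"] = elapsed.clip(upper=df["Maturity"])

print(df)
```

If the real rule differs, for example if the principal row should carry a different time, only the `clip` line would need to change.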
