Let's say, I have the below data frame as sample.
name age status price 0 frank 12 1 100 1 jack 33 0 190 2 joe 22 1 200 ****************************** Desired output:
name age status price 0 frank 12 1 100 1 jack 33 0 190 2 jack NaN 0 -190 3 joe 22 1 200 ****************************** I also posted the sample data frame below, so you can test it easily.
df1 = pd.DataFrame({ "name":["frank", "jack", "joe"], "age": [12, 33, 22], "status": [1,0, 1], "price": [100,190, 200] }) As you can see, I want to insert a new row based on the above row, which status is 0, it's meant to be a transaction failure. to statistics more easily, I want to generate a new row below that one. I also want its price to be negative number. and since I don't care about whole column of the new row, so I want the other column be NaN, just like 'age' here in my desired output.
What I've tried so far.
import pandas as pd import numpy as np df1 = pd.DataFrame({ "name":["frank", "jack", "joe"], "age": [12, 33, 22], "status": [1,0, 1], "price": [100,190, 200] }) df2_list = [] for i, row in df1.iterrows(): if row["status"] == 0: origin_row = row.to_dict() new_row = ({ "name": origin_row.get("name"), #"age": origin_row.get("age"), "age": np.NaN, "status": origin_row.get("status"), "price": -origin_row.get("price"), }) df2_list.append(new_row) df2 = pd.DataFrame(df2_list) # concat df1 and df2 and sort it . df3 = pd.concat([df1, df2], ignore_index=True) df4 = df3.sort_values(['name', 'price'], ascending=[True, False]) print(df4) I have a loop, and check if it hit my condition status==0, and append it on my tmp list, and ...but it's too many code. I want to know is there any good way, I mean more Pythonic code or pandas has already got some function can it ?