
I have a dataframe with some replicated rows

    item  h2   h3   h4
    ------------------
    foo   v1   ...  ...
    foo   v2   ...  ...
    foo   v1   ...  ...
    foo   v2   ...  ...
    foo   v1   ...  ...
    foo   v2   ...  ...
    foo   v1   ...  ...
    foo   v2   ...  ...
    bar   v5   ...  ...
    bar   v6   ...  ...
    bar   v7   ...  ...
    bar   v5   ...  ...
    bar   v6   ...  ...
    bar   v7   ...  ...

My goal is to add a column (new_id) to this dataframe that numbers each duplicate block with an incrementing count, prefixed with the value in the item column. A block is a set of rows that have the same item name. If it helps, the replicated blocks will always be consecutive.

    item  h2   h3   h4   new_id
    ---------------------------
    foo   v1   ...  ...  foo1
    foo   v2   ...  ...  foo1
    foo   v1   ...  ...  foo2
    foo   v2   ...  ...  foo2
    foo   v1   ...  ...  foo3
    foo   v2   ...  ...  foo3
    foo   v1   ...  ...  foo4
    foo   v2   ...  ...  foo4
    bar   v5   ...  ...  bar1
    bar   v6   ...  ...  bar1
    bar   v7   ...  ...  bar1
    bar   v5   ...  ...  bar2
    bar   v6   ...  ...  bar2
    bar   v7   ...  ...  bar2

Suggestions on how to accomplish this?

2 Answers


Use str.cat() to concatenate the item column with the cumulative count of each group in h2. The cumulative count starts at zero, so offset it by 1.

    df['new_id'] = df.item.str.cat((df.groupby('h2').cumcount() + 1).astype(str), sep='')

       item  h2   h3   h4 new_id
    0   foo  v1  ...  ...   foo1
    1   foo  v2  ...  ...   foo1
    2   foo  v1  ...  ...   foo2
    3   foo  v2  ...  ...   foo2
    4   foo  v1  ...  ...   foo3
    5   foo  v2  ...  ...   foo3
    6   foo  v1  ...  ...   foo4
    7   foo  v2  ...  ...   foo4
    8   bar  v5  ...  ...   bar1
    9   bar  v6  ...  ...   bar1
    10  bar  v7  ...  ...   bar1
    11  bar  v5  ...  ...   bar2
    12  bar  v6  ...  ...   bar2
    13  bar  v7  ...  ...   bar2
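For anyone who wants to try the str.cat() approach end to end, here is a minimal runnable sketch. The sample frame is an assumption that mirrors the question (h3 and h4 are left out for brevity), and the variable name occurrence is only illustrative.

```python
import pandas as pd

# Hypothetical sample data mirroring the question (h3/h4 omitted for brevity).
df = pd.DataFrame({
    "item": ["foo"] * 8 + ["bar"] * 6,
    "h2": ["v1", "v2"] * 4 + ["v5", "v6", "v7"] * 2,
})

# cumcount() numbers each occurrence of an h2 value starting at 0, so add 1.
occurrence = (df.groupby("h2").cumcount() + 1).astype(str)

# Join item and the occurrence number with no separator, e.g. "foo" + "1".
df["new_id"] = df["item"].str.cat(occurrence, sep="")

print(df)
```

This relies on every h2 value appearing exactly once per block and on h2 values not being shared between different items, which holds for the data shown in the question.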


Use GroupBy.cumcount by both columns item and h2:

    df['new_id'] = df['item'] + '_' + df.groupby(['item','h2']).cumcount().add(1).astype(str)
    print(df)

       item  h2   h3   h4 new_id
    0   foo  v1  ...  ...  foo_1
    1   foo  v2  ...  ...  foo_1
    2   foo  v1  ...  ...  foo_2
    3   foo  v2  ...  ...  foo_2
    4   foo  v1  ...  ...  foo_3
    5   foo  v2  ...  ...  foo_3
    6   foo  v1  ...  ...  foo_4
    7   foo  v2  ...  ...  foo_4
    8   bar  v5  ...  ...  bar_1
    9   bar  v6  ...  ...  bar_1
    10  bar  v7  ...  ...  bar_1
    11  bar  v5  ...  ...  bar_2
    12  bar  v6  ...  ...  bar_2
    13  bar  v7  ...  ...  bar_2
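To illustrate why grouping by both item and h2 can matter, here is a small sketch on made-up data in which the h2 value "v1" occurs under both items. The frame and the id_h2_only column are assumptions added for comparison only; they are not part of the question's data.

```python
import pandas as pd

# Hypothetical data: the h2 value "v1" appears under both "foo" and "bar".
df = pd.DataFrame({
    "item": ["foo", "foo", "foo", "foo", "bar", "bar"],
    "h2":   ["v1", "v2", "v1", "v2", "v1", "v1"],
})

# Counting by h2 alone keeps counting "v1" across items.
by_h2 = df.groupby("h2").cumcount().add(1)

# Counting by (item, h2) restarts the count for each item.
by_both = df.groupby(["item", "h2"]).cumcount().add(1)

df["id_h2_only"] = df["item"] + "_" + by_h2.astype(str)
df["new_id"] = df["item"] + "_" + by_both.astype(str)

print(df)
```

With this data, the h2-only count labels the bar rows bar_3 and bar_4, while the two-column grouping gives bar_1 and bar_2. Drop the '_' separator if the IDs should look like foo1 as in the question.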

1 Comment

Thanks for the quick answer! I just updated my question to clarify the example. Sorry for the confusion. Can you look at it again and update your answer? Thanks a lot!
