I have the following data stored in a pandas.DataFrame object named df. The column id is a unique identifier and the remaining columns are irrelevant for the purpose of this question.
| | id | x1 | x2 | x3 |
|---|---|---|---|---|
| 0 | 01001 | 523.41 | 639673 | 1222.13 |
| 1 | 01002 | 54.832 | 33746 | 615.443 |
| 2 | 01003 | 48.3824 | 45196 | 934.142 |
I want to know if there's a way to group by id and use assign to add multiple numbered rows to each group.
In other words, I want to use range to add an arbitrary number of rows to each id. The desired result looks as follows:
| | id | x1 | x2 | x3 | new_col |
|---|---|---|---|---|---|
| 0 | 01001 | 523.41 | 639673 | 1222.13 | 2020 |
| 0 | 01001 | 523.41 | 639673 | 1222.13 | 2021 |
| 1 | 01002 | 54.832 | 33746 | 615.443 | 2020 |
| 1 | 01002 | 54.832 | 33746 | 615.443 | 2021 |
| 2 | 01003 | 48.3824 | 45196 | 934.142 | 2020 |
| 2 | 01003 | 48.3824 | 45196 | 934.142 | 2021 |
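For reference, the desired table is the cross product of the rows of df with the values of the range. A minimal sketch of that idea, assuming pandas 1.2 or later (for how="cross") and assuming id holds strings so the leading zeros survive; the names years and result are only illustrative:

```python
import pandas as pd

# Rebuild the sample data; id is assumed to be a string column.
df = pd.DataFrame({
    "id": ["01001", "01002", "01003"],
    "x1": [523.41, 54.832, 48.3824],
    "x2": [639673, 33746, 45196],
    "x3": [1222.13, 615.443, 934.142],
})

# One-column frame holding the values to attach to every row.
years = pd.DataFrame({"new_col": range(2020, 2022)})

# Cross join: every row of df is paired with every row of years.
result = df.merge(years, how="cross")
print(result)
```

Note that a cross merge renumbers the index from 0, so the repeated index labels shown in the table above are not preserved by this sketch.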
I was hoping something along these lines would work:

    df = df.groupby('id').assign(new_col=range(2020, 2022))
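As far as I can tell, the object returned by groupby has no assign method, so the line above fails with an AttributeError. A sketch that stays close to the assign idea, without any grouping, is to repeat each row once per value and then assign a tiled array. It assumes the index of df is unique; years and result are illustrative names:

```python
import numpy as np
import pandas as pd

# Rebuild the sample data; id is assumed to be a string column.
df = pd.DataFrame({
    "id": ["01001", "01002", "01003"],
    "x1": [523.41, 54.832, 48.3824],
    "x2": [639673, 33746, 45196],
    "x3": [1222.13, 615.443, 934.142],
})

years = np.arange(2020, 2022)

# Repeat every index label len(years) times, select those rows,
# then attach the years, tiled once per original row.
repeated = df.loc[df.index.repeat(len(years))]
result = repeated.assign(new_col=np.tile(years, len(df)))
print(result)
```

Unlike a cross merge, this keeps the original index labels, so each label appears once per added value.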