Suppose I have a pandas DataFrame that looks like this:

| Cluster | Variable | Group | Ratio | Value |
|---|---|---|---|---|
| 1 | GDP_M3 | GDP | 20% | 70% |
| 1 | HPI_M6 | HPI | 40% | 80% |
| 1 | GDP_lg2 | GDP | 35% | 50% |
| 2 | CPI_M9 | CPI | 10% | 50% |
| 2 | HPI_lg6 | HPI | 15% | 65% |
| 3 | CPI_lg12 | CPI | 15% | 90% |
| 3 | CPI_lg1 | CPI | 20% | 95% |
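
For reproducibility, the table above can be built as follows. This is a minimal sketch that assumes `Ratio` and `Value` are stored as float fractions (0.20 for 20%) rather than as percent strings; the real data may differ.

```python
import pandas as pd

# Example data from the table above.
# Assumption: percentages are stored as float fractions, not strings.
df = pd.DataFrame({
    "Cluster": [1, 1, 1, 2, 2, 3, 3],
    "Variable": ["GDP_M3", "HPI_M6", "GDP_lg2", "CPI_M9",
                 "HPI_lg6", "CPI_lg12", "CPI_lg1"],
    "Group": ["GDP", "HPI", "GDP", "CPI", "HPI", "CPI", "CPI"],
    "Ratio": [0.20, 0.40, 0.35, 0.10, 0.15, 0.15, 0.20],
    "Value": [0.70, 0.80, 0.50, 0.50, 0.65, 0.90, 0.95],
})
```
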

I would like to rank each Variable within its Cluster by Ratio and by Value, in two separate columns. Ratio should be ranked from lowest to highest, while Value should be ranked from highest to lowest.
There are some variables that I would prefer not to rank. In this example, the non-preferred group is CPI: any CPI variable (e.g., CPI_M9) should be left out of the ranking. The only exception is a Cluster that contains nothing but CPI variables, in which case they are ranked as usual.
The result of applying these conditions should look like the table below:

| Cluster | Variable | Group | Ratio | Value | RankRatio | RankValue |
|---|---|---|---|---|---|---|
| 1 | GDP_M3 | GDP | 20% | 70% | 1 | 2 |
| 1 | HPI_M6 | HPI | 40% | 80% | 3 | 1 |
| 1 | GDP_lg2 | GDP | 35% | 50% | 2 | 3 |
| 2 | CPI_M9 | CPI | 10% | 50% | NaN | NaN |
| 2 | HPI_lg6 | HPI | 15% | 65% | 1 | 1 |
| 3 | CPI_lg12 | CPI | 15% | 90% | 1 | 2 |
| 3 | CPI_lg1 | CPI | 20% | 95% | 2 | 1 |

For Cluster 1, GDP_M3 has the lowest Ratio (20%) and HPI_M6 has the highest Value (80%), so each is assigned rank 1 in its respective column, and the remaining variables follow in order.
For Cluster 2, CPI_M9 has the lowest Ratio, but CPI is not preferred, so it is left unranked and rank 1 is assigned to HPI_lg6.
For Cluster 3, all variables belong to the CPI Group and there are no other options to rank, so CPI_lg12 and CPI_lg1 are ranked by lowest Ratio and highest Value as usual.
I have some code that handles the general case, but it cannot handle the special case of a non-preferred group of variables:

```python
df['RankRatio'] = df.groupby(['Cluster'])['Ratio'].rank(method='first', ascending=True)
df['RankValue'] = df.groupby(['Cluster'])['Value'].rank(method='first', ascending=False)
```

Any help or suggestions would be appreciated. Thank you.
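
To make the rule concrete, here is a rough sketch of one direction that might work: mask out the non-preferred rows before ranking, unless the whole Cluster is non-preferred. It assumes the non-preferred group can be identified by `Group == "CPI"` and that `Ratio` and `Value` are numeric; the names `excluded`, `is_excluded`, `only_excluded`, and `eligible` are hypothetical and used only for illustration. I have only checked it against the example above, not against other cases.

```python
import pandas as pd

# Example data (assumption: percentages stored as float fractions).
df = pd.DataFrame({
    "Cluster": [1, 1, 1, 2, 2, 3, 3],
    "Variable": ["GDP_M3", "HPI_M6", "GDP_lg2", "CPI_M9",
                 "HPI_lg6", "CPI_lg12", "CPI_lg1"],
    "Group": ["GDP", "HPI", "GDP", "CPI", "HPI", "CPI", "CPI"],
    "Ratio": [0.20, 0.40, 0.35, 0.10, 0.15, 0.15, 0.20],
    "Value": [0.70, 0.80, 0.50, 0.50, 0.65, 0.90, 0.95],
})

excluded = "CPI"  # hypothetical name for the non-preferred group

# Rows that belong to the non-preferred group.
is_excluded = df["Group"].eq(excluded)
# True for every row of a cluster made up entirely of non-preferred rows.
only_excluded = is_excluded.groupby(df["Cluster"]).transform("all")
# A row is ranked if it is preferred, or if its cluster has no preferred rows.
eligible = ~is_excluded | only_excluded

# Masked rows become NaN, and rank() keeps NaN as NaN by default.
df["RankRatio"] = (
    df["Ratio"].where(eligible)
    .groupby(df["Cluster"])
    .rank(method="first", ascending=True)
)
df["RankValue"] = (
    df["Value"].where(eligible)
    .groupby(df["Cluster"])
    .rank(method="first", ascending=False)
)
```

The ranks come back as floats because of the NaN entries, so they would need a nullable integer type such as `Int64` to display exactly like the expected table.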