The dataframe below has 4 columns: runner, race_date, height_in_inches, top_ten_finish.
I want to group by race_date and, for each runner who finished in the top ten on that race_date, rank their height_in_inches among only the runners who also finished in the top ten on that race_date. How would I do this?
This is the original dataframe:
    >>> import pandas as pd
    >>> d = {"runner":['mike','paul','jim','dave','douglas'],
    ...      "race_date":['2019-02-02','2019-02-02','2020-02-02','2020-02-01','2020-02-01'],
    ...      "height_in_inches":[72,68,70,74,73],
    ...      "top_ten_finish":["yes","yes","no","yes","no"]}
    >>> df = pd.DataFrame(d)
    >>> df
        runner   race_date  height_in_inches top_ten_finish
    0     mike  2019-02-02                72            yes
    1     paul  2019-02-02                68            yes
    2      jim  2020-02-02                70             no
    3     dave  2020-02-01                74            yes
    4  douglas  2020-02-01                73             no

And this is what I'd like the result to look like. Notice how if they didn't finish in the top 10 of a race, they get a value of 0 for that new column.
        runner   race_date  height_in_inches top_ten_finish  if_top_ten_height_rank
    0     mike  2019-02-02                72            yes                       1
    1     paul  2019-02-02                68            yes                       2
    2      jim  2020-02-02                70             no                       0
    3     dave  2020-02-01                74            yes                       1
    4  douglas  2020-02-01                73             no                       0
Thank you!
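One possible approach, offered as a minimal sketch rather than a definitive answer: filter to the top-ten rows first, rank within each race_date, then align the ranks back to the full frame and fill the rest with 0. It assumes the tallest runner gets rank 1 (which matches the expected output above) and that ties share the lowest rank (`method="min"`); the question does not say how ties should be handled, so that choice is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    "runner": ["mike", "paul", "jim", "dave", "douglas"],
    "race_date": ["2019-02-02", "2019-02-02", "2020-02-02", "2020-02-01", "2020-02-01"],
    "height_in_inches": [72, 68, 70, 74, 73],
    "top_ten_finish": ["yes", "yes", "no", "yes", "no"],
})

# Boolean mask selecting only the runners who finished in the top ten.
top_ten = df["top_ten_finish"].eq("yes")

# Rank heights within each race_date, using the top-ten rows only.
# ascending=False makes the tallest runner rank 1 (assumed from the expected output);
# method="min" is an assumed tie-breaking rule.
ranks = (
    df.loc[top_ten]
      .groupby("race_date")["height_in_inches"]
      .rank(method="min", ascending=False)
)

# Align back to the full index; rows outside the top ten get 0.
df["if_top_ten_height_rank"] = ranks.reindex(df.index, fill_value=0).astype(int)

print(df)
```

Filtering before grouping keeps non-top-ten runners out of the ranking entirely, so their heights cannot shift anyone else's rank.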