I have a PySpark DataFrame:

| ID      | program |
|---------|---------|
| 53-8975 | null    |
| 53-9875 | null    |
| 53A7569 |         |
| 53-9456 | XXXX    |
| 53-9875 |         |

Both ID and program are strings. I want to replace every null or "" value in the program column with the letter K, but only if the 4th character of the ID column is 9. For example:
Two rows have an ID whose 4th character is 9 and an empty program: both have ID 53-9875, and their program values are null and "" respectively. (53-9456 also has a 9 in that position, but its program is already XXXX, so it should stay unchanged.)
How can I fill the program column with the letter K when the 4th character of ID is 9 and program is null or "", using PySpark?
The resulting DataFrame should be:

| ID      | program |
|---------|---------|
| 53-8975 | null    |
| 53-9875 | K       |
| 53A7569 |         |
| 53-9456 | XXXX    |
| 53-9875 | K       |