Let's say I have a pyspark DF:
| Column A | Column B |
|---|---|
| val1 | val1B |
| null | val2B |
| val2 | null |
| val3 | val3B |

Can someone help me replace every null value in any column (across the whole DataFrame) with the value from the row directly below it? The final table should look like this:
| Column A | Column B |
|---|---|
| val1 | val1B |
| val2 | val2B |
| val2 | val3B |
| val3 | val3B |
How could this be done? Can I get a code demo if possible? Thank you!
So far, all I've managed is adding a row number column and filtering down to the rows that contain nulls, which leaves me with a table like this:
| Column A | Column B | row_num |
|---|---|---|
| null | val2B | 2 |
| val2 | null | 3 |
But I don't think this intermediate step is actually needed, and I'm stuck on what to do next.