
I have the DataFrame below, with null values in some columns.

I need to replace those null values with 'NA'.

+-------+------+-----+------+----+
|Product|Canada|China|Mexico| USA|
+-------+------+-----+------+----+
| Orange|  null| 4000|  null|4000|
|  Beans|  null| 1500|  2000|1600|
| Banana|  2000|  400|  null|1000|
|Carrots|  2000| 1200|  null|1500|
+-------+------+-----+------+----+

I found the method 'fillna' for replacing null values.

However, I need to replace the nulls in every column that contains them.

So I am looking for something like this, or a better way:

replaced = df.fillna({col: 'NA' for col in df.columns})

I'd appreciate any help finding the right approach.

Thanks

  • What is the data type of these columns (other than product)? Can you add the schema? Commented Nov 5, 2020 at 10:27

1 Answer


Use the subset parameter of fillna() and pass the names of the columns whose null values you want to fill:

df = df.fillna(0, subset=['Canada', 'China', 'Mexico', 'USA'])

Or, if you want a different fill value for each column, pass a dictionary that maps each column name to the value of your choice:

df = df.fillna({'Canada': '4', 'China': '5', 'Mexico': '6', 'USA': '7'})

Or you can simply use the call below to fill the null values in all columns with a single value:

df = df.fillna("a_value")
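
One caveat that may matter here: as far as I know, fillna() only takes effect when the replacement value matches the column's type, so a string such as "a_value" fills string columns and leaves numeric ones as null. The sketch below builds a type-matched dictionary from df.dtypes. It assumes the country columns are integer-typed (the question does not show the schema), and the names NUMERIC_TYPES and build_fill_values are hypothetical helpers, not part of PySpark.

```python
# Spark SQL type names as they appear in DataFrame.dtypes (simple numeric types only).
NUMERIC_TYPES = ("tinyint", "smallint", "int", "bigint", "float", "double")


def build_fill_values(dtypes, text="NA", number=0):
    """Map each column name to a replacement that matches its type.

    `dtypes` has the shape of DataFrame.dtypes: a list of (name, type) pairs.
    Columns of any other type are left out and therefore not filled.
    """
    fill = {}
    for name, dtype in dtypes:
        if dtype == "string":
            fill[name] = text
        elif dtype in NUMERIC_TYPES:
            fill[name] = number
    return fill


# Usage with a real DataFrame (requires pyspark):
# df = df.fillna(build_fill_values(df.dtypes))
```

This keeps the numeric columns numeric, so their nulls become 0 rather than 'NA'. If the literal text 'NA' is required in those columns, they have to be cast to string first.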


4 Comments

Hi @dsk, I tried all the approaches you suggested, but none of them gave the expected output. The result still contained null values.
Can you try casting the columns to StringType, then filling the nulls with 'NA', and check?
Please let me know where I am supposed to convert to a string.
df = df.withColumn("Canada", F.col("Canada").cast(T.StringType())) - try this (here F is pyspark.sql.functions and T is pyspark.sql.types)
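
The cast from the last comment can be applied to every non-string column before filling, which is probably what the question needs, since a string replacement does not take effect on numeric columns. This is a minimal sketch rather than a tested solution for the asker's data: it assumes the country columns are numeric, and non_string_columns and fill_nulls_with_text are hypothetical helper names.

```python
def non_string_columns(dtypes):
    """Return the names of columns whose Spark type is not 'string'.

    `dtypes` has the shape of DataFrame.dtypes: a list of (name, type) pairs.
    """
    return [name for name, dtype in dtypes if dtype != "string"]


def fill_nulls_with_text(df, text="NA"):
    """Cast every non-string column to string, then fill nulls with `text`."""
    # Imported here so the helper above can be used without pyspark installed.
    from pyspark.sql import functions as F

    for name in non_string_columns(df.dtypes):
        df = df.withColumn(name, F.col(name).cast("string"))
    # All columns are strings now, so the string replacement applies everywhere.
    return df.fillna(text)


# Usage (requires pyspark and an existing DataFrame `df`):
# replaced = fill_nulls_with_text(df, "NA")
```

The trade-off is that the numeric columns become strings afterwards, so any later arithmetic on them would need a cast back.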
