
I am trying to replace the null values with N/A. I have tried the following code, but neither version works:

```scala
df.withColumn("series_name", when($"series_name") === null, "n/a").otherwise($series_name)
```

and

```scala
df.withColumn("series_name", when(col("series_name") === null, "n/a")
```

what am I missing?

```
+--------------------+
|         series_name|
+--------------------+
|Families of the M...|
|                null|
|      Ridiculousness|
|                null|
|                null|
+--------------------+
```

2 Answers

You could also use the `.fillna()` method:

```python
df.fillna('N/A', subset=['series_name'])
```

I prefer to use `coalesce`.

```python
from pyspark.sql import functions as f

df.withColumn('series_name', f.expr("coalesce(series_name, 'n/a')"))
```

4 Comments

What does `coalesce` do? Do you have a doc reference?
It is the same as MySQL's `ifnull` function.
Yeah, not a fan of SQL :) The strange thing is that it is the same word as the operation that reduces the partition count of a dataset. That's a little confusing.
It is a Spark SQL function and you can use it like `col` or `lit`. I know it shares its name with the partition-reducing method :)
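To make the semantics discussed above concrete: SQL's `coalesce` returns the first non-null argument. That behavior can be sketched in plain Python (this helper is a hypothetical illustration of the semantics, not Spark's implementation):

```python
def coalesce(*values):
    """Return the first argument that is not None, mirroring SQL COALESCE."""
    for v in values:
        if v is not None:
            return v
    return None

print(coalesce(None, "n/a"))              # n/a
print(coalesce("Ridiculousness", "n/a"))  # Ridiculousness
print(coalesce(None, None))               # None
```

So `coalesce(series_name, 'n/a')` leaves non-null names untouched and substitutes `'n/a'` only where `series_name` is null.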
