
I have a dataframe (denoted as 'df') where some values are missing in a column (denoted as 'col1').

I applied a set function to find unique values in the column:

print(set(df['col1']))

Output: {0.0, 1.0, 2.0, 3.0, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan}

I am trying to drop the rows containing these 'nan' values from the dataframe. I tried this:

df['col1'] = df['col1'].dropna() 

However, the rows in the column remain unchanged.

I suspect that the repeated 'nan' values in the set above may not be normal behaviour.

Any suggestions on how to remove these values?

2 Answers


I think what you're doing is taking one column from a DataFrame, removing all the NaNs from it, and then assigning that column back to the same DataFrame. The assignment aligns on the index, so any index labels missing from the shortened column are filled with NaN again.

Do you want to remove those rows from the entire DataFrame? If yes, try df = df.dropna(subset=["col1"]). Note that dropna returns a new DataFrame, so the result has to be assigned back.
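To make the index alignment concrete, here is a minimal sketch. The sample data and the names shorter, nan_count_after_assign and cleaned are made up for illustration and are not from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's df
df = pd.DataFrame({
    "col1": [0.0, np.nan, 1.0, np.nan, 2.0],
    "col2": list("abcde"),
})

# dropna() on the column returns a shorter Series (index labels 0, 2, 4)
shorter = df["col1"].dropna()

# Assigning it back aligns on the index, so labels 1 and 3 get NaN again
df["col1"] = shorter
nan_count_after_assign = int(df["col1"].isna().sum())

# Dropping on the DataFrame removes the whole rows; the result is a new object
cleaned = df.dropna(subset=["col1"])

print(nan_count_after_assign)
print(cleaned)
```

The shortened Series has three values, but after the assignment the column still contains both NaNs, which is why nothing appeared to change. Only the call on the DataFrame itself removes the rows.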


Marko Knöbl explains it well: the problem is that you assign the Series with the rows dropped back to the column. You can also try:

df = df[df['col1'].notna()] 
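A small sketch of the boolean-mask approach, again with made-up sample data (the names mask and unique_values are only for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the asker's df
df = pd.DataFrame({"col1": [0.0, np.nan, 1.0, np.nan, 3.0]})

# True where col1 holds a value, False where it is NaN
mask = df["col1"].notna()

# Keep only the rows where the mask is True
df = df[mask]

# Series.unique() lists the distinct values that remain
unique_values = sorted(df["col1"].unique())

# Optional: renumber the index after filtering
df = df.reset_index(drop=True)

print(unique_values)
print(df)
```

As a side note, Series.unique() is probably a better check than set() here: it reports NaN at most once, whereas set() can show several nan entries because nan does not compare equal to itself.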
