-1

I want to create a new column that is 0 where col1 is NA and 1 where it is not missing.

#df
col1
1
3
NaN
5
NaN
6

What I want:

#df
col1  NewCol
1     1
3     1
NaN   0
5     1
NaN   0
6     1

This is what I tried:

df['NewCol']=df['col1'].fillna(0)
df['NewCol']=df['col1'].replace(df['col1'].notnull(), 1)

It seems that the second line is incorrect.
Any suggestions?

  • df['NewCol']=df['col1'].notna().astype(int) Commented Jul 1, 2019 at 18:09
  • @WeNYoBen TypeError: data type not understood Commented Jul 1, 2019 at 18:12
  • @PeterChen pass it as a string, 'int', or use a NumPy data type: docs.scipy.org/doc/numpy-1.14.0/reference/arrays.dtypes.html Commented Jul 1, 2019 at 18:23

2 Answers

1

You can try:

df['NewCol'] = [*map(int, pd.notnull(df.col1))] 

Hope this helps.
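
To make this concrete, here is a minimal sketch that rebuilds the question's column (assuming the blank rows are NaN) and compares the list-based version above with the vectorized form suggested in the comments. The helper name `add_flag_column` and its parameters are hypothetical, introduced only for illustration. The dtype is passed as the string 'int64', following the comment thread.

```python
import numpy as np
import pandas as pd


def add_flag_column(df, source="col1", target="NewCol"):
    """Return a copy of df where target is 1 if source is present, else 0."""
    out = df.copy()
    # notna() yields a boolean Series; casting to an integer dtype maps
    # True/False to 1/0.
    out[target] = out[source].notna().astype("int64")
    return out


# Sample data assumed from the question: missing values are NaN.
df = pd.DataFrame({"col1": [1, 3, np.nan, 5, np.nan, 6]})

result = add_flag_column(df)

# The list-based version from this answer, kept for comparison.
list_based = [*map(int, pd.notnull(df.col1))]

print(result)
print(list_based)
```

Both versions should produce the same flags; the vectorized form avoids building a Python list and is probably the more common pandas idiom.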


1

First, convert all NA values to 0. How you do this depends on scope. For a single column you can use:

df['DataFrame Column'] = df['DataFrame Column'].fillna(0) 

For the whole dataframe you can use:

df = df.fillna(0)  # fillna returns a new DataFrame, so assign the result back

After this, replace every nonzero value with 1. You could do this like so:

for index, entry in df['col'].items():
    if entry != 0:
        df.loc[index, 'col'] = 1  # .loc avoids chained assignment

Note that this method counts 0 as an empty entry, which may or may not be the desired functionality.
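
The caveat above can be illustrated with a short sketch. The data here is hypothetical (one genuine 0 and one missing value) and is not taken from the question. It compares the fill-then-flag approach from this answer with a direct missingness check.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a genuine 0 in the first row and a NaN in the third.
df = pd.DataFrame({"col": [0, 2, np.nan, 7]})

# Approach from this answer: fill NA with 0, then flag the nonzero entries.
filled = df["col"].fillna(0)
flag_via_fill = (filled != 0).astype("int64")

# Checking for missingness directly keeps the genuine 0 flagged as present.
flag_via_notna = df["col"].notna().astype("int64")

print(flag_via_fill.tolist())   # [0, 1, 0, 1]
print(flag_via_notna.tolist())  # [1, 1, 0, 1]
```

If a real 0 should count as a present value, the `notna()` form is likely the safer choice.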
