Instead of doing this:

    df['A'] = df['A'] if 'A' in df else None
    df['B'] = df['B'] if 'B' in df else None
    df['C'] = df['C'] if 'C' in df else None
    df['D'] = df['D'] if 'D' in df else None
    ...

I want to do this in one line or function. Below is what I tried:

    def populate_columns(df):
        col_names = ['A', 'B', 'C', 'D', 'E', 'F', ...]
        def populate_column(df, col_name):
            df[col_name] = df[col_name] if col_name in df else None
            return df[col_name]
        df[col_name] = df.apply(lambda x: populate_column(x) for x in col_names)
        return df

But I just get Exception has occurred: ValueError. What can I do here?

1 Answer

Looks like you can replace your whole code with a reindex:

    ensure_cols = ['A', 'B', 'C', 'D']
    df = df.reindex(columns=df.columns.union(ensure_cols))

NB. By default the fill value is NaN; if you really want None, you could try fill_value=None (though, as reported in the comments, this may still give NaN).
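To make the reindex approach concrete, here is a minimal sketch on a made-up frame (the sample data and the extra column 'X' are assumptions for illustration, not from the question). Note that union typically returns the labels sorted, so the column order may change.

```python
import pandas as pd

# Hypothetical sample frame: has 'A' and an extra column 'X', lacks 'B', 'C', 'D'.
df = pd.DataFrame({"A": [1, 2], "X": [3, 4]})

ensure_cols = ["A", "B", "C", "D"]

# Union of existing and required labels: nothing is dropped, missing ones are added.
out = df.reindex(columns=df.columns.union(ensure_cols))

print(list(out.columns))       # all of A, B, C, D and X are present
print(out["B"].isna().all())   # the newly added columns hold missing values
```

Existing columns keep their values; only the newly created ones are filled with the missing-value marker.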

If you want to fix your code, just use a single loop:

    col_names = ['A', 'B', 'C', 'D']
    for c in col_names:
        if c not in df:
            df[c] = None
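If you want that loop as a reusable function, as in the question, a sketch could look like the following. The signature (taking col_names as a parameter) and the choice to work on a copy rather than modify the input are my own assumptions, not something the question requires.

```python
import pandas as pd

def populate_columns(df, col_names):
    """Return a copy of df with every missing column from col_names added as None."""
    out = df.copy()  # assumption: leave the caller's frame untouched
    for c in col_names:
        if c not in out:
            out[c] = None
    return out

# Hypothetical sample data for illustration.
df = pd.DataFrame({"A": [1, 2]})
result = populate_columns(df, ["A", "B", "C"])
print(list(result.columns))
```

Drop the copy() call if you prefer to add the columns to the original frame in place.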

4 Comments

I tried using fill_value=None, but it still returns NaN.
@amnesic Strange, indeed. Then a workaround could be reindex(..., fill_value='None').replace('None', None); if 'None' can appear in the data, use any string that you know is not in it.
What's the point of self.df.columns.union(ensure_columns)? I can just pass in ensure_columns, can't I?
It is just in case you have original columns that are not in the list.
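A small sketch of the difference, using a made-up frame with an extra column 'X' (an assumption for illustration): reindexing with only the list keeps exactly the listed columns, while the union also keeps the originals.

```python
import pandas as pd

# Hypothetical frame with one column ('X') that is not in the list.
df = pd.DataFrame({"A": [1], "X": [2]})
ensure_cols = ["A", "B"]

# Passing only the list: the result has exactly the listed columns, so 'X' is lost.
only_list = df.reindex(columns=ensure_cols)

# Passing the union: 'X' survives and 'B' is added.
with_union = df.reindex(columns=df.columns.union(ensure_cols))

print(list(only_list.columns))
print(list(with_union.columns))
```

So passing just the list is fine when it already names every column you want to keep.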
