The following code transforms a given pandas column FEAT into a new binary feature named STREAM. The program works as long as there are no NaN values in the original dataframe. If there are NaN values, the following exception occurs: ValueError: Length of values does not match length of index. I need to carry the NaN values over to the new column. Is that doable? Here is the code that fails:
import pandas as pd
import numpy as np

data = {
    'FEAT': [8, 15, 7, np.nan, 5, 2, 11, 15]
}
customer = pd.DataFrame(data)
customer = pd.DataFrame(data, index=['June', 'Robert', 'Lily', 'David', 'Bob', 'Sally', 'Mia', 'Luis'])

# create binary variable STREAM  0: mainstream  1: avantgarde
stream_0 = [1, 3, 5, 8, 10, 12, 14]
stream_1 = [2, 4, 6, 7, 9, 11, 13, 15]

# convert FEAT to list_0
list_0 = customer['FEAT'].values.tolist()

# create a list of length = len(customer) whose elements are:
# 0 if the value of 'FEAT' is in stream_0
# 1 if the value of 'FEAT' is in stream_1
L = []
for i in list_0:
    if i in stream_0:
        L.append(0)
    elif i in stream_1:
        L.append(1)

# convert the list to a new column of customer df
customer['STREAM'] = L
print(customer)
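The loop needs an else block, which is missing: a value that is in neither stream_0 nor stream_1 (such as NaN) appends nothing to L, so L ends up shorter than the dataframe and the assignment fails.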
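A minimal sketch of that fix, assuming NaN is the value you want in STREAM whenever FEAT matches neither list. It reuses the names from the question; the only new line is the else branch, which keeps L the same length as the dataframe.

```python
import numpy as np
import pandas as pd

customer = pd.DataFrame(
    {'FEAT': [8, 15, 7, np.nan, 5, 2, 11, 15]},
    index=['June', 'Robert', 'Lily', 'David', 'Bob', 'Sally', 'Mia', 'Luis'],
)

# 0: mainstream, 1: avantgarde
stream_0 = [1, 3, 5, 8, 10, 12, 14]
stream_1 = [2, 4, 6, 7, 9, 11, 13, 15]

L = []
for i in customer['FEAT'].values.tolist():
    if i in stream_0:
        L.append(0)
    elif i in stream_1:
        L.append(1)
    else:
        # Assumption: anything in neither list (including NaN) becomes NaN,
        # so exactly one element is appended per row.
        L.append(np.nan)

customer['STREAM'] = L
print(customer)
```

Because the column now contains NaN, pandas stores STREAM as floats (0.0 and 1.0) rather than integers.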