I am trying to create the following function. However, when I assign its result back to the original dataframe, the dataframe becomes empty.
def remove_outliers(feature, df):
    q1 = np.percentile(df[feature], 25)
    q2 = np.percentile(df[feature], 50)
    q3 = np.percentile(df[feature], 75)
    iqr = q3-q1
    lower_whisker = df[df[feature] <= q1-1.5*iqr][feature].max()
    upper_whisker = df[df[feature] <= q3+1.5*iqr][feature].max()
    return df[(df[feature] < upper_whisker) & (df[feature]>lower_whisker)]

I am assigning as follows:
train = remove_outliers('Power',train)
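`lower_whisker` and/or `upper_whisker` are set to `NaN`, hence the result from the function is an empty DataFrame.

This happens when `df[df[feature] <= q1-1.5*iqr][feature]` or `df[df[feature] <= q3+1.5*iqr][feature]` comes out empty. Calling `.max()` on an empty selection returns `NaN`, and every comparison against `NaN` is False, so the result is an empty DataFrame. The lower whisker is the likely culprit: if no value lies below `q1-1.5*iqr`, that selection has no rows.

The lower whisker should instead be the smallest value at or above the lower bound:

`train[train['Power'] >= q1-1.5*iqr]['Power'].min()`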
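Putting that together, a corrected version of the function might look like the sketch below. It assumes the column has no missing values (`np.percentile` returns `NaN` if it does), and it uses inclusive bounds so that rows equal to a whisker are kept, which is a choice of mine rather than something stated in the question. The sample `train` data is hypothetical and only there to make the example runnable.

```python
import numpy as np
import pandas as pd


def remove_outliers(feature, df):
    """Keep rows whose `feature` value lies within the box-plot whiskers."""
    q1 = np.percentile(df[feature], 25)
    q3 = np.percentile(df[feature], 75)
    iqr = q3 - q1

    # Lower whisker: smallest observation at or above the lower fence.
    lower_whisker = df.loc[df[feature] >= q1 - 1.5 * iqr, feature].min()
    # Upper whisker: largest observation at or below the upper fence.
    upper_whisker = df.loc[df[feature] <= q3 + 1.5 * iqr, feature].max()

    # Inclusive bounds (an assumption): rows equal to a whisker are kept.
    return df[(df[feature] >= lower_whisker) & (df[feature] <= upper_whisker)]


# Hypothetical data: 'Power' values with one obvious outlier (500).
train = pd.DataFrame({"Power": [70, 74, 80, 85, 88, 90, 95, 100, 500]})
train = remove_outliers("Power", train)
```

Because both selections now look inside the fences rather than outside them, they contain at least the quartile values, so neither whisker should come out as `NaN` for a non-empty column without missing values.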