I have a list of special characters. For example:

    BAD_CHARS = ['.', '&', '\(', '\)', ';', '-']

I want to remove all the rows of a pandas DataFrame in which a column contains any of these special characters. Currently I am doing the following:

    df = '''
    words        frequency
    &            11
    CONDUCTED    3
    (E.G.,       5
    EXPERIMENT   6
    (VS.         5
    (WARD        3
    -            14
    2006;        3
    3D           5
    ABLE         5
    ABSTRACT     3
    ACCOMPANIED  5
    ACTIVITY     11
    AD           5
    ADULTS       6
    '''

    for char in BAD_CHARS:
        df = df[~df['word'].str.contains(char)]

    # Expected Result
    words        frequency
    CONDUCTED    3
    EXPERIMENT   6
    3D           5
    ABLE         5
    ABSTRACT     3
    ACCOMPANIED  5
    ACTIVITY     11
    AD           5
    ADULTS       6

First, it is not working, and second, I guess it is not fast. How can I do this in a faster way? Thanks.

Define the list without the backslash escapes:

    BAD_CHARS = ['.', '&', '(', ')', ';', '-']

Next, you can either use a character class or use re.escape. Something like this:

    df[~df['words'].str.contains("[{}]".format(''.join(BAD_CHARS)))]
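
For reference, here is a minimal sketch of both options mentioned above, run on a shortened copy of the sample data. It assumes the column is named `words`, as in the expected result. Applying `re.escape` to each character inside the character class is an extra precaution that the one-liner above does not take; it only matters if the list later gains characters such as `]` or `^`. The names `char_class`, `alternation`, `by_class`, and `by_alternation` are made up for illustration.

```python
import re

import pandas as pd

BAD_CHARS = ['.', '&', '(', ')', ';', '-']

# Shortened copy of the sample data from the question.
df = pd.DataFrame({
    "words": ["&", "CONDUCTED", "(E.G.,", "EXPERIMENT", "(VS.",
              "(WARD", "-", "2006;", "3D", "ABLE"],
    "frequency": [11, 3, 5, 6, 5, 3, 14, 3, 5, 5],
})

# Option 1: a single character class, with each character escaped.
char_class = "[{}]".format("".join(re.escape(c) for c in BAD_CHARS))
by_class = df[~df["words"].str.contains(char_class, regex=True)]

# Option 2: an alternation of escaped literals.
alternation = "|".join(re.escape(c) for c in BAD_CHARS)
by_alternation = df[~df["words"].str.contains(alternation, regex=True)]

print(by_class)
```

Either way the whole column is scanned once with one pattern, instead of once per character as in the loop from the question.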