I have a list of special characters. For example:

BAD_CHARS = ['.', '&', '\(', '\)', ';', '-'] 

I want to remove every row of a pandas DataFrame whose column value contains any of these special characters. Currently I am doing the following:

df = '''
words frequency
& 11
CONDUCTED 3
(E.G., 5
EXPERIMENT 6
(VS. 5
(WARD 3
- 14
2006; 3
3D 5
ABLE 5
ABSTRACT 3
ACCOMPANIED 5
ACTIVITY 11
AD 5
ADULTS 6
'''

for char in BAD_CHARS:
    df = df[~df['word'].str.contains(char)]

# Expected Result
words frequency
CONDUCTED 3
EXPERIMENT 6
3D 5
ABLE 5
ABSTRACT 3
ACCOMPANIED 5
ACTIVITY 11
AD 5
ADULTS 6

First, it is not working, and second, I suspect that looping over the characters is slow. How can I do this in a faster way? Thanks.

  • @Zero mark it, please. Commented Jan 17, 2018 at 13:17
  • First, don't escape the parentheses: BAD_CHARS = ['.', '&', '(', ')', ';', '-']. Next, you can either use a character class or use re.escape. Something like this: df[~df['words'].str.contains("[{}]".format(''.join(BAD_CHARS)))] Commented Jan 17, 2018 at 13:20
  • If you have issues copying that, just type it out. Commented Jan 17, 2018 at 13:24
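
The character-class idea from the comment above can be tried without a DataFrame. This is only a minimal sketch using the standard re module; the words list and the names char_class and kept are made up for illustration, and re.escape is applied here as an extra precaution that the comment itself does not use.

```python
import re

BAD_CHARS = ['.', '&', '(', ')', ';', '-']

# Build one character class such as [\.\&\(\);\-]; re.escape keeps every
# character literal, including the hyphen, regardless of its position.
char_class = '[{}]'.format(re.escape(''.join(BAD_CHARS)))

# Hypothetical sample values taken from the question's data.
words = ['&', 'CONDUCTED', '(E.G.,', '3D', '2006;', 'ABLE']

# Keep only the words that contain none of the bad characters.
kept = [w for w in words if not re.search(char_class, w)]
print(kept)
```

With pandas, the same char_class string could be passed to df['words'].str.contains(char_class) and negated with ~.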

1 Answer

I believe you first need to escape the values and then join them with |. Also, as @cᴏʟᴅsᴘᴇᴇᴅ pointed out, remove the \ from the values in BAD_CHARS:

import re

BAD_CHARS = ['.', '&', '(', ')', ';', '-']
pat = '|'.join(['({})'.format(re.escape(c)) for c in BAD_CHARS])

df = df[~df['words'].str.contains(pat)]
print(df)
          words  frequency
1     CONDUCTED          3
3    EXPERIMENT          6
8            3D          5
9          ABLE          5
10     ABSTRACT          3
11  ACCOMPANIED          5
12     ACTIVITY         11
13           AD          5
14       ADULTS          6
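
For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the sample data from the question as a DataFrame. The variable name clean is an assumption for illustration, and this variant leaves out the capturing parentheses around each escaped character, since they are not needed for matching and pandas may warn about match groups in str.contains.

```python
import re

import pandas as pd

BAD_CHARS = ['.', '&', '(', ')', ';', '-']

# Sample data copied from the question.
df = pd.DataFrame({
    'words': ['&', 'CONDUCTED', '(E.G.,', 'EXPERIMENT', '(VS.', '(WARD', '-',
              '2006;', '3D', 'ABLE', 'ABSTRACT', 'ACCOMPANIED', 'ACTIVITY',
              'AD', 'ADULTS'],
    'frequency': [11, 3, 5, 6, 5, 3, 14, 3, 5, 5, 3, 5, 11, 5, 6],
})

# Escape each character so it is matched literally, then join with |.
pat = '|'.join(re.escape(c) for c in BAD_CHARS)

# One vectorized pass over the column instead of one pass per character.
clean = df[~df['words'].str.contains(pat, regex=True)]
print(clean)
```

Because the pattern is applied in a single str.contains call, the column is scanned once rather than once per entry in BAD_CHARS, which addresses the speed concern in the question.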

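because joining the values without escaping them returns an empty frame (the unescaped . matches any character, so every row is dropped):

df[~df['words'].str.contains('|'.join(BAD_CHARS))]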
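
The effect of the unescaped dot can be checked with the standard re module alone. This is a small sketch, not part of the original answer; the sample words and the names naive, safe, naive_hits and safe_hits are chosen only for illustration.

```python
import re

# The list as written in the question: parentheses escaped, dot not escaped.
QUESTION_CHARS = ['.', '&', r'\(', r'\)', ';', '-']
naive = '|'.join(QUESTION_CHARS)

# The corrected approach: plain characters, each passed through re.escape.
safe = '|'.join(re.escape(c) for c in ['.', '&', '(', ')', ';', '-'])

words = ['CONDUCTED', '3D', '(VS.', '&']

# The bare '.' alternative matches any character, so every word is a hit.
naive_hits = [w for w in words if re.search(naive, w)]

# With escaping, only words that really contain a bad character are hits.
safe_hits = [w for w in words if re.search(safe, w)]

print(naive_hits)
print(safe_hits)
```

Since every word is a hit for the naive pattern, negating that mask keeps no rows, which is why the frame comes back empty.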

4 Comments

It returns empty frame :(
The question was closed as a dupe, and I've addressed the specifics of their question in a comment. Or else, I could have posted the answer myself :/
Thanks. How easy it was :)
@cᴏʟᴅsᴘᴇᴇᴅ - I don't understand "Or else, I could have posted the answer myself :/". Do you think I copied the answer from your comment? I used only one part of the comment (don't escape the values), and I mentioned that in the answer.
