2

I figured out how to remove special non-alphanumeric characters in Python using the following function:

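    import re

    p_nonalphanum = re.compile(r'\W+')

    def removeNonAlphaNum(string):
        # match() only looks at the start of the string,
        # so only a leading run of non-word characters is stripped
        m = p_nonalphanum.match(string)
        if m:
            string = string[m.end():]
        return string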

I would like to keep some special characters, though, such as ½ and ¾, which I consider numbers. How should I edit my regex?

E.g., from "• ½ cup flour" to "½ cup flour".

  • Are you sure this is working for you so far? >>> removeNonAlphaNum('-hello-') returns 'hello-' Commented May 25, 2012 at 9:33
  • Only the non-alphanumeric characters at the start of a string matter to me, but this could be extended. Commented May 25, 2012 at 10:21

3 Answers

3

You can use a negated character class and list inside it every character you want to keep. You could do something like this:

    p_nonalphanum = re.compile('[^\w½¾]+')
    print (p_nonalphanum.sub('', test))
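As a minimal sketch of how the negated class could be applied to the question's example, assuming Python 3 and assuming that only a leading run of unwanted characters should go: the pattern below is anchored with `^`, because an unanchored `sub('', ...)` would also delete the spaces between words. The names `LEADING_UNWANTED` and `strip_leading_unwanted` are hypothetical and not part of the answer.

```python
import re

# Hypothetical pattern name: one or more characters at the start of the
# string that are neither word characters nor one of the kept fractions.
LEADING_UNWANTED = re.compile(r'^[^\w½¾]+')


def strip_leading_unwanted(text):
    """Remove a leading run of unwanted characters; leave the rest untouched."""
    return LEADING_UNWANTED.sub('', text)


print(strip_leading_unwanted('• ½ cup flour'))  # ½ cup flour
print(strip_leading_unwanted('-hello-'))        # hello-
```

Any further character that should be kept would have to be added inside the brackets, next to `½¾`.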

2 Comments

This sounds like the first good solution, as I want to edit my regex, but I get this error message: SyntaxError: Non-ASCII character '\xc2' in file
OK, I just need to add this as the first line of my Python script: # -*- coding: utf-8 -*-
2
    >>> def remove_unwanted(s):
    ...     '''• ½ cup flour -> ½ cup flour'''
    ...     allowed = '[\w½¾]+'
    ...     return ' '.join(re.findall(allowed, s))
    ...
    >>> print remove_unwanted('• ½ cup flour ->')
    ½ cup flour

2

Don't bother with a regex in which you add every character you want by hand; use the built-in isalnum string method instead!

    >>> s = "• ½ cup flour -> ½ cup flour"
    >>> def only_alphanum(s):
    ...     s = unicode(s, "utf-8")
    ...     return ' '.join(c for c in s.split() if c.isalnum())
    ...
    >>> print only_alphanum(s)
    ½ cup flour ½ cup flour

This will let you catch any fraction, instead of only the fractions you have listed in your regex (a list that could get long very quickly):

    >>> s = "• ¼ cup oats -*(*&!!"
    >>> print only_alphanum(s)
    ¼ cup oats
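The session above is Python 2. A rough Python 3 sketch of the same idea might look like the following; it assumes the input is already a `str`, so no `unicode(...)` decoding step is needed. The helper `strip_leading_non_alnum` is a hypothetical addition that covers the narrower case from the question, where only the start of the string matters.

```python
def only_alnum_tokens(text):
    """Keep only the whitespace-separated tokens that are fully alphanumeric."""
    return ' '.join(token for token in text.split() if token.isalnum())


def strip_leading_non_alnum(text):
    """Hypothetical helper: drop characters until the first alphanumeric one."""
    for index, char in enumerate(text):
        if char.isalnum():
            return text[index:]
    return ''  # the string contains no alphanumeric character at all


print(only_alnum_tokens('• ¼ cup oats -*(*&!!'))   # ¼ cup oats
print(strip_leading_non_alnum('• ½ cup flour'))    # ½ cup flour
```

Note that the token filter drops any token containing even one non-alphanumeric character, so a word with attached punctuation such as `flour,` would be removed entirely.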
