I would like to strip all punctuation (except the dot) from the beginning and end of a string, but not from the middle of it.

For instance, for the original string:

@#%%.Hol$a.A.$% 

I would like to get .Hol$a.A. as the result: the punctuation is removed from the beginning and end of the word, but not from the middle.

Another example could be for the string:

@#%%...&Hol$a.A....$% 

In this case the returned string should be ...&Hol$a.A.... because we do not care if the allowed characters are repeated.

The idea is to remove all punctuation (except the dot) only at the beginning and end of the word. A word character is defined as \w and/or a dot.

A practical example is the string 'Barnes&Nobles'. For text analysis it is important to recognize Barnes&Nobles as a single entity, but without the '

How can I accomplish this using a regex?

  • Possible duplicate of Remove all special characters, punctuation and spaces from string Commented May 2, 2016 at 5:25
  • @LaxmikantGurnalkar: It is not a duplicate. Commented May 2, 2016 at 5:28
  • @user2288043: Could you post more examples? A single one does not cover the other cases that might come up. Commented May 2, 2016 at 5:30
  • What if there are multiple dots, as in @#%%....Hol$a.A....$%? Commented May 2, 2016 at 5:36
  • I added more details about the problem so it could be useful to other people too. In any case, it is solved by @ByteCommander's reply. Thanks! Commented May 2, 2016 at 14:54

2 Answers

Use this simple and easily adaptable regex:

[\w.].*[\w.] 

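It matches everything from the first allowed character to the last allowed character, which is exactly your desired result and nothing more.

  • [\w.] matches any word character (letter, digit or underscore) or a dot
  • .* matches any run of characters (except newlines, by default), as long as possible
  • [\w.] matches any word character (letter, digit or underscore) or a dot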

To change the delimiters, simply change the set of allowed characters inside the [] brackets.
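As a rough illustration of swapping the allowed set, here is a minimal sketch. The helper name trim_outer and its keep parameter are made up for this example and are not part of the answer; keep is pasted verbatim into the character class, so it has to be valid there.

```python
import re

def trim_outer(text, keep=r"\w."):
    """Trim text to the span between the first and last 'kept' character.

    Hypothetical helper: `keep` is inserted as-is inside [...], so the
    caller is responsible for escaping anything special in a character class.
    """
    pattern = rf"[{keep}].*[{keep}]"
    match = re.search(pattern, text)
    # re.search returns None when no kept character pair is found.
    return match.group(0) if match else None

print(trim_outer("@#%%.Hol$a.A.$%"))              # word characters and dots kept at the edges
print(trim_outer("@#%%.Hol$a.A.$%", keep=r"\w"))  # dots are now stripped from the edges too
```

With keep=r"\w" the outer dots are no longer allowed, so the result starts at H and ends at the last A.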

Check this regex out on regex101.com

import re

data = '@#%%.Hol$a.A.$%'
pattern = r'[\w.].*[\w.]'
print(re.search(pattern, data).group(0))
# Output: .Hol$a.A.
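One thing to be aware of: the pattern as written needs at least two allowed characters, so a one-character word would not match, and re.search returns None when nothing matches. The sketch below is one possible variation, not the answer's exact pattern: it makes the tail optional (my assumption about how you might want to handle that case) and runs the other inputs from the question. The names PATTERN and trim are made up for the example.

```python
import re

# Variation on the answer's pattern: the optional group lets a
# single allowed character match on its own (assumption, not in the answer).
PATTERN = re.compile(r"[\w.](?:.*[\w.])?")

def trim(text):
    """Return the trimmed word, or None if no allowed character is present."""
    match = PATTERN.search(text)
    return match.group(0) if match else None

for sample in ["@#%%...&Hol$a.A....$%", "'Barnes&Nobles'", "#a#", "@#%"]:
    print(repr(sample), "->", repr(trim(sample)))
```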

3 Comments

This is a generic one, considering the user hasn't provided many details or more examples.
This works like a charm. The number of dots is not important. Another example is the string @#%%....Hol$a.A....$%; in this case the returned value should be ....Hol$a.A....
@user2288043 If this answer solved your problem, please accept it by clicking the check button on its left.
Depending on what you mean by stripping the punctuation, you can adapt the following code:

import re

# The dots outside the [^.] classes are escaped so they match a literal dot.
res = re.search(r"^[^.]*(\.[^.]*\.([^.]*\.)*?)[^.]*$", "@#%%.Hol$a.A.$%")
mystr = res.group(1)

This will strip everything before the first dot and after the last dot in the expression. Warning: if the string doesn't match, re.search returns None, so you have to check the result before calling group.
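A minimal sketch of that check, assuming the dots in the pattern are meant as literal dots (hence the backslashes). The names DOT_SPAN and between_dots are made up for the example.

```python
import re

# Captures the span from the first dot to the last dot.
# Escaped dots are an assumption about the intent of the pattern.
DOT_SPAN = re.compile(r"^[^.]*(\.[^.]*\.([^.]*\.)*?)[^.]*$")

def between_dots(text):
    """Return the dot-delimited span, or None when the string does not match."""
    res = DOT_SPAN.search(text)
    if res is None:
        return None
    return res.group(1)

print(between_dots("@#%%.Hol$a.A.$%"))   # dot-delimited span is returned
print(between_dots("no dots here"))      # no match, so None
print(between_dots("@#%%a.Hol$a.A.$%"))  # the letter before the first dot is dropped
```

The last call shows the limitation: anything before the first dot is discarded, letters included.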

3 Comments

This only works if the result string is delimited by dots, right? There's a more general (and simpler) expression for this job.
If there is a letter just before the dot, then it wouldn't work: @#%%a.Hol$a.A.$%.
Yes, but the question lacked details. I assumed he wanted to strip anything before the first dot and after the last one. If you want to strip only certain characters, then regex is not the way to go.
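For what it's worth, a non-regex route for stripping only certain characters could look like the sketch below. It assumes "punctuation" means ASCII punctuation as listed in string.punctuation; STRIP_CHARS and strip_outer_punctuation are names made up for the example.

```python
import string

# Every ASCII punctuation character except the dot.
# Note: this set includes the underscore, which \w would treat as a word character.
STRIP_CHARS = string.punctuation.replace(".", "")

def strip_outer_punctuation(text):
    """Remove the listed characters from both ends only; the middle is untouched."""
    return text.strip(STRIP_CHARS)

print(strip_outer_punctuation("@#%%.Hol$a.A.$%"))
print(strip_outer_punctuation("@#%%...&Hol$a.A....$%"))
print(strip_outer_punctuation("'Barnes&Nobles'"))
```

str.strip stops at the first character that is not in the set, so surrounding whitespace would have to be stripped separately.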
