12

How to get the string between two points using regex or any other library in Python 3?

For example: Blah blah ABC the string to be retrieved XYZ Blah Blah

ABC and XYZ are variables which denote the start and end of the string which I have to retrieve.

2 Answers

16

Use ABC and XYZ as anchors with look-behind and look-ahead assertions:

(?<=ABC).*?(?=XYZ) 

The (?<=ABC) look-behind assertion only matches at a position in the text that is preceded by ABC. Similarly, (?=XYZ) only matches at a position that is followed by XYZ. Together they form two anchors that limit the .*? expression, which lazily matches any characters (except newlines, by default), so the match stops at the first XYZ after ABC.
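As a small sketch of what "anchor" means here (the variable names inner and outer are just for illustration), compare the look-around pattern with a plain pattern that consumes the delimiters:

```python
import re

text = 'Blah blah ABC the string to be retrieved XYZ Blah Blah'

# Look-arounds are zero-width: ABC and XYZ must be present,
# but they are not part of the matched text.
inner = re.search(r'(?<=ABC).*?(?=XYZ)', text).group(0)

# Without look-arounds the delimiters are consumed and end up in the match.
outer = re.search(r'ABC.*?XYZ', text).group(0)

print(repr(inner))
print(repr(outer))
```

Note that the surrounding spaces are kept in inner; call .strip() on the result if you don't want them.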

You can find all such anchored pieces of text with re.findall():

import re

for matchedtext in re.findall(r'(?<=ABC).*?(?=XYZ)', inputtext):
    print(matchedtext)

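If ABC and XYZ are variables, use re.escape() on them (to prevent any of their content from being interpreted as regular expression syntax) and interpolate the results. Use re.search() rather than re.match() here: re.match() only matches at the start of the string, where a look-behind for a non-empty delimiter cannot succeed.

re.search(r'(?<={}).*?(?={})'.format(re.escape(abc), re.escape(xyz)), inputtext)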
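Putting the pieces together, here is a minimal sketch of a reusable helper. The name find_between and its parameters are hypothetical, and passing re.DOTALL (so that the text between the delimiters may span several lines) is my assumption, not something the question asks for; drop the flag if matches should stay on one line.

```python
import re

def find_between(text, start, end):
    """Return all substrings of text found between start and end.

    Hypothetical helper: start and end are treated as literal strings,
    so they are escaped before being placed in the pattern.
    """
    pattern = r'(?<={}).*?(?={})'.format(re.escape(start), re.escape(end))
    # re.DOTALL is an assumption: it lets '.' match newlines as well.
    return re.findall(pattern, text, flags=re.DOTALL)

print(find_between('Blah blah ABC the string to be retrieved XYZ Blah Blah', 'ABC', 'XYZ'))
# Delimiters containing regex metacharacters are matched literally.
print(find_between('a (x) hello [y] b', '(x)', '[y]'))
```

Because re.escape() turns each delimiter into a literal, the look-behind stays fixed-width, which Python's re module requires.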

6

I think this is what you want:

import re

match = re.search('ABC(.*)XYZ', 'Blah blah ABC the string to be retrieved XYZ Blah Blah')
# group(1) is the text captured by the parentheses
print(match.group(1))

3 Comments

good way, but add a question mark after the * quantifier to make it lazy if you don't want to match something like ABC( ... XYZ ... )XYZ
@user1572508 ABC and XYZ are variables to be taken from the user
What if I just want the content in '(.*)' ?
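To illustrate the comments above, here is a short sketch (the sample text and variable names are made up for the example) showing how the greedy and lazy quantifiers differ when XYZ occurs more than once, and that group(1) returns only the content of the parentheses:

```python
import re

text = 'ABC first XYZ middle ABC second XYZ'

# Greedy: .* runs to the LAST XYZ in the string.
greedy = re.search(r'ABC(.*)XYZ', text).group(1)

# Lazy: .*? stops at the FIRST XYZ after ABC.
lazy = re.search(r'ABC(.*?)XYZ', text).group(1)

print(repr(greedy))
print(repr(lazy))
```

In both cases group(0) would be the whole match including ABC and XYZ, while group(1) is just the captured part.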
