I want to use Python's regular expression module to find files whose names follow this pattern:

000014_L_20111026T194932_1.txt
000014_L_20111026T194937_2.txt
...
000014_L_20111026T194928_12.txt

So the files I want have an underscore '_' followed by a number (1 or more digits) and then followed by '.txt' as the extension. I used the following regular expression but it didn't match the above names:

match = re.match('_(\d+)\.txt$', file) 

What should be the correct regex to match the file names?

1 Answer

You need to use .search() instead; .match() anchors to the start of the string. Your pattern is otherwise fine:

>>> re.search('_(\d+)\.txt$', '000014_L_20111026T194928_12.txt')
<_sre.SRE_Match object at 0x10e8b40a8>
>>> re.search('_(\d+)\.txt$', '000014_L_20111026T194928_12.txt').group(1)
'12'
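
To apply the same idea to a whole list of names, something like the following sketch might work. The helper name `numbered_txt_files`, the constant `SUFFIX`, and the `notes.txt` entry are made up for illustration; the pattern itself is the one from the question, written as a raw string so the backslashes reach the regex engine unchanged.

```python
import re

# Compiled once; same pattern as in the question, as a raw string.
SUFFIX = re.compile(r'_(\d+)\.txt$')

def numbered_txt_files(names):
    """Return (name, number) pairs for names ending in _<digits>.txt."""
    result = []
    for name in names:
        m = SUFFIX.search(name)  # search() scans the whole string
        if m:
            result.append((name, int(m.group(1))))
    return result

names = [
    '000014_L_20111026T194932_1.txt',
    '000014_L_20111026T194928_12.txt',
    'notes.txt',  # hypothetical name that should not match
]
matches = numbered_txt_files(names)
```

In practice `names` would probably come from something like `os.listdir()`, which is left out here to keep the sketch self-contained.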

2 Comments

Thanks Martijn. This works. So even if I use '$' to anchor the pattern at the end of the string, .match() doesn't search from the end?
@tonga: No; $ is an anchor, it'll only match at the end of the string; it won't dictate where the search begins. You should see .match() as adding an implicit ^ to your pattern.
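
A small sketch of the difference described in this comment, using one of the file names from the question. The variable names are illustrative only, and the "implicit ^" comparison is meant for single-line strings like this one (with re.MULTILINE, ^ can also match at line starts, which .match() does not do).

```python
import re

name = '000014_L_20111026T194928_12.txt'
pattern = r'_(\d+)\.txt$'

# match() only tries at position 0, and the name does not start with '_'.
at_start = re.match(pattern, name)

# search() tries every position, so it finds '_12.txt' at the end.
anywhere = re.search(pattern, name)

# Adding an explicit ^ makes search() fail the same way match() does here.
anchored = re.search('^' + pattern, name)

# match() can still be used if the pattern accounts for the leading text.
prefixed = re.match('.*' + pattern, name)
```

Here `at_start` and `anchored` are both `None`, while `anywhere` and `prefixed` both capture the trailing number in group 1.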
