2

I have a tar file whose name I can successfully read and store in a variable:

tarname = 'esarchive--Mona-AB-Test226-8037affd-06d1-4c61-a91f-816ec9cb825f-05222017-4.tar' 

But how do I extract just "Mona" from this file name and store it in a variable?

(The file name structure will be the same as above for all tar files, with the name occurring between "esarchive--" and "-AB", i.e. "esarchive--{Name}-AB", so I need a solution that returns any name obeying this format.)

Thanks!

3
  • Can the name include dashes? If not, I'd be tempted to go with tarname.split('-')[2]. Commented May 31, 2017 at 18:16
  • We need more info on the set of possible filenames you may encounter to answer this. Commented May 31, 2017 at 18:17
  • The name would be a plain first name like yours and mine, but the dashes before and after it are part of the original file name that I receive for various people, e.g. --Jamy-AB. Commented May 31, 2017 at 18:18

3 Answers

11

The third-party parse module is good for this kind of thing. You can think of it as the inverse of str.format.

from parse import parse
pattern = 'esarchive--{Name}-AB-{otherstuff}.tar'
result = parse(pattern, tarname)

Demo:

>>> result = parse(pattern, tarname)
>>> result['Name']
'Mona'
>>> result.named
{'Name': 'Mona', 'otherstuff': 'Test226-8037affd-06d1-4c61-a91f-816ec9cb825f-05222017-4'}

1 Comment

Best answer for general use.
4

Easiest way I can think of:

  1. Split the filename on the - character.
  2. Get the 3rd item from the resulting list (index 2).

In code:

filename.split('-')[2] 

Simple one-liner. This is of course working off your example. I would need more sample filenames to account for possible variations and know for certain if this will always work.
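
Since the one-liner trusts position alone, a slightly more defensive variant can check the neighbouring pieces before returning index 2. This is only a sketch: the helper name `name_from_tarname` is made up for illustration, and it assumes every valid file name starts with `esarchive--` and has `AB` as the field right after the name, as in the example from the question.

```python
def name_from_tarname(tarname):
    """Return the name field, or raise ValueError if the layout looks wrong.

    Hypothetical helper; assumes the 'esarchive--{Name}-AB-...' layout.
    """
    parts = tarname.split('-')
    # For a well-formed name the first four pieces are:
    # ['esarchive', '', <name>, 'AB'] (the '' comes from the double dash).
    if (len(parts) < 4 or parts[0] != 'esarchive' or parts[1] != ''
            or not parts[2] or parts[3] != 'AB'):
        raise ValueError(f"unexpected tar file name: {tarname!r}")
    return parts[2]


tarname = 'esarchive--Mona-AB-Test226-8037affd-06d1-4c61-a91f-816ec9cb825f-05222017-4.tar'
name = name_from_tarname(tarname)
```

A file name that does not follow the layout then fails loudly with a ValueError instead of returning some other piece of the string.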

1 Comment

The problem with this kind of approach is that if the data is not always exactly as expected, you can silently get an incorrect result, when you would prefer to have some kind of unhandled exception raised instead.
3
>>> import re
>>> tarname = "esarchive--Mona-AB-Test226-8037affd-06d1-4c61-a91f-816ec9cb825f-05222017-4.tar"
>>> s = re.match(r"esarchive--(\w+)-AB", tarname).group(1)
>>> s
'Mona'
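
Note that re.match returns None when the file name does not fit the pattern, so calling .group(1) directly would fail with an AttributeError. If you want a clearer error, one possible way to wrap it is sketched below. The names `TARNAME_RE` and `extract_name` are hypothetical, and the pattern is the same one as above, so it assumes the name consists only of word characters (letters, digits, underscore).

```python
import re

# Compiled once and reused; the raw string keeps \w as a regex token.
# Hypothetical constant name, same pattern as in the session above.
TARNAME_RE = re.compile(r"esarchive--(\w+)-AB")


def extract_name(tarname):
    """Return the name between 'esarchive--' and '-AB', or raise ValueError."""
    match = TARNAME_RE.match(tarname)
    if match is None:
        # re.match gives None on a mismatch, so report it explicitly.
        raise ValueError(f"unexpected tar file name: {tarname!r}")
    return match.group(1)


tarname = "esarchive--Mona-AB-Test226-8037affd-06d1-4c61-a91f-816ec9cb825f-05222017-4.tar"
s = extract_name(tarname)
```

Whether to raise or to return None on a mismatch is a design choice; raising keeps bad file names from slipping through unnoticed.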

1 Comment

Really helpful. Worked like a charm with my existing code!
