
This is the string I am dealing with: '5Nov20217Dec202110Jan2022'

The string could also be:

'5Nov2021 7Dec2021 10Jan2022' 

I would like to obtain a list like:

['5Nov2021','7Dec2021','10Jan2022'] 

I am currently using regex but to no avail:

re.findall('^\d{1,2}[a-zA-Z]{3}\d{4}$','5Nov20217Dec202110Jan2022') 

A regex solution is not a must.

You are almost there; just remove ^ and $. (Commented Jun 30, 2021 at 2:29)
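As a minimal sketch of what the comment suggests: without `^` and `$`, `re.findall` scans for every date instead of requiring the whole string to be a single date. The helper name `split_dates` is hypothetical, and the pattern assumes four-digit years, as in the question.

```python
import re

# Same pattern as in the question, minus the ^ and $ anchors.
DATE_RE = re.compile(r'\d{1,2}[a-zA-Z]{3}\d{4}')

def split_dates(s):
    """Return every day-month-year token found in s (hypothetical helper)."""
    return DATE_RE.findall(s)

print(split_dates('5Nov20217Dec202110Jan2022'))
print(split_dates('5Nov2021 7Dec2021 10Jan2022'))
# Both calls print ['5Nov2021', '7Dec2021', '10Jan2022']
```

This works for both inputs in the question because `\d{4}` always consumes exactly the year, so the next match starts at the following day number whether or not a space intervenes.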

1 Answer


Based on the variability of your input, I suggest combining re with string slicing in a while loop:

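import re

def extract_dates(d):
    # Requires Python 3.8+ for the := operator.
    while d:
        # Try a four-digit year first.
        if (k := re.findall(r'^\d{1,2}[a-zA-Z]{3}\d{4}', d)):
            # Accept it unless the remainder starts with a letter, which means
            # the match swallowed the day of the next date (e.g. '7Dec2110' + 'Jan22').
            # Checking "not a letter" rather than "a digit" also accepts
            # space-separated input such as '5Nov2021 7Dec2021'.
            if not (l := d[len(k[0]):]) or not l[0].isalpha():
                yield k[0]
                d = l
                continue
        # Otherwise fall back to a two-digit year.
        if (k := re.findall(r'^\d{1,2}[a-zA-Z]{3}\d{2}', d)):
            yield k[0]
            d = d[len(k[0]):]
        else:
            # No date starts here (e.g. a space); skip one character.
            d = d[1:]

dates = ['5Nov20217Dec202110Jan2022', '5Nov217Dec2110Jan22', '5Nov21 7Dec21 10Jan22']
results = [list(extract_dates(i)) for i in dates]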

Output:

[['5Nov2021', '7Dec2021', '10Jan2022'], ['5Nov21', '7Dec21', '10Jan22'], ['5Nov21', '7Dec21', '10Jan22']] 
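For comparison, the same disambiguation can be sketched as a single pattern with a negative lookahead. This is an alternative, not the approach used in the answer: the name `find_dates` is hypothetical, and the sketch assumes a four-digit year is never directly followed by a letter.

```python
import re

# Prefer a four-digit year, but only when it is not directly followed by a
# letter (which would mean the last digits belong to the next date's day).
# Otherwise fall back to a two-digit year.
DATE_RE = re.compile(r'\d{1,2}[a-zA-Z]{3}(?:\d{4}(?![a-zA-Z])|\d{2})')

def find_dates(s):
    """Return all dates in s, with two- or four-digit years (hypothetical helper)."""
    return DATE_RE.findall(s)

print(find_dates('5Nov20217Dec202110Jan2022'))  # ['5Nov2021', '7Dec2021', '10Jan2022']
print(find_dates('5Nov217Dec2110Jan22'))        # ['5Nov21', '7Dec21', '10Jan22']
print(find_dates('5Nov21 7Dec21 10Jan22'))      # ['5Nov21', '7Dec21', '10Jan22']
```

The group is non-capturing (`(?:...)`) so that `findall` returns whole matches rather than just the year part.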

1 Comment

Is it possible to modify this to support '5Nov217Dec2110Jan22' and '5Nov21 7Dec21 10Jan22'?
