1

I have a regular expression defined as:

AAA_BBB_CCCC_(.*)_[0-9][0-9][0-9][0-9][0-1][0-9][0-3][0-9]T[0-2][0-9][0-5][0-9][0-5][0-9].

Given the string **AAA_BBB_CCCC_DDD_EEEE_19710101T123456**, the code calls matcher.group(1) to extract the desired part (DDD_EEEE). Now a new string is coming in: **AAA_BBB_ATCCCC_DDD_EEEE_19710101T123456**. Is there a way to change the regex so that it matches both the old and the new string? I tried a few solutions from Stack Overflow questions like this one and others, but they didn't work quite right for me.

  • See regex101.com/r/Mx9pCx/1, just add (?:AT)? before CCCC. Commented Sep 19, 2017 at 16:32
  • @WiktorStribiżew - Worked like a charm. Can you please add it as an answer so that I can accept it? Commented Sep 19, 2017 at 16:37
  • You might also consider using [0-9]{8}T[0-9]{6} in your regex. That is a little easier to understand ("8 digits, T, 6 digits"). After all, you'd have to further validate the input anyway, to avoid the 14th month, the 37th day of the month, or the 27th hour of the day. Commented Sep 19, 2017 at 17:06
  • It can be made shorter with AAA_BBB_(?:AT)?CCCC_(.*)_\d{4}[01]\d[0-3]\dT[0-2]\d[0-5]\d[0-5]\d Commented Sep 19, 2017 at 17:22
  • I added an answer. Commented Sep 19, 2017 at 17:24
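
The comment above about validating the timestamp beyond what the regex checks could be sketched as follows. This is only an illustration, assuming the timestamp uses the layout yyyyMMdd'T'HHmmss seen in the sample strings; the class name TimestampCheck and the method isValidTimestamp are hypothetical and not part of the original code.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

class TimestampCheck {
    // Assumed layout: 8 date digits, a literal T, 6 time digits.
    // "uuuu" is used for the year because STRICT resolution needs an era with "yyyy".
    static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("uuuuMMdd'T'HHmmss")
                         .withResolverStyle(ResolverStyle.STRICT);

    // Returns true only when the text is a real calendar date and time.
    static boolean isValidTimestamp(String text) {
        try {
            LocalDateTime.parse(text, FORMAT);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidTimestamp("19710101T123456"));
        System.out.println(isValidTimestamp("19711401T123456")); // 14th month
    }
}
```

With a check like this in place, the regex only has to locate the timestamp, which is why the simpler [0-9]{8}T[0-9]{6} form may be enough.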

2 Answers

1

You just need to add an optional group, (?:AT)?, before CCCC:

AAA_BBB_(?:AT)?CCCC_(.*)_[0-9]{4}[0-1][0-9][0-3][0-9]T[0-2][0-9][0-5][0-9][0-5][0-9]
        ^^^^^^^

See the regex demo

I also contracted the four [0-9] to [0-9]{4} to make the pattern shorter.

The (?:AT)? is a non-capturing group to which a ? quantifier is applied. The ? quantifier makes the whole sequence AT match 1 or 0 times, making it optional. Because the group is non-capturing, (.*) remains group 1, so matcher.group(1) keeps working unchanged.
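
A minimal Java sketch of how this pattern might be used with both sample strings from the question. The class name OptionalPrefixDemo and the helper extract are hypothetical, and the use of matches() (whole-string match) is an assumption, since the question does not show how the matcher is invoked.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalPrefixDemo {
    // Pattern from the answer; (?:AT)? does not capture, so (.*) stays group 1.
    static final Pattern PATTERN = Pattern.compile(
        "AAA_BBB_(?:AT)?CCCC_(.*)_[0-9]{4}[0-1][0-9][0-3][0-9]T[0-2][0-9][0-5][0-9][0-5][0-9]");

    // Returns the text captured by group 1, or null when the input does not match.
    static String extract(String input) {
        Matcher matcher = PATTERN.matcher(input);
        return matcher.matches() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        // Both lines print DDD_EEEE
        System.out.println(extract("AAA_BBB_CCCC_DDD_EEEE_19710101T123456"));
        System.out.println(extract("AAA_BBB_ATCCCC_DDD_EEEE_19710101T123456"));
    }
}
```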


0

Please give the following regex a try.

AAA_BBB_(ATCCCC|CCCC)_(.*)_[0-9][0-9][0-9][0-9][0-1][0-9][0-3][0-9]T[0-2][0-9][0-5][0-9][0-5][0-9]. 

It would only match ATCCCC or CCCC. It won't be able to support dynamic characters preceding CCCC. You would need to use wildcards for that.

Also, because (ATCCCC|CCCC) is a capturing group and becomes group 1, you would need to change your matcher.group(1) call to matcher.group(2).
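
To illustrate the group numbering, here is a small sketch comparing the capturing alternation with a non-capturing variant, (?:ATCCCC|CCCC). The non-capturing variant is an alternative added here for comparison, not part of this answer; the class name AlternationGroups and the helper group are hypothetical, and the trailing dot of the pattern is left out on the assumption that it is sentence punctuation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationGroups {
    // Shared tail of both patterns: the payload group followed by the timestamp.
    static final String TAIL =
        "_(.*)_[0-9][0-9][0-9][0-9][0-1][0-9][0-3][0-9]T[0-2][0-9][0-5][0-9][0-5][0-9]";

    // Capturing alternation: the prefix becomes group 1, the payload group 2.
    static final Pattern CAPTURING = Pattern.compile("AAA_BBB_(ATCCCC|CCCC)" + TAIL);
    // Non-capturing alternation: the payload stays group 1.
    static final Pattern NON_CAPTURING = Pattern.compile("AAA_BBB_(?:ATCCCC|CCCC)" + TAIL);

    // Returns the requested group, or null when the input does not match.
    static String group(Pattern pattern, String input, int index) {
        Matcher matcher = pattern.matcher(input);
        return matcher.matches() ? matcher.group(index) : null;
    }

    public static void main(String[] args) {
        String input = "AAA_BBB_ATCCCC_DDD_EEEE_19710101T123456";
        System.out.println(group(CAPTURING, input, 1));     // ATCCCC
        System.out.println(group(CAPTURING, input, 2));     // DDD_EEEE
        System.out.println(group(NON_CAPTURING, input, 1)); // DDD_EEEE
    }
}
```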

1 Comment

Thank you for your tip. However, I didn't want to change the code, hence the question.
