4

In Java, how can I get every match of a group that is nested inside another, repeated group of a regular expression?
For example, using (([A-Z][a-z]+)+)([0-9]+) to test the string "AbcDefGhi12345",
I get this result:
matches():yes
groupCount():3
group(1):AbcDefGhi
group(2):Ghi
group(3):12345

But I want to get the strings "Abc", "Def", "Ghi", and "12345" separately.
How can I do that with a regular expression?

3 Answers

1

A regular expression cannot return every iteration of a repeated group. The matcher keeps only one capture per group (in your case it returned the last one, "Ghi").

The example below will print:

    Abc
    Def
    Ghi
    12345

The code:

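    import java.util.Scanner;

    public class Example {
        public static void main(String[] args) {
            String example = "AbcDefGhi12345";
            // Validate the overall format first, then pull out the pieces.
            if (example.matches("(([A-Z][a-z]+)+)([0-9]+)")) {
                Scanner s = new Scanner(example);
                String m;
                // A horizon of 0 means "search the rest of the input".
                while ((m = s.findWithinHorizon("[A-Z][a-z]+", 0)) != null)
                    System.out.println(m);
                System.out.println(s.findWithinHorizon("[0-9]+", 0));
            }
        }
    }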

6 Comments

Does the camelCaseStyle define each group?
@pringlesinn: Yes (the word-groups).
@dacwe: so maybe what I asked a while ago might help stackoverflow.com/questions/4502273/…
-1: if you copy my (and hzh's) answer, you should mention it rather than add a comment saying that it would not work.
An RE engine may return the first place it matches each group instead. There's no actual specification for what's produced in that case (and it's a very black art at the RE matcher level anyway).
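For java.util.regex specifically, which capture survives can be checked directly. The sketch below reuses the pattern and input from the question; the class name RepeatedGroupDemo and the helper lastIteration are made up for illustration. It shows that, in Java, group(2) holds the text the group matched most recently, that is, the last iteration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedGroupDemo {
    // Hypothetical helper: returns what group(2) holds after a full match,
    // or null when the input does not match the pattern at all.
    static String lastIteration(String input) {
        Matcher m = Pattern.compile("(([A-Z][a-z]+)+)([0-9]+)").matcher(input);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // The inner group matched "Abc", then "Def", then "Ghi";
        // only the most recent of those is kept.
        System.out.println(lastIteration("AbcDefGhi12345")); // prints "Ghi"
    }
}
```

Other regex engines may behave differently, so this says nothing about regular expressions in general.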
1
    Pattern p = Pattern.compile("([A-Z][a-z]+|(?:[0-9]+))");
    Matcher m = p.matcher("AbcDefGhi12345");
    while (m.find()) {
        System.out.println(m.group(1));
    }

1 Comment

If it is okay to also match strings that do not start with [A-Z][a-z]+, this is the best solution!
-1

Like hzh's answer, but formatted and a little simpler:

    Pattern p = Pattern.compile("[A-Z][a-z]+|[0-9]+");
    Matcher m = p.matcher("AbcDefGhi12345");
    while (m.find()) {
        System.out.println(m.group(0));
    }

gives you

    Abc
    Def
    Ghi
    12345

4 Comments

Not true, when I run it I get the following: 1. Abc 2. Def 3. Ghi 4. 12345
Yes, it still matches the string "12345" alone which is not ok.
Why is it not ok???? The OP never required it to match only strings that are in some kind of format. And giving me a -1 for that... hmmm.
If so, putting an if (string.matches("([A-Z][a-z]+)+[0-9]+")) {} around it will help, of course.
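The suggestion in the last comment, validating the whole string first and only then collecting the pieces, could be sketched like this. The class name CamelSplit and the method split are hypothetical; the two patterns are the ones quoted in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CamelSplit {
    // Overall format: one or more capitalized words followed by digits.
    private static final Pattern WHOLE = Pattern.compile("([A-Z][a-z]+)+[0-9]+");
    // A single piece: either one capitalized word or one run of digits.
    private static final Pattern TOKEN = Pattern.compile("[A-Z][a-z]+|[0-9]+");

    // Returns the pieces in order, or an empty list if the input
    // does not have the expected overall format.
    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        if (!WHOLE.matcher(input).matches()) {
            return parts;
        }
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("AbcDefGhi12345")); // prints [Abc, Def, Ghi, 12345]
        System.out.println(split("12345"));          // prints []
    }
}
```

With the guard in place, a bare "12345" is rejected instead of being returned as a single piece, which addresses the objection raised above.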
