
A string consists of the following, in order:

  • An optional sequence of ASCII digits.
  • A sequence of ASCII lowercase letters.

I'm trying to split it into those two parts with a single regex that I could use like this:

    String string = "123abc";
    var array = string.split(...);
    System.out.println(java.util.Arrays.toString(array)); // prints [123, abc]

The closest regex I've come to is the following:

(?<=\d+) 

Example:

    String string = "123abc";
    var array = string.split("(?<=\\d+)");
    System.out.println(java.util.Arrays.toString(array)); // prints [1, 2, 3, abc]
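The lookbehind `(?<=\d+)` succeeds at every position that has at least one digit immediately before it, which is why the attempt above splits after each digit. A minimal sketch of one way to narrow this to a single split point is to add a lookahead for a lowercase letter. The class and method names here are hypothetical, and the sketch does not solve the whole problem: with no digits there is no boundary to match, so "a" comes back as a single element rather than with a leading empty string.

```java
import java.util.Arrays;

final class LookaroundSplitDemo {
    // Hypothetical helper: split at the boundary between a digit and a lowercase letter.
    static String[] splitAtBoundary(String input) {
        // (?<=\d)   requires a digit immediately before the split point.
        // (?=[a-z]) requires a lowercase letter immediately after it.
        return input.split("(?<=\\d)(?=[a-z])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAtBoundary("123abc"))); // prints [123, abc]
        System.out.println(Arrays.toString(splitAtBoundary("0a")));     // prints [0, a]
        // No digit/letter boundary exists, so nothing is split.
        System.out.println(Arrays.toString(splitAtBoundary("a")));      // prints [a]
    }
}
```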

Technically, I could do this without any regex, but here it's important that it be done with a regex.

Here is a solution without regex, to show that I can do it the normal way:

    String string = "123abc";
    int i = 0;
    for (; i < string.length() && Character.isDigit(string.charAt(i)); i++)
        ;
    String[] array = { string.substring(0, i), string.substring(i) };
    System.out.println(java.util.Arrays.toString(array)); // prints [123, abc]

Another way of doing it would be:

    String string = "123abc";
    String[] array = { string.replaceAll("\\D", ""), string.replaceAll("\\d", "") };
    System.out.println(java.util.Arrays.toString(array)); // prints [123, abc]

Matching examples:

    In:       Out:
    123abc    [ "123", "abc" ]
    0a        [ "0", "a" ]
    a         [ "", "a" ]

  • @WiktorStribiżew Thanks but it doesn't work for "abc".split(...) which returns [ "abc" ], not [ "", "abc" ]. Commented Dec 20, 2018 at 19:36
  • Then match with "(\\d*)(\\D+)" or "(\\d*)(\\D*)" and get Group 1 and 2 Commented Dec 20, 2018 at 19:38
  • Is it also important to use .split()? If I wanted to use regex for this, I'd use matching groups instead: string.matches("(\d+)([a-z]+)") Commented Dec 20, 2018 at 19:39
  • @GeoffreyWiseman Yes, it's important: I'm actually code-golfing something, and split is usually the best way to save a large number of bytes. I've already golfed everything else but that part, and even with a 20-character regex, I'd still save bytes in the end. So yes, there should only be a split. As the solutions without regex or without split show, I have no trouble making it work without split. I'm only interested in a split solution. Commented Dec 20, 2018 at 19:43
  • The Pattern.split() documentation says: "When there is a positive-width match at the beginning of the input sequence then an empty leading substring is included at the beginning of the resulting array. A zero-width match at the beginning however never produces such empty leading substring." Commented Dec 20, 2018 at 21:50

1 Answer

Pattern.split() documentation says:

When there is a positive-width match at the beginning of the input sequence then an empty leading substring is included at the beginning of the resulting array. A zero-width match at the beginning however never produces such empty leading substring.
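The two cases in that quote can be seen in a small sketch. The inputs and the helper name are made up for illustration, and the output assumes Java 8 or later, where this behaviour is documented.

```java
import java.util.Arrays;

final class LeadingMatchDemo {
    // Hypothetical helper that simply forwards to String.split.
    static String[] splitOn(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // Positive-width match at index 0: "," consumes one character,
        // so an empty leading substring is included.
        System.out.println(Arrays.toString(splitOn(",abc", ",")));    // prints [, abc]

        // Zero-width match at index 0: the lookahead consumes nothing,
        // so no empty leading substring is produced.
        System.out.println(Arrays.toString(splitOn("abc", "(?=a)"))); // prints [abc]
    }
}
```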

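So what you are trying to achieve may not be possible with split() alone. For an input such as "a", the only place to split is at index 0 without consuming any character, and that zero-width match at the beginning never produces the empty leading string.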
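If split() were not a hard requirement, the capturing-group approach suggested in the comments would cover all three example inputs. The following is only a sketch under that assumption. The class and method names are hypothetical, and the pattern uses `[a-z]+` for the letters to follow the question's description rather than the `\D+` from the comment.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupSplitDemo {
    // Group 1: optional run of digits. Group 2: one or more lowercase letters.
    private static final Pattern PARTS = Pattern.compile("(\\d*)([a-z]+)");

    // Hypothetical helper: returns { digits, letters }, with digits possibly empty.
    static String[] parts(String input) {
        Matcher m = PARTS.matcher(input);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected input: " + input);
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parts("123abc"))); // prints [123, abc]
        System.out.println(Arrays.toString(parts("0a")));     // prints [0, a]
        System.out.println(Arrays.toString(parts("a")));      // prints [, a]
    }
}
```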
