Splitting a string into two with its first, digit-only part and the rest?

Question

A string consists of the following, in order:

An optional sequence of ASCII digits.
A sequence of ASCII lowercase letters.

I'm trying to do the split with a single regex that I could use like this:

    String string = "123abc";
    var array = string.split(...);
    System.out.println(java.util.Arrays.toString(array)); // prints [123, abc]

The closest I've come is the following regex:

(?<=\d+)

Example:

    String string = "123abc";
    var array = string.split("(?<=\\d+)");
    System.out.println(java.util.Arrays.toString(array)); // prints [1, 2, 3, abc]

Technically, I could do this without any regex, but here it's important that it be done with a regex.

Here is a solution without regex, to show that I can do it the ordinary way:

    String string = "123abc";
    int i = 0;
    // Advance i past the leading digits; the loop body is intentionally empty.
    for (; i < string.length() && Character.isDigit(string.charAt(i)); i++) {}
    String[] array = { string.substring(0, i), string.substring(i) };
    System.out.println(java.util.Arrays.toString(array)); // prints [123, abc]

Another way of doing it would be:

    String string = "123abc";
    // Works only because digits can appear nowhere but at the start of the string.
    String[] array = { string.replaceAll("\\D", ""), string.replaceAll("\\d", "") };
    System.out.println(java.util.Arrays.toString(array)); // prints [123, abc]

Matching examples:

In        Out
123abc    [ "123", "abc" ]
0a        [ "0", "a" ]
a         [ "", "a" ]

@WiktorStribiżew Thanks, but it doesn't work for "abc".split(...), which returns [ "abc" ], not [ "", "abc" ].
– Olivier Grégoire, Commented Dec 20, 2018 at 19:36
Then match with "(\\d*)(\\D+)" or "(\\d*)(\\D*)" and get Group 1 and 2.
– Wiktor Stribiżew, Commented Dec 20, 2018 at 19:38
Is it also important to use .split()? If I wanted to use regex for this, I'd use matching groups instead: string.matches("(\\d+)([a-z]+)")
– Geoffrey Wiseman, Commented Dec 20, 2018 at 19:39
@GeoffreyWiseman Yes, it's important: I'm actually code-golfing something, and split is usually the best way to save a large number of bytes. I've already golfed everything else but that part, and even with a 20-character regex, I'd still save bytes in the end. So yes, there should only be a split. As the solutions without regex or without split show, I have no trouble making it work without split. I'm only interested in a split solution.
– Olivier Grégoire, Commented Dec 20, 2018 at 19:43
Pattern.split() documentation says: "When there is a positive-width match at the beginning of the input sequence then an empty leading substring is included at the beginning of the resulting array. A zero-width match at the beginning however never produces such empty leading substring."
– Venkata Raju, Commented Dec 20, 2018 at 21:50
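
For reference, the capturing-group approach suggested in the comments could look roughly like the sketch below. It does not use split, so it does not meet the code-golf constraint, but it does cover the empty-prefix case. The class name DigitPrefixGroups and the method parts are hypothetical names chosen for illustration; the pattern is the "(\\d*)(\\D*)" variant from the comment.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question.
class DigitPrefixGroups {
    // Group 1: optional leading digits. Group 2: the non-digit remainder.
    private static final Pattern PARTS = Pattern.compile("(\\d*)(\\D*)");

    static String[] parts(String s) {
        Matcher m = PARTS.matcher(s);
        // matches() requires the whole input to fit the pattern.
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected format: " + s);
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parts("123abc"))); // prints [123, abc]
        System.out.println(Arrays.toString(parts("0a")));     // prints [0, a]
        System.out.println(Arrays.toString(parts("a")));      // prints [, a]
    }
}
```

Because "\\d*" can match the empty string, group 1 is "" rather than missing when the input has no leading digits.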

Venkata Raju · Accepted Answer · 2018-12-20 21:55:56Z

Pattern.split() documentation says:

When there is a positive-width match at the beginning of the input sequence then an empty leading substring is included at the beginning of the resulting array. A zero-width match at the beginning however never produces such empty leading substring.

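So what you are trying to achieve may not be possible with a single regular expression passed to split(): for an input such as "a", the split point would have to be a zero-width match at the very start, and that never produces the empty leading substring you want.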
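
The documented rule can be observed with a small sketch like the one below, assuming Java 8 or later (where the quoted wording applies). The class name SplitBoundaryDemo, the method splitAtBoundary, and the boundary regex "(?<=\\d)(?=\\D)" are illustrative choices, not something taken from the question; the regex handles inputs that start with digits, but leaves "a" as a single element.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
class SplitBoundaryDemo {
    // Zero-width boundary: a digit behind the position, a non-digit ahead of it.
    static final String BOUNDARY = "(?<=\\d)(?=\\D)";

    static String[] splitAtBoundary(String s) {
        return s.split(BOUNDARY);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAtBoundary("123abc"))); // prints [123, abc]
        System.out.println(Arrays.toString(splitAtBoundary("0a")));     // prints [0, a]
        // No digit precedes 'a', so there is no boundary and nothing is split.
        System.out.println(Arrays.toString(splitAtBoundary("a")));      // prints [a]

        // Zero-width match at index 0: no empty leading substring.
        System.out.println(Arrays.toString("abc".split("(?=a)")));      // prints [abc]
        // Positive-width match at index 0: empty leading substring, but 'a' is consumed.
        System.out.println(Arrays.toString("abc".split("a")));          // prints [, bc]
    }
}
```

The last two lines show the trade-off: only a match that consumes characters yields a leading "", and consuming characters loses part of the input.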
