I'm trying to write a regex to match the following text:

  • one of letters f, s or t
  • multiple times:
    • at least one digit
    • one of letters f, s or t

This is the regex I've come up with: (f|s|t)(\d+(f|s|t))*, but it doesn't seem to work.

Input: f20s30t

Expected matches:

  • f
  • 20s
  • 30t

Actual matches:

  • f
  • 30t
  • t

Why is my regex wrong?

Edit: I'm using this method to split the input.

  • By "matches", you meant captures, right? It is not quite clear what you are doing. You may actually just get multiple matches with \d*[fst]. In .NET, you may grab all the captures by accessing the capture collection related to a certain group. Is it a whole string or part of a longer text? Commented May 17, 2018 at 15:20
  • @WiktorStribiżew I'm using Regex split in C#, but I think that's kind of irrelevant to the question. You can test the regex yourself in regex101.com Commented May 17, 2018 at 15:24
  • Oh, that is very relevant because you should not test the .NET regex patterns at regex101 as it does not support .NET regex. So, is that a whole string or part of a longer text? Commented May 17, 2018 at 15:25
  • I think you are rather interested in match groups. Are you? You've given only one example - not sufficient when you're asking strangers about a regex. Give a couple of more positive examples as well as negative examples. Commented May 17, 2018 at 15:26
  • @WiktorStribiżew I edited the question, thanks Commented May 17, 2018 at 15:26

3 Answers

Since you are using C#, you may match a string with a pattern and access the capture collection of each repeated capturing group:

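    using System;
    using System.Linq;
    using System.Text.RegularExpressions;

    var s = "f20s30t";
    var m = Regex.Match(s, @"^([fst])(\d+[fst])*$");
    if (m.Success)
    {
        // Group 1 is the leading letter.
        Console.WriteLine(m.Groups[1].Value);
        // Group 2 repeats; its Captures collection holds every repetition.
        foreach (var g in m.Groups[2].Captures.Cast<Capture>().Select(t => t.Value))
            Console.WriteLine(g);
    }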

The advantage of this approach is that it also validates the string, and you won't get results for a string like TEXT random f20s30t here.....

Output:

    f
    20s
    30t

Details

  • ^ - start of the string
  • ([fst]) - Capturing group 1: f, s or t
  • (\d+[fst])* - 0 or more repetitions (captured into Group 2 with each value saved in the group stack) of:
    • \d+ - 1+ digits
    • [fst] - f, s or t
  • $ - end of string.
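
The difference between a group's Value and its Captures collection is also what explains the "actual matches" in the question: when a capturing group repeats, Value keeps only the last repetition. A minimal sketch of that behavior, assuming the same pattern as above (the class name CaptureDemo and the helper Tokens are hypothetical, introduced only for illustration):

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public class CaptureDemo
{
    // Hypothetical helper: returns the leading letter followed by every
    // capture of the repeated group, or an empty array if the input
    // does not have the expected shape.
    public static string[] Tokens(string input)
    {
        Match m = Regex.Match(input, @"^([fst])(\d+[fst])*$");
        if (!m.Success)
        {
            return new string[0];
        }
        return new[] { m.Groups[1].Value }
            .Concat(m.Groups[2].Captures.Cast<Capture>().Select(c => c.Value))
            .ToArray();
    }

    public static void Main()
    {
        Match m = Regex.Match("f20s30t", @"^([fst])(\d+[fst])*$");
        // Value exposes only the last repetition of group 2.
        Console.WriteLine(m.Groups[2].Value);                    // 30t
        // Captures exposes all repetitions, in order.
        Console.WriteLine(string.Join(",", Tokens("f20s30t"))); // f,20s,30t
    }
}
```

Because the pattern is anchored, Tokens returns an empty array for input such as TEXT random f20s30t here instead of extracting a partial result.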

Going by your example code:

^([fst]|[0-9]+[fst])$ 
  • ^: start of string
  • [fst]: one of f, s or t
  • |: OR
  • [0-9]+: one or more digits
  • [fst]: one of f, s or t
  • $: end of string

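In C#, drop the ^ and $ anchors when passing the pattern to Regex.Split: an anchored pattern would have to match the whole input, so it could never match a token inside f20s30t. Because the pattern is wrapped in a capturing group, Split includes the matched tokens in its result, along with the empty strings between them:

    using System;
    using System.Text.RegularExpressions;

    public class Program
    {
        public static void Main()
        {
            string input = "f20s30t";
            string pattern = "([fst]|[0-9]+[fst])";
            string[] substrings = Regex.Split(input, pattern);
            foreach (string match in substrings)
            {
                Console.WriteLine("'{0}'", match);
            }
        }
    }

    // Output:
    // ''
    // 'f'
    // ''
    // '20s'
    // ''
    // '30t'
    // ''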
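
Since Regex.Split with a capturing group returns empty strings around the captured tokens, one way to get only the tokens is to filter the empty entries out. This is a sketch under that assumption, not part of the original answer; the class name SplitDemo and the helper Tokens are hypothetical:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public class SplitDemo
{
    // Hypothetical helper: splits on the unanchored pattern and drops
    // the empty strings that Split places between adjacent matches.
    public static string[] Tokens(string input)
    {
        return Regex.Split(input, @"([fst]|[0-9]+[fst])")
            .Where(part => part.Length > 0)
            .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Tokens("f20s30t"))); // f,20s,30t
    }
}
```

Note that this approach does not validate the input: any text the pattern does not match is kept as its own entry rather than rejected.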

In regex, (f|s|t) matches the same characters as [fst], but the alternation has downsides: it creates a capturing group you may not need, and it is typically less efficient than a character class. Prefer the character class. With the Split method, you could split on this:

(?<=[fts])(?=\d+[fts]) 

The regex above matches the zero-width position that is preceded by a character from [fts] and followed by one or more digits and another character from [fts]. (?<=...) is a positive lookbehind and (?=...) is a positive lookahead; neither consumes characters, so Split removes nothing from the input.

Code:

    string input = "f20s30t";
    string pattern = @"(?<=[fts])(?=\d+[fts])";
    string[] substrings = Regex.Split(input, pattern);
    foreach (string match in substrings)
    {
        Console.WriteLine("{0}", match);
    }

Output:

    f
    20s
    30t
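
Because both lookarounds are zero-width, the split happens between characters and no text is lost. A small self-contained sketch that checks this, assuming the same pattern as above (the class name LookaroundDemo and the helper Tokens are hypothetical):

```csharp
using System;
using System.Text.RegularExpressions;

public class LookaroundDemo
{
    // Hypothetical helper: splits at every position that follows one of
    // f, t, s and precedes one or more digits plus one of f, t, s.
    public static string[] Tokens(string input)
    {
        return Regex.Split(input, @"(?<=[fts])(?=\d+[fts])");
    }

    public static void Main()
    {
        string input = "f20s30t";
        string[] parts = Tokens(input);
        Console.WriteLine(parts.Length);                  // 3
        // Joining the pieces gives back the original input unchanged.
        Console.WriteLine(string.Concat(parts) == input); // True
    }
}
```

Unlike the capturing-group split, this version produces no empty entries for this input, so no filtering step is needed.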
