RegEx for capturing everything except numbers and one word

Question

I am quite stuck with a regex I can't get to work. It should capture everything except digits and the word fiktiv (not single characters of it!). Objective is to get rid of this content.

I have tried something like (?!\d|fiktiv).* on my sample string 123456788daswqrt fiktiv

https://regex101.com/r/kU8mF3/1

However, this matches the fiktiv at the end as well.
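For illustration, here is a minimal Python sketch of what the attempted pattern does on the sample string. It assumes Python's re module rather than the .NET engine discussed later in the thread; for this particular pattern the two should behave the same. The negative lookahead is only checked at the position where the match starts, so once a start position passes the check, the greedy .* runs to the end of the line and swallows fiktiv.

```python
import re

sample = "123456788daswqrt fiktiv"  # sample string from the question

# The lookahead (?!\d|fiktiv) is tested once, at the start of the match.
# It does not constrain what the following .* is allowed to consume.
attempt = re.compile(r"(?!\d|fiktiv).*")

match = attempt.search(sample)

# The first position that is neither a digit nor the start of "fiktiv"
# is index 9 (the "d"), and .* then consumes the rest of the string.
print(match.start(), repr(match.group(0)))  # prints: 9 'daswqrt fiktiv'
```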

What language are you using? In most programming languages, to get rid of the content you'd match it and then replace it with an empty string. So, for example, on the command line (assuming Unix): awk '{gsub(/fiktiv/,"");gsub(/[0-9]/,"");print $0}'
– slebetman, Commented Aug 16, 2016 at 8:27
I guess you just need .replace(/(fiktiv)|\D/g, "$1") (in JS). What is the regex flavor and what is the expected output?
– Wiktor Stribiżew, Commented Aug 16, 2016 at 8:28
I am using SQL Server, which under the covers uses a .NET assembly.
– Martin Guth, Commented Aug 16, 2016 at 8:32
So, can you use Regex.Replace(input, "(fiktiv)|[^0-9]", "$1")? Or are you limited to the T-SQL toolset? See regex101.com/r/vR4uU0/1
– Wiktor Stribiżew, Commented Aug 16, 2016 at 8:34

prizm1 · Accepted Answer · 2016-08-16 08:30:49Z

2

One possibility would be to use a negated character class, which you get by putting a ^ at the start of the [] brackets. So you basically say: match as many non-digit characters as you can get, until whitespace occurs and the word fiktiv appears.

This match will be "saved" in capturing group 1 for later use.

([^\d]+)\s+fiktiv

Testing could be done here:

https://regex101.com/
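As a rough check of the pattern above, here is a small Python sketch (Python's re module is an assumption here; the question itself targets .NET, where the same pattern syntax applies). It runs the regex against the sample string from the question and reads capturing group 1.

```python
import re

sample = "123456788daswqrt fiktiv"  # sample string from the question

# Group 1 collects the run of non-digit characters; the whitespace and
# the literal word "fiktiv" are matched but stay outside the group.
pattern = re.compile(r"([^\d]+)\s+fiktiv")

match = pattern.search(sample)
captured = match.group(1) if match else None

print(repr(captured))  # prints: 'daswqrt'
```

Note that the overall match still includes fiktiv; only group 1 is restricted to the non-digit run in front of it.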


3 Comments

Martin Guth Over a year ago

Thanks for your help... but I am still struggling to prevent the word fiktiv from getting matched.

Wiktor Stribiżew Over a year ago

@MartinGuth: You should not try to avoid matching it, as that will make the regex inefficient and slow. See my approach: regex101.com/r/vR4uU0/1

prizm1 Over a year ago

The idea behind the regex is to strip out the unneeded characters by exposing the output in so-called capturing groups. Even if you explicitly state fiktiv in the regex and it gets matched, capturing group 1 will only contain what the () parentheses are wrapped around, so [^\d]+ in this example. You need to find out how your environment lets you access the capturing groups.

Wiktor Stribiżew · Accepted Answer · 2016-08-16 09:23:11Z

"It should capture everything except digits and the word fiktiv (not single characters of it!). Objective is to get rid of this content."

So, you want to remove any character that is not a digit (that is, any character matching \D or [^0-9]) and that is not part of a fiktiv character sequence.

You may use a regex with a capturing group and alternation:

(fiktiv)|[^0-9]

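and replace each match with the contents of Group 1 using the $1 backreference. When the fiktiv alternative matches, $1 restores fiktiv in the result; when [^0-9] matches instead, Group 1 is empty and the character is removed.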

See the regex demo

C# implementation:

Regex.Replace(input, "(fiktiv)|[^0-9]", "$1")

Also, see Use RegEx in SQL with CLR Procs.
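For readers who want to try the alternation approach outside .NET, here is a minimal Python sketch of the same idea. The function name keep_digits_and_word is hypothetical and only used for illustration; a replacement function is used in place of a "$1"-style backreference so that a non-participating Group 1 is explicitly turned into an empty string.

```python
import re

# Same pattern as above: either capture the word to keep (Group 1),
# or match a single non-digit character that should be dropped.
KEEP_WORD_OR_DROP_NON_DIGIT = re.compile(r"(fiktiv)|[^0-9]")


def keep_digits_and_word(text: str) -> str:
    """Remove everything except ASCII digits and the word 'fiktiv'."""
    # m.group(1) is None when the [^0-9] branch matched, so fall back to "".
    return KEEP_WORD_OR_DROP_NON_DIGIT.sub(lambda m: m.group(1) or "", text)


print(keep_digits_and_word("123456788daswqrt fiktiv"))  # prints: 123456788fiktiv
```

Because fiktiv is listed first in the alternation, the whole word is tried before the single-character branch at every position, which is why the word survives while its individual letters elsewhere would not.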
