How to find out if a string contains non-alphanumeric characters in C#/.NET 2.0?

Question

Allowed characters are (at least) A-Z, a-z, 0-9, ö, Ö, ä, Ä, å, Å, plus any German, Latvian, or Estonian special characters. Is there a ready-made method, or do I have to build a blacklist of non-allowed characters and check it with a regular expression's IsMatch? If there is no ready-made method, how do I use a blacklist?

Another thread with answers to consider: stackoverflow.com/questions/2371780. It might provide additional insight.
– John K, Commented Jun 17, 2010 at 12:57
Possible duplicate of ".net Regular Expression to match any kind of letter from any language".
– GvS, Commented Jun 17, 2010 at 12:59

Guffa · Accepted Answer · 2010-06-17 13:05:11Z

I don't know how special characters from all those languages are categorised, but you could check if the Char.IsLetterOrDigit method matches what you want to do. It works at least for the digits and letters I tested:

// All() is a LINQ extension method: it needs "using System.Linq;" and .NET 3.5 or later.
string test = "Aasdf345ÅÄÖåäöéÉóÓüÜïÏôÔ";
if (test.All(Char.IsLetterOrDigit)) { ... }

Char.IsLetterOrDigit returns true for characters that are categorised in Unicode as UppercaseLetter, LowercaseLetter, TitlecaseLetter, ModifierLetter, OtherLetter, or DecimalDigitNumber.
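Since the question mentions .NET 2.0, where LINQ is not available, the same check can be written as a plain loop. This is only a minimal sketch: the class name AlphanumericCheck and the method name IsLetterOrDigitOnly are made up for illustration, and treating an empty string as valid is an assumption chosen to mirror what All() does on an empty sequence.

```csharp
using System;

// Hypothetical helper for illustration; the names are not from the original answer.
public static class AlphanumericCheck
{
    // Returns true when every char is a Unicode letter or decimal digit.
    // Assumption: an empty string counts as valid, like All() on an empty sequence.
    public static bool IsLetterOrDigitOnly(string text)
    {
        if (text == null) throw new ArgumentNullException("text");

        foreach (char c in text)
        {
            if (!char.IsLetterOrDigit(c))
            {
                return false; // the first offending character ends the scan
            }
        }
        return true;
    }
}
```

Calling AlphanumericCheck.IsLetterOrDigitOnly("Aasdf345ÅÄÖ") should give true, while a string containing a space or punctuation should give false.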

What's test.All? That's not a string method, is it some kind of extension method? Or a LINQ method?
@Task All is a LINQ extension method; it is available on string because string implements IEnumerable<char>. See msdn.microsoft.com/en-us/library/system.string.aspx
Ah! The "Extension Methods" section of the documentation is new to me, I hadn't seen that before. I guess I've gotten too used to finding everything I need in the earlier "Properties" or "Methods" area. Thanks!

Flynn1179 · Accepted Answer · 2010-06-26 21:09:40Z

13

Investigate char.IsLetterOrDigit(char).

For example:

myString.All(c => char.IsLetterOrDigit(c));

edited Jun 26, 2010 at 21:09

answered Jun 17, 2010 at 12:47


3 Comments

Flynn1179 Over a year ago

Just curious, but why was this downvoted? As far as I can tell it's a perfectly valid way of doing what the OP asked.

Flynn1179 Over a year ago

Ah.. just had a closer look; never noticed the 0-9 requirement in there. I've amended my answer to use IsLetterOrDigit instead of just IsLetter.

gls123 Over a year ago

A shorthand for this is myString.All(char.IsLetterOrDigit);

Joey · Accepted Answer · 2010-06-17 13:35:12Z

A blacklist for characters is likely pretty large :-)

You can use the regular expression

^[\d\p{L}]+$

to match decimal digits and letters, regardless of script.

This regular expression consists of a character class containing the shorthands \d – which contains every digit (230 in total in the BMP) and \p{L} which contains every Unicode character classified as a "letter" (46817 in the BMP). Said character class is then repeated at least once and embedded between ^ and $ – the string start and end anchors, so it matches the complete string.
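As a rough sketch of how that pattern could be used from C#, assuming the standard System.Text.RegularExpressions.Regex class (the wrapper class and method names below are hypothetical):

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper for illustration; only the pattern comes from the answer.
public static class RegexAlphanumericCheck
{
    // \d     any Unicode decimal digit
    // \p{L}  any Unicode letter
    // ^...$  anchor the match to the whole string
    private static readonly Regex LettersAndDigits = new Regex(@"^[\d\p{L}]+$");

    // Returns true when the string is non-empty and holds only letters and digits.
    public static bool IsLetterOrDigitOnly(string text)
    {
        return LettersAndDigits.IsMatch(text);
    }
}
```

Because of the + quantifier, an empty string does not match. Also note that in .NET the $ anchor matches before a trailing newline as well, so a string ending in "\n" would still pass; replacing $ with \z is stricter if that matters.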

For some regex engines, since you're only interested in Latin letters, apparently, you could also use

^[\d\p{Latin}]+$

However, .NET doesn't support this. The first regex mentioned above actually catches everything that's a digit or a letter in any script. So it will dutifully match on Indian or Arabic numerals and Hebrew, Cyrillic and other non-Latin scripts. Depending on what you want this may not be appropriate.

If that poses a problem, then I see no better option than to explicitly list the characters you want to allow. However, I consider it dangerous to assume that text in a certain language is always restricted to that language's script. If I were to write a Czech or Polish name in a German text, then I'd likely need more than just [a-zA-ZäöüÄÖÜß].
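If you do go the explicit-list route, a minimal sketch might look like the following. The set of extra letters is a hypothetical, deliberately incomplete example (it would need the Latvian and Estonian letters added, for instance), and the class and method names are made up.

```csharp
using System;

// Hypothetical whitelist check for illustration only.
public static class WhitelistCheck
{
    // Assumption: an incomplete sample of extra letters; extend it to suit your data.
    private const string ExtraAllowed = "äöüÄÖÜßåÅ";

    // Returns true when every char is ASCII alphanumeric or in ExtraAllowed.
    public static bool IsAllowedOnly(string text)
    {
        if (text == null) throw new ArgumentNullException("text");

        foreach (char c in text)
        {
            bool asciiAlphanumeric =
                (c >= 'a' && c <= 'z') ||
                (c >= 'A' && c <= 'Z') ||
                (c >= '0' && c <= '9');

            if (!asciiAlphanumeric && ExtraAllowed.IndexOf(c) < 0)
            {
                return false; // character is not on the list
            }
        }
        return true;
    }
}
```

With this list, "Müller" passes but "Dvořák" does not, which is exactly the kind of surprise described above.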

thanks! can you please explain how ^[\d\p{L}]+$ works. I checked from the web but I couldn't sum it up entirely...

Brook Julias · Accepted Answer · 2010-06-17 12:49:20Z

0

It would be simpler to match the allowed characters and catch a false return.

answered Jun 17, 2010 at 12:49

