JavaScript regular expressions to match no digits, whitespace and selected symbols

Question

Thanks for taking a look.

My goal is to come up with a regexp that will match input that contains no digits, whitespace or the symbols !@£$%^&*()+= or any other symbol I may choose.

I am however struggling to grasp precisely how regular expressions work.

I started out with the simple pattern /\D/, which from my understanding will match the first non-digit character it can find. This would match the string 'James' which is correct but also 'James1' which I don't want.

So, my understanding is that if I want to ensure that a pattern is not found anywhere in a given string, I use the ^ and $ characters, as in /^\D$/. Now because this will only match a single character that is not a digit, I needed to use + to specify that 1 or more digits should not be found in the entire string, giving me the expression /^\D+$/. Brilliant, it no longer matches 'James1'.

Question 1

Is my reasoning up to this point correct?

The next requirement was to ensure no whitespace is in the given string. \s will match a single whitespace character and [^\s] will match the first non-whitespace character. So, from my understanding, I just had to add this to what I already have to match strings that contain no digits and no whitespace. Again, because [^\s] will only match a single non-whitespace character, I used + to match one or more non-whitespace characters, giving the new regexp /^\D+[^\s]+$/.

This is where I got lost, as the expression now matches 'James1' or even 'James Smith25'. What? Massively confused at this point.

Question 2

Why is /^\D+[^\s]+$/ matching strings that contain spaces?

Question 3

How would I go about writing the regular expression I'm trying to solve?

While I am keen to solve the problem I am more interested in figuring where my understanding of regular expressions is lacking, so any explanations would be helpful.

ahri · Accepted Answer · 2015-06-26 14:35:09Z

Not quite; ^ and $ are actually "anchors": they mean "start" and "end". It's a little more complicated than that, but for now you can consider them to mean the start and end of the input (or of each line, when the m flag is set). Look up the various modifiers on regular expressions if you're interested in learning more about this. Unfortunately ^ has an overloaded meaning: as the first character inside square brackets it means "not", which is the meaning you are already acquainted with. It's very important that you understand the difference between these two meanings, and that the "not" meaning applies only to character range matching!
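To make the two meanings of ^ concrete, here is a minimal sketch. The patterns and sample strings are made up for illustration and are not part of the question.

```javascript
// Outside square brackets, ^ is an anchor: the match must begin at the start of the input.
const startsWithJ = /^J/;
// As the first character inside square brackets, ^ negates the class:
// it matches any single character other than J.
const notJ = /[^J]/;

console.log(startsWithJ.test('James'));    // true: the string starts with J
console.log(startsWithJ.test('Mr James')); // false: J is not at the start
console.log(notJ.test('James'));           // true: 'a' is a character other than J
console.log(notJ.test('JJJ'));             // false: every character is J
```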

Contributing further to your confusion is that \d means "a numerical digit" and \D means "not a numerical digit". Similarly \s means "a whitespace (space/tab/newline/etc.) character" and \S means "not a whitespace character."

It's worth noting that \d is effectively a shortcut for [0-9] (note that - has a special meaning inside square brackets), and \D is a shortcut for [^0-9].

The reason it's matching strings that contain spaces is that you've asked for "one or more non-digit characters followed by one or more non-space characters", so it'll match lots of strings! I think that perhaps you don't understand that regular expressions match bits of strings: you're not adding constraints as you go, but rather building up bits of matchers that will match the corresponding bits of strings.

/^[^\d\s!@£$%^&*()+=]+$/ is the answer you're looking for - I'd look at it like this:

i. [] - match a range of characters

ii. []+ - match one or more of that range of characters

iii. [^\d\s]+ - match one or more characters that do not match \d (numerical digit) or \s (whitespace)

iv. [^\d\s!@£$%^&*()+=]+ - here's a bunch of other characters I don't want you to match

v. ^[^\d\s!@£$%^&*()+=]+$ - now there are anchors applied, so this matcher has to apply to the whole line otherwise it fails to match
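The effect of the last step can be checked with a short sketch. The variable names and sample inputs below are illustrative assumptions; only the pattern itself comes from the answer.

```javascript
// Step iv: the negated class with +, but no anchors.
const unanchored = /[^\d\s!@£$%^&*()+=]+/;
// Step v: the same class, anchored to the whole input.
const anchored = /^[^\d\s!@£$%^&*()+=]+$/;

// Without anchors it is enough for some part of the string to match.
console.log(unanchored.test('James1'));      // true: 'James' matches

// With anchors every character must be an allowed one.
console.log(anchored.test('James'));         // true
console.log(anchored.test('James1'));        // false: contains a digit
console.log(anchored.test('James Smith25')); // false: contains a space and digits
console.log(anchored.test('Ja$mes'));        // false: contains an excluded symbol
```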

A useful website for exploring regexes is http://regexr.com/3b9h7 - which I supply with my suggested solution as an example. Edit: Pruthvi Raj's link to Debuggex is awesome!

Wiktor Stribiżew · Accepted Answer · 2015-06-26 16:05:07Z

Is my reasoning up to this point correct?

Almost. /\D/ matches any character other than a digit, and not just the first one (if you use the g flag).

and [^\s] will match the first non-whitespace character

Almost. [^\s] will match any non-whitespace character, and not just the first one (if you use the g flag).
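A small sketch of what the g flag changes, using String.prototype.match on a made-up input:

```javascript
const input = 'Ja1 m2';

// Without g, match() reports only the first match.
const firstNonDigit = input.match(/\D/)[0];
// With g, match() returns every match in the string.
const allNonDigits = input.match(/\D/g);
const allNonWhitespace = input.match(/[^\s]/g);

console.log(firstNonDigit);    // 'J'
console.log(allNonDigits);     // [ 'J', 'a', ' ', 'm' ] (the space is a non-digit too)
console.log(allNonWhitespace); // [ 'J', 'a', '1', 'm', '2' ]
```

Note that the space shows up among the non-digit matches.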

/^\D+[^\s]+$/ matching strings that contain spaces?

Yes, it does, because \D matches a space (space is not a digit).

Why is /^\D+[^\s]+$/ matching strings that contain spaces?

Because \D+ in /^\D+[^\s]+$/ can match spaces.
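One way to see this is to wrap each part of the pattern in a capture group and look at which piece of the string it consumed. The groups are added here purely for illustration; they do not change what the pattern matches.

```javascript
// Same pattern as in the question, with a capture group around each part.
const grouped = /^(\D+)([^\s]+)$/;

const withSpace = 'James Smith25'.match(grouped);
console.log(withSpace[1]); // 'James Smith' (\D+ happily consumes the space)
console.log(withSpace[2]); // '25'          ([^\s]+ accepts digits)

const withDigit = 'James1'.match(grouped);
console.log(withDigit[1]); // 'James'
console.log(withDigit[2]); // '1'
```

Each part only constrains the stretch of text it matches itself, so neither restriction applies to the whole string.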

Conclusion:

Use

^[^\d\s!@£$%^&*()+=]+$

It will match strings that contain no digits, no whitespace, and none of the symbols you do not allow.

Mind that -, ] and [ need care inside a character class. A literal - must either be escaped or placed at the start or end of the class. In JavaScript a literal ] must be escaped, and escaping [ is usually optional but harmless. To play it safe, escape all three.
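A rough sketch of the hyphen and bracket behaviour in JavaScript. The patterns and names are illustrative; the last one assumes you also want to exclude -, [ and ], which the question does not ask for.

```javascript
// An unescaped hyphen between two characters defines a range.
const range = /[a-c]/;
// Escaped, or placed last in the class, the hyphen is a literal character.
const escapedHyphen = /[a\-c]/;
const trailingHyphen = /[ac-]/;

console.log(range.test('b'));          // true: b lies in the range a to c
console.log(escapedHyphen.test('b'));  // false: only a, - and c are in the class
console.log(escapedHyphen.test('-'));  // true
console.log(trailingHyphen.test('-')); // true

// Hypothetical variant that also excludes -, [ and ], all escaped to be safe.
const noHyphenOrBrackets = /^[^\d\s\-\[\]]+$/;
console.log(noHyphenOrBrackets.test('James'));     // true
console.log(noHyphenOrBrackets.test('Mary-Jane')); // false
console.log(noHyphenOrBrackets.test('a[b]'));      // false
```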

Looks like I overlooked the !@£$%^&*()+= requirement, just added them to the final solution. I also hope my final note will be of value for you.

Pruthvi Raj · Accepted Answer · 2015-06-27 08:25:50Z

Just insert every character you don't want to include in a negated character class as follows:

^[^\s\d!@£$%^&*()+=]*$

^ - start of the string

[^...] - matches one character that is not in `...`

\s - matches a whitespace character (space, newline, tab)

\d - matches a digit from 0 to 9

* - a quantifier that repeats the immediately preceding element 0 or more times

$ - end of the string

So the regex matches any string that:

1. has a beginning

2. contains 0 or more characters, none of which is whitespace, a digit, or one of the symbols included in the character class (in this example !@£$%^&*()+=), i.e. only characters that are not listed in the character class `[...]`

3. has an ending
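Because * allows zero repetitions, this version also accepts the empty string, which the + version in the other answers rejects. A minimal sketch of the difference (variable names are illustrative):

```javascript
// * means "0 or more", so an empty input satisfies the pattern.
const allowEmpty = /^[^\s\d!@£$%^&*()+=]*$/;
// + means "1 or more", so at least one allowed character is required.
const requireOne = /^[^\s\d!@£$%^&*()+=]+$/;

console.log(allowEmpty.test(''));       // true
console.log(requireOne.test(''));       // false
console.log(allowEmpty.test('James'));  // true
console.log(requireOne.test('James'));  // true
console.log(allowEmpty.test('James1')); // false
console.log(requireOne.test('James1')); // false
```

Which quantifier to pick depends on whether an empty input should count as valid.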

NOTE:

If the symbols you want to exclude also include a hyphen (-), don't put it between other characters, because it is a metacharacter inside a character class; put it at the end of the character class.
