0

My search input is sanitized with preg_replace, but I want it to accept other languages as well.

Keep - Chinese, Japanese, German, etc.

Remove - symbol characters such as @#$%^*()

This one keeps only English letters and digits: preg_replace("/[^a-zA-Z0-9]+/", "", $search);

Is there any way to set this up for multiple languages?

2
  • So, what do you want to keep and what do you want to remove? Might it be easier to use a blacklist rather than a whitelist? Commented Jun 12, 2013 at 7:38
  • To regex/php, æ is as much of a unicode character as ^. Your best bet is to make a blacklist, and simply replace with that. (preg_replace("/[!\"#¤%&()=\/]+/u", "", $search);). Commented Jun 12, 2013 at 7:39

2 Answers 2

3

Though written for Java, there is a concise overview here.

In Java you can use the so-called POSIX notation:

[^\p{Alnum}\p{M}] 

The first is the alphanumeric group, and the second is the combining diacritical marks: the accents. The latter should not be forgotten, because ĉ can be written as the single Unicode code point c-circumflex, but also as 'c' followed by a combining circumflex ^ (zero width, represented here by the normal circumflex). In some languages a base letter can carry more than one mark.


Correction: \p{Alnum} is Java notation. In PHP, use the Unicode property classes instead:

[^\p{L}\p{N}\p{M}] 

3 Comments

The PHP reference for the same: php.net/manual/en/regexp.reference.unicode.php
What should I do if I need to keep @, as in @Hello? I want this for a mention feature.
@SandipJha I am unsure what you want. Since this question is also closed, just ask a new question yourself.
1

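Here's the PHP unicode regex reference. The plus + isn't necessary, as preg_replace replaces every match in the string. The \s will match all whitespace characters. The u modifier after the closing delimiter makes PCRE treat the pattern and the input as UTF-8; without it, multibyte characters are processed byte by byte and can come out garbled.

preg_replace("![^\p{L}\p{N}\s]!u", "", $search); 

If you want to match only the space character itself you would add it in the brackets as a literal:

preg_replace("![^\p{L}\p{N} ]!u", "", $search); 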
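The difference between \s and a literal space can be seen in a short sketch like the one below. The sample string and variable names are made up for illustration, and the u modifier is assumed so that the UTF-8 input is handled per character.

```php
<?php
// Illustrative input: letters, a tab, a space, digits, and two symbols.
$search = "Straße\t日本語 123!?";

// \s inside the class keeps every whitespace character, including the tab.
$keepAllWhitespace = preg_replace('![^\p{L}\p{N}\s]!u', '', $search);

// A literal space inside the class keeps only the space; the tab is removed.
$keepSpacesOnly = preg_replace('![^\p{L}\p{N} ]!u', '', $search);

var_dump($keepAllWhitespace === "Straße\t日本語 123"); // bool(true)
var_dump($keepSpacesOnly === "Straße日本語 123");      // bool(true)
```

Which variant fits depends on whether tabs and newlines in the search box should count as word separators or be dropped.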

Update: Added a bit about spaces per a comment request.

4 Comments

How do I keep spaces (" ")?
I've added the whitespace character match \s.
One more question: when I output the result, it turns into symbols.
That has to do with the encodings. stackoverflow.com/questions/1696619/…
