I'm trying to find every word in an HTML paragraph and wrap each one in a span.

Currently, I have a function like this:

p.html(function(index, oldHtml) {
  return oldHtml.replace(/\b(\w+?)\b/g, '<span>$1</span>');
});

But it only matches words without accents. I tested it on regex101.com: https://www.regex101.com/r/jS5gW6/1

Any ideas?

  • You need to use Unicode. Commented Dec 23, 2014 at 14:14
  • Do you need to find only words containing accents, or all words? Note that part of your problem is that \w does not recognize accented characters as 'word' characters, and another part is that \b internally uses the definition of \w to scan for word boundaries. So, even adding é and ç to a class with \w does not solve everything. Commented Dec 23, 2014 at 14:19
  • @Jongware I need to find all words. Commented Dec 23, 2014 at 14:53
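
To illustrate the point made in the comments above, here is a minimal sketch of what the original pattern does to an accented word. It assumes a plain JavaScript regex without the u flag, where \w only covers [A-Za-z0-9_]; the variable names are made up for the example.

```javascript
// 'géçhj' written with escapes so the accented letters are precomposed characters.
var sample = 'g\u00e9\u00e7hj';

// \w does not treat accented letters as word characters.
var isWordChar = /\w/.test('\u00e9'); // false

// \b relies on \w, so a boundary appears on each side of the accented run
// and the word is split into its ASCII pieces.
var pieces = sample.match(/\b(\w+?)\b/g); // ['g', 'hj']

console.log(isWordChar, pieces);
```

So the accented letters are never captured, and the letters around them are wrapped as separate "words".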

1 Answer

Use a character class:

oldHtml.replace(/([\wàâêëéèîïôûùüç]+)/gi, '<span>$1</span>'); 

Trying it:

var oldHtml = 'kjh À ùp géçhj ùù Çfg';
var res = oldHtml.replace(/([\wàâêëéèîïôûùüç]+)/gi, '<span>$1</span>');

gives

"<span>kjh</span> <span>À</span> <span>ùp</span> <span>géçhj</span> <span>ùù</span> <span>Çfg</span>"
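
The character class above only covers the letters listed in it. As a possible alternative (not part of the original answer), engines that support ES2018 Unicode property escapes can match any letter with \p{L}. This is only a sketch under that assumption; wrapWords is a hypothetical helper name, and like the original it would also wrap tag names if the input contains markup.

```javascript
// Hypothetical helper: wraps every run of letters, combining marks, digits
// and underscores in a <span>. Requires the u flag (ES2018 or later).
function wrapWords(text) {
  // \p{L} = any letter, \p{M} = combining marks, \p{N} = any number.
  return text.replace(/[\p{L}\p{M}\p{N}_]+/gu, '<span>$&</span>');
}

var sample = 'kjh À ùp géçhj ùù Çfg';
var res = wrapWords(sample);
console.log(res);
```

With this pattern no i flag and no hand-written list of accented letters is needed, at the cost of dropping support for older browsers.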

2 Comments

How odd -- but thanks for clarifying! I assumed i would not 'ignore the case' because JS's regex does not support accented characters. I suppose it uses the general toLowerCase() underneath, then.
Thank you, it works... Regex is complicated :-)
