0

Is it possible to wrap each word on an HTML page in a span element? I'm trying something like

/(\s*(?:<\/?\w+[^>]*>)|(\b\w+\b))/g 

but the results are far from what I need.

Thanks in advance!

4
  • 6
    You really shouldn't parse HTML with regex Commented Aug 21, 2011 at 21:28
  • 2
    You can't parse HTML with regex, only Chuck Norris can. stackoverflow.com/questions/1732348/… Commented Aug 21, 2011 at 21:29
  • 2
    Of course you can use regexes to parse HTML. In fact, sometimes you even should. However, JavaScript has some of the most horrible regexes of any programming language anywhere. The XRegExp plugin helps, but it still sucks. It's easier to teach a pig to sing, and less annoying. Either do all Real™ work server-side where you can use a Real™ programming language, or else be prepared to improvise a 6-voice fugue for unaccompanied porcine chorus. Commented Aug 22, 2011 at 0:16
  • Thanks guys, it seems I need to look in the direction of getting all text nodes and working with them. Commented Aug 22, 2011 at 7:45

5 Answers

2

Well, I won't ask for the reason; you could do it like this:

    function getChilds(nodes) {
        var len = nodes.length;
        // Walk backwards so inserted spans do not shift the indexes still to visit.
        while (len--) {
            var node = nodes[len];
            if (node.nodeType === 1) {
                // Element: recurse into its children.
                getChilds(node.childNodes);
            } else if (node.nodeType === 3) {
                // Text node: replace it, in place, with one span per word.
                var parent = node.parentNode;
                node.textContent.split(/\s+/).forEach(function (word) {
                    if (!word) { return; } // empty strings from leading/trailing whitespace
                    var s = document.createElement('span');
                    s.textContent = word + ' ';
                    parent.insertBefore(s, node);
                });
                parent.removeChild(node);
            }
        }
    }
    getChilds(document.body.childNodes);

I have to admit, though, that I haven't tested the code yet. It was just the first thing that came to mind. It might be buggy or fail completely, but in that case I know the gentle and kind Stack Overflow community will kick my ass and downvote like hell :-p
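One thing the snippet above does not preserve is the original spacing, since split(/\s+/) discards it and every word gets a single trailing space. Below is a minimal sketch of one way around that, assuming it is fine to leave whitespace as plain text nodes between the spans. The names tokenize and wrapTextNode are hypothetical helpers made up for this illustration, and a "word" here simply means a run of non-whitespace characters, so punctuation stays attached to its word.

```javascript
// Hypothetical helper: split text into runs of whitespace and runs of
// non-whitespace, keeping the whitespace so the original spacing survives.
function tokenize(text) {
    // A capturing group in split() keeps the separators in the result.
    return text.split(/(\s+)/).filter(function (part) {
        return part !== '';
    });
}

// Hypothetical helper: replace one text node with spans (for words) and
// text nodes (for whitespace), in the original position and order.
function wrapTextNode(textNode) {
    var doc = textNode.ownerDocument;
    var fragment = doc.createDocumentFragment();
    tokenize(textNode.nodeValue).forEach(function (part) {
        if (/^\s+$/.test(part)) {
            // Whitespace is kept as-is, outside any span.
            fragment.appendChild(doc.createTextNode(part));
        } else {
            var span = doc.createElement('span');
            span.textContent = part; // textContent, so "<" and "&" are not parsed as HTML
            fragment.appendChild(span);
        }
    });
    textNode.parentNode.replaceChild(fragment, textNode);
}
```

In a browser, wrapTextNode would be called once for each text node you want to process; tokenize on its own has no DOM dependency, so it can be tried out anywhere.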


3 Comments

Why this line: var each = Array.prototype.forEach;? There doesn't seem to be a point to it.
Yeah, the first line is confusing, could you explain this? Anyway, with some modification this solved my problem. Thanks!
@Brock: yay, you're right. That's a leftover from an earlier version. I'll remove it.
2

You're going to have to get down to the "Text" nodes to make this happen. Without making it specific to a tag, you really need to traverse every element on the page, wrap its text, and re-append it.

With that said, try something like what a garble post makes use of (minus the filters for words with 4+ characters and the mixing up of letters).
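The traversal described above could be sketched roughly as follows. This is only an illustration under a few assumptions: collectTextNodes is a hypothetical name, skipping script and style elements is my own choice, and the text nodes are collected first so that the tree is not modified while it is still being walked.

```javascript
// Hypothetical helper: gather every non-blank text node under `root`.
// Collecting first and modifying afterwards avoids changing the tree
// while it is still being walked.
function collectTextNodes(root, found) {
    found = found || [];
    var children = root.childNodes || [];
    for (var i = 0; i < children.length; i++) {
        var child = children[i];
        if (child.nodeType === 3) {            // 3 is Node.TEXT_NODE
            if (/\S/.test(child.nodeValue)) {  // ignore whitespace-only nodes
                found.push(child);
            }
        } else if (child.nodeType === 1) {     // 1 is Node.ELEMENT_NODE
            var tag = child.nodeName.toLowerCase();
            // Assumption: script and style contents should not be wrapped.
            if (tag !== 'script' && tag !== 'style') {
                collectTextNodes(child, found);
            }
        }
    }
    return found;
}
```

In a browser, collectTextNodes(document.body) returns a plain array, so each text node in it can then be split and replaced without disturbing the loop.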

1 Comment

That was a fun topic, wasn't it?
1

To get all words between span tags from the current page, you can use:

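    var spans = document.body.getElementsByTagName('span');
    if (spans.length) {
        // Index loop: for...in on an HTMLCollection also visits "length", "item", etc.
        for (var i = 0; i < spans.length; i++) {
            // Report only spans whose content consists of word characters (or "*").
            if (spans[i].innerHTML && !/[^\w*]/.test(spans[i].innerHTML)) {
                alert(spans[i].innerHTML);
            }
        }
    } else {
        alert('span tags not found');
    }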

1 Comment

My understanding is not to filter based on if they're already in a span, but to make every word itself get wrapped in a new span. ...maybe I'm misinterpreting?
1

You should probably start off by getting all the text nodes in the document, and working with their contents instead of on the HTML as a plain string. It really depends on the language you're working with, but you could usually use a simple XPath like //text() to do that.

In JavaScript, that would be document.evaluate('//text()', document.body, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null), then iterating over the results and working with each text node separately.
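As a rough sketch of what iterating over that result might look like: the helper name textNodesViaXPath is made up for this example, and I have swapped the expression for .//text()[normalize-space()], since a leading // always starts from the document root regardless of the context node, while .// stays under the node passed in and the predicate skips whitespace-only nodes. A snapshot result is used because it stays valid while the document is being modified, which an iterator result does not.

```javascript
// Hypothetical helper: return all non-blank text nodes under `root` as an array.
function textNodesViaXPath(root) {
    // A Document has no ownerDocument, so fall back to root itself.
    var doc = root.ownerDocument || root;
    var result = doc.evaluate(
        './/text()[normalize-space()]', // relative to root; skips whitespace-only nodes
        root,
        null,
        XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
        null
    );
    // Copy the snapshot into a plain array so it is easy to loop over.
    var nodes = [];
    for (var i = 0; i < result.snapshotLength; i++) {
        nodes.push(result.snapshotItem(i));
    }
    return nodes;
}
```

In a browser this would be called as textNodesViaXPath(document.body), after which each returned text node can be replaced independently.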


1

See demo

Here's how I did it; it may need some tweaking...

    var wrapWords = function (el) {
        // Elements whose contents are left untouched.
        var skipTags = { style: true, script: true, iframe: true, a: true },
            child, tag;

        // Wrap a text node in a span; textContent avoids re-parsing "<" or "&" as HTML.
        var wrap = function (textNode) {
            var span = document.createElement("span");
            span.textContent = textNode.textContent;
            textNode.parentNode.replaceChild(span, textNode);
        };

        // Walk backwards so nodes created by splitText() do not shift unvisited indexes.
        for (var i = el.childNodes.length - 1; i >= 0; i--) {
            child = el.childNodes[i];
            if (child.nodeType == 1) {
                tag = child.nodeName.toLowerCase();
                if (!(tag in skipTags)) {
                    wrapWords(child);
                }
            } else if (child.nodeType == 3 && /\w+/.test(child.textContent)) {
                var si, rest;
                while ((si = child.textContent.indexOf(' ')) >= 0) {
                    if (si == 0) {
                        // Leading space: leave it as its own text node and move on.
                        child = child.splitText(1);
                    } else {
                        // Split off the word before the space and wrap it.
                        rest = child.splitText(si);
                        wrap(child);
                        child = rest;
                    }
                }
                // Wrap whatever is left, unless it is empty.
                if (child.textContent.length > 0) {
                    wrap(child);
                }
            }
        }
    };
    wrapWords(document.body);

