23

How do I get the text of an element without the text of its children? Neither element.textContent nor element.innerText seems to work.

HTML:

    <body>
      <h1>Test Heading</h1>
      <div>
        Awesome video and music. Thumbs way up. Love it. Happy weekend to you and your family. Love, Sasha
      </div>
    </body>
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.6.4/jquery.min.js"></script>
    <script type="text/javascript">
      fool("body");
    </script>

and here's the fool function:

    jQuery.fn.justtext = function(text) {
        return $(this).clone()
            .children()
            .remove()
            .end()
            .text();
    };

    function fool(el) {
        reverse(el);

        function reverse(el) {
            $(el).children().each(function() {
                if ($(this).children().length > 0) {
                    reverse(this);
                    if ($(this).justtext() != "")
                        reverseText(this);
                } else {
                    reverseText(this)
                }
            });
        }

        function reverseText(el) {
            var text = el.textContent;
            var frag = text.toString().split(/ /);
            var foo = "";
            var punctation_marks = [".", ",", "?", "!", " ", ":", ";"];
            for (i in frag) {
                if (punctation_marks.indexOf(frag[i]) == -1)
                    foo += actualReverse(frag[i], punctation_marks) + " ";
            }
            el.textContent = foo;
        }

        function actualReverse(text, punctation_marks) {
            return (punctation_marks.indexOf(text.split("")[text.split("").length-1]) != -1)
                ? text.split("").slice(0, text.split("").length-1).reverse().join("") + text.split("")[text.split("").length-1]
                : text.split("").reverse().join("");
        }
    }

edit: using node.nodeType doesn't really help, and here's why. Imagine the following HTML:

    <td class="gensmall">
      Last visit was: Sat Mar 31, 2012 10:50 am
      <br>
      <a href="./search.php?search_id=unanswered">View unanswered posts</a>
      |
      <a href="./search.php?search_id=active_topics">View active topics</a>
    </td>

If I used nodeType, only the text of the a elements would change, but not the text of the td itself ("Last visit was: ...").

7
  • Any code? Your selector is probably wrong; the element does not contain any content when .textContent and .innerText are empty. Commented Mar 31, 2012 at 12:47
  • What exactly do you mean by, "without children"? Commented Mar 31, 2012 at 13:07
  • @Pointy I only wanna have (as related to the last example - the td cell) the "Last visit was: Sat Mar 31, 2012 10:50 am" without the text from the anchors Commented Mar 31, 2012 at 13:17
  • So you want the text content of a node and all its descendant nodes? See the "without children" part made me think that you wanted to skip descendants. Commented Mar 31, 2012 at 13:18
  • No, I DON'T want the text + descendant but only the text Commented Mar 31, 2012 at 13:23

5 Answers

30

Just find the text nodes:

    var element = document.getElementById('whatever'), text = '';
    for (var i = 0; i < element.childNodes.length; ++i)
        if (element.childNodes[i].nodeType === Node.TEXT_NODE)
            text += element.childNodes[i].textContent;

edit: if you also want the text in descendant ("children") nodes, and (as is now apparent) you're using jQuery:

    $.fn.allText = function() {
        var text = '';
        this.each(function() {
            $(this).contents().each(function() {
                if (this.nodeType == Node.TEXT_NODE)
                    text += this.textContent;
                else if (this.nodeType == Node.ELEMENT_NODE)
                    text += $(this).allText();
            });
        });
        return text;
    };

Hold on and I'll test that out :-) (seems to work)


7 Comments

I suggest using Node.TEXT_NODE instead of "3" as it's more readable.
You might also have to recurse through non-text nodes to find the text nodes they contain (e.g. table cells).
@David-SkyMesh: That's exactly my problem. I edited my question
@Pointy has the answer more or less correct. Write a function called getText() that does what he wrote (the FOR and IF), but add an ELSE case to the IF that calls getText recursively on element.childNodes[i].
If the non-text nodes provide structure to the text (e.g. table rows), you may wish to interpret those as well (e.g. insert a "\n" in your result text).
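The recursion and the "\n" idea from the comments above could be sketched roughly as follows. This is only an illustration, not Pointy's code: getText and its breakAfter parameter are names chosen here, and the textNode/elementNode helpers are plain-object stand-ins for DOM nodes so the sketch can run outside a browser.

```javascript
// Node type constants, matching Node.TEXT_NODE and Node.ELEMENT_NODE in the DOM.
const TEXT_NODE = 3;
const ELEMENT_NODE = 1;

// Hypothetical helper: collects text from a node and all of its descendants.
// Elements whose name is listed in `breakAfter` add a "\n" after their content.
function getText(node, breakAfter = ["BR", "TR"]) {
  let text = "";
  for (const child of Array.from(node.childNodes || [])) {
    if (child.nodeType === TEXT_NODE) {
      text += child.textContent;
    } else if (child.nodeType === ELEMENT_NODE) {
      text += getText(child, breakAfter); // recurse into the element
      if (breakAfter.includes(child.nodeName.toUpperCase())) {
        text += "\n";
      }
    }
  }
  return text;
}

// Plain-object stand-ins for DOM nodes (assumption: not real DOM objects).
const textNode = (value) => ({ nodeType: TEXT_NODE, textContent: value, childNodes: [] });
const elementNode = (name, ...children) => ({
  nodeType: ELEMENT_NODE,
  nodeName: name,
  childNodes: children,
});

// Mirrors the <td class="gensmall"> example from the question.
const td = elementNode(
  "TD",
  textNode("Last visit was: Sat Mar 31, 2012 10:50 am"),
  elementNode("BR"),
  elementNode("A", textNode("View unanswered posts")),
  textNode(" | "),
  elementNode("A", textNode("View active topics"))
);

console.log(getText(td));
```

In a browser, the same function should accept a real element, since it only reads childNodes, nodeType, nodeName and textContent.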
8

The text of an element is also a separate node. Consider this piece of code:

    <span>
      Some text
      <span>Inner text</span>
      More text
      <span>More inner text</span>
      Even more text
    </span>

What do you mean now when you say you want the text of the element? Just the direct children?

Then this piece of code may help:

    for (const node of element.childNodes) {
        if (node.nodeType == Node.TEXT_NODE) {
            // do something with node.textContent
        }
    }
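To make the "just the direct children" reading concrete, here is a minimal sketch applied to the nested span example. directTextParts is a hypothetical name, and the outer object is a plain stand-in for the outer span element so the code can run outside a browser.

```javascript
const TEXT_NODE = 3;    // same value as Node.TEXT_NODE
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE

// Hypothetical helper: returns the trimmed text of each direct text node,
// skipping child elements and whitespace-only nodes.
function directTextParts(element) {
  return Array.from(element.childNodes)
    .filter((node) => node.nodeType === TEXT_NODE)
    .map((node) => node.textContent.trim())
    .filter((part) => part !== "");
}

// Stand-in for the outer <span> from the example (assumption: not a real DOM node).
const outer = {
  nodeType: ELEMENT_NODE,
  childNodes: [
    { nodeType: TEXT_NODE, textContent: " Some text " },
    { nodeType: ELEMENT_NODE, textContent: "Inner text" },
    { nodeType: TEXT_NODE, textContent: " More text " },
    { nodeType: ELEMENT_NODE, textContent: "More inner text" },
    { nodeType: TEXT_NODE, textContent: " Even more text " },
  ],
};

console.log(directTextParts(outer)); // the three direct text pieces, inner spans skipped
```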


7

This code achieves the same result as the two other answers, but in a more expressive, functional way. The filter and map array methods are supported in all modern browsers (IE9 and up).

Throwing this in there since the other answers are a bit dated by now.

    var content = Array.prototype.filter.call(element.childNodes, function (element) {
        return element.nodeType === Node.TEXT_NODE;
    }).map(function (element) {
        return element.textContent;
    }).join("");

1 Comment

Using arrow functions: Array.prototype.filter.call(ELEMENT.childNodes, e => e.nodeType === Node.TEXT_NODE).map(e => e.textContent).join('');
2

In addition to answers like Pointy's, a newline for each <br/> can be handled like this:

    // textWithBreaks is an illustrative name; the wrapper makes the return statement valid
    function textWithBreaks(element) {
        var txt = '';
        for (var i = 0; i < element.childNodes.length; ++i) {
            var child = element.childNodes[i];
            if (child.nodeType == 3) {            // text node
                txt += child.textContent;
            } else if (child.nodeType == 1) {     // element node
                var name = child.nodeName || child.tagName || '';
                if (name.toUpperCase() == 'BR') {
                    txt += '\n';
                }
            }
        }
        return txt;
    }


1

You can filter and reduce to get text nodes without child nodes:

    [...document.querySelector("#myelement").childNodes]
        .filter(node => node.nodeType == 3)
        .reduce((acc, node) => acc + node.textContent.trim(), "")

Given the following HTML:

    <html>
      <body>
        <div id="myelement">
          First Text
          <span id="nestedelement">Extra Text</span>
          Last Text
        </div>
      </body>
    </html>

Let's break this down.

First, find your HTMLElement using a query selector:

    /* css selector to find element by ID */
    var locator = "#myelement"
    var element = document.querySelector(locator)
    /* or, in the browser console only, use the $ shorthand for querySelector */
    // element = $(locator)
    console.log("element", element)
    /* This includes the extra text we don't want */
    console.log("innerText", element.innerText)

Get children as a NodeList, and convert to an Array so we can iterate over it

    var nodes = Array.from(element.childNodes)
    /* or use the "spread" operator */
    nodes = [...element.childNodes]

Print the content of each text node by checking nodeType

    for (const node of nodes) {
        if (node.nodeType == 3) {
            var nodeText = node.textContent.trim()
            console.log("each node text: ", nodeText)
        }
    }

You can stop here if you want, since you can get everything you want from this. But let's look at using "functional" patterns to do the same thing.

Iterate using Array.forEach() and accumulate the content of text nodes with a "closure" (fancy word for variable in outer scope)

    var textNodes = []
    nodes.forEach(node => {
        if (node.nodeType == 3) { /* TEXT_NODE == 3 */
            textNodes.push(node.textContent.trim())
        }
    })
    console.log(textNodes.join("\n")) /* print a newline between text nodes */

Use Array.find() to get only the first text node

    var firstNode = nodes.find(node => node.nodeType == Node.TEXT_NODE)
    console.log(firstNode.textContent.trim())

Use Array.findLast() to get only the last text node

    var lastNode = nodes.findLast(node => node.nodeType == Node.TEXT_NODE)
    console.log("lastNode: " + lastNode.textContent.trim())

Use Array.filter() to get all text nodes

var textNodes = nodes.filter(node => node.nodeType == Node.TEXT_NODE) 

And then use Array.reduce() to combine the text nodes

    var text = textNodes.reduce(
        (acc, textNode) => acc + textNode.textContent.trim(),
        "") /* start accumulator with an empty string */
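One detail worth noting about the reduce() step: because each node is trimmed before concatenation, the pieces end up joined with no separator. The sketch below contrasts that with a join-based variant. joinDirectText is a hypothetical helper name, and div is a plain-object stand-in for the #myelement div so the code runs outside a browser.

```javascript
const TEXT_NODE = 3; // same value as Node.TEXT_NODE in the DOM

// Hypothetical helper: joins the trimmed direct text nodes with a separator.
function joinDirectText(element, separator = " ") {
  return Array.from(element.childNodes)
    .filter((node) => node.nodeType === TEXT_NODE)
    .map((node) => node.textContent.trim())
    .filter((part) => part.length > 0) // skip whitespace-only nodes
    .join(separator);
}

// Stand-in for <div id="myelement"> (assumption: not a real DOM node).
const div = {
  childNodes: [
    { nodeType: TEXT_NODE, textContent: " First Text " },
    { nodeType: 1, textContent: "Extra Text" },
    { nodeType: TEXT_NODE, textContent: " Last Text " },
  ],
};

// The reduce() form from the step above, for comparison.
const concatenated = div.childNodes
  .filter((node) => node.nodeType === TEXT_NODE)
  .reduce((acc, node) => acc + node.textContent.trim(), "");

console.log(concatenated);        // First TextLast Text
console.log(joinDirectText(div)); // First Text Last Text
```

Which form is preferable depends on whether the surrounding whitespace in the markup is meaningful for your use case.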

Working snippet

    /* find your HTMLElement */
    var locator = "div#myelement" /* css selector to find element by ID */
    var element = document.querySelector(locator)
    /* or use shorthand for querySelector */
    // element = $(locator)
    console.log("element: ", element)
    console.log("innerText: ", element.innerText) /* this includes the extra text we don't want */

    /* get children as NodeList, and convert to Array so we can iterate over it */
    var nodes = Array.from(element.childNodes)
    /* or use the "spread" operator */
    nodes = [...element.childNodes]

    /* print each text node by checking nodeType */
    for (const node of nodes) {
        if (node.nodeType == 3) {
            /* trim() removes extra whitespace including tabs and newlines */
            console.log("each text node: ", node.textContent.trim())
        }
    }

    /* you can stop here if you want, since you can have everything you want */
    /* but let's look at using "functional" patterns to do the same thing */

    /* accumulate content of text nodes with a "closure" (fancy word for variable in outer scope) */
    var textNodes = []
    nodes.forEach(node => {
        if (node.nodeType == Node.TEXT_NODE) { /* TEXT_NODE == 3 */
            textNodes.push(node.textContent.trim())
        }
    })
    console.log("combined textNodes:", textNodes.join("\n")) /* print a newline between text nodes */

    /* use Array.find() to get the first text node */
    var firstNode = nodes.find(node => node.nodeType == Node.TEXT_NODE)
    console.log("firstNode: ", firstNode.textContent.trim())

    /* use Array.findLast() to get the last text node */
    var lastNode = nodes.findLast(node => node.nodeType == Node.TEXT_NODE)
    console.log("lastNode: ", lastNode.textContent.trim())

    /* use Array.filter() to get all text nodes */
    var textNodes = nodes.filter(node => node.nodeType == Node.TEXT_NODE)

    /* use Array.reduce() to combine the text nodes */
    var combinedText = textNodes.reduce(
        (acc, textNode) => acc + textNode.textContent.trim(),
        "") /* start with an empty string */
    console.log("combined text: ", combinedText)

    <html>
      <body>
        <div id="myelement">
          First Text
          <span id="nestedelement">Extra Text</span>
          Last Text
        </div>
      </body>
    </html>

console.log("combined text: ", text) 

