
I've got a problem parsing an XML file (N.B. a well-formed one).

Consider an XML file like this:

    <?xml version="1.0" encoding="utf-8" ?>
    <root>
      <list>
        <item no="1">
          <title>Item's 1 title</title>
          <content>Some long content with <special>tags</special> inside</content>
        </item>
        <item no="2">
          <title>Item's 2 title</title>
          <content>Some long content with <special>tags</special> inside</content>
        </item>
      </list>
    </root>

I need to get the contents of each item in the list and put them in an array. Generally that's not a problem, but in this case I can't get my head round it.

The problem lies in the contents of <content>. It is a string with tags in between, and I can't find a way to extract it intact. SimpleXML returns/echoes just the string, with the <special> tags and everything inside them stripped out. Like this:

Some long content with inside. 

Ideally, I'd like to get a string like this:

Some long content with <special>tags</special> inside 

How do I get it?

  • possible duplicate of PHP SimpleXML get innerXML. Commented Jun 21, 2011 at 15:48
  • I don't think you're supposed to mix text nodes with other nodes. Ideally your XML should look like <content><![CDATA[Some long content with <special>tags</special> inside]]></content>, which instructs the parser not to parse the content within the CDATA section (and to return it as is). Commented Jun 21, 2011 at 15:57
  • @mkilmanas Well, that's what an application's API returns, so I have no choice there. Commented Jun 21, 2011 at 15:59
  • @Gordon You might be right. Thanks for the link, will investigate. Commented Jun 21, 2011 at 15:59
  • Well, the accepted solution suggests using a 3rd-party library. Personally, I'm not too fond of those non-native solutions, but that's just me. Anyway, if you want to investigate some more, you now know the term: innerXML. Commented Jun 21, 2011 at 16:28

3 Answers


You could use DOMDocument which is built into PHP.

    <?php
    $xml = <<<END
    <?xml version="1.0" encoding="utf-8" ?>
    <root>
      <list>
        <item no="1">
          <title>Item's 1 title</title>
          <content>Some long content with <special>tags</special> inside</content>
        </item>
        <item no="2">
          <title>Item's 2 title</title>
          <content>Some long content with <special>tags</special> inside</content>
        </item>
      </list>
    </root>
    END;

    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->loadXML($xml);

    $nodes = $doc->getElementsByTagName('content');
    foreach ( $nodes as $node ) {
        // Copy the children of <content> into a scratch document and serialize it.
        $temp_doc = new DOMDocument('1.0', 'UTF-8');
        foreach ( $node->childNodes as $child ) {
            $temp_doc->appendChild($temp_doc->importNode($child, true));
        }
        echo $temp_doc->saveHTML();
        // Outputs: Some long content with <special>tags</special> inside
    }

To select only the top-level "content" elements (in case there are "content" elements nested inside them), you can use DOMXPath.

    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->loadXML($xml); // $xml from the example above

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('/root/list/item/content');
    foreach ( $nodes as $node ) {
        $temp_doc = new DOMDocument('1.0', 'UTF-8');
        foreach ( $node->childNodes as $child ) {
            $temp_doc->appendChild($temp_doc->importNode($child, true));
        }
        echo $temp_doc->saveHTML();
        // Outputs: Some long content with <special>tags</special> inside
    }

2 Comments

Nice. What if the text node contains a 'content' tag?
@ajreal - You could use DOMXPath to fetch only the top level "content" tag. I'll update my example.

SimpleXML just doesn't support mixed content (text nodes with element nodes as siblings). I suggest you use XMLReader instead.
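
As a rough sketch of what that could look like (not tested against the real API response): XMLReader has a readInnerXml() method that returns the children of the current element with their markup intact. The sample document is the one from the question; the variable names are only illustrative. Note that this matches every element named content, including any nested inside another content element.

```php
<?php
// Sample document from the question.
$xml = <<<XML
<?xml version="1.0" encoding="utf-8" ?>
<root>
  <list>
    <item no="1">
      <title>Item's 1 title</title>
      <content>Some long content with <special>tags</special> inside</content>
    </item>
    <item no="2">
      <title>Item's 2 title</title>
      <content>Some long content with <special>tags</special> inside</content>
    </item>
  </list>
</root>
XML;

$reader = new XMLReader();
$reader->XML($xml);

$contents = [];
while ($reader->read()) {
    // React only to opening <content> tags, not to text or closing tags.
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'content') {
        // readInnerXml() serializes the element's children, markup included.
        $contents[] = $reader->readInnerXml();
    }
}
$reader->close();

print_r($contents);
```

Each entry of $contents should then hold the text of one item together with its <special> tags.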


You could use SimpleXML's asXML() method. It returns the node it is called on as an XML string:

    $xml = simplexml_load_file($file);
    foreach ($xml->list->item as $item) {
        // The element is named "content" in the sample XML.
        $content = $item->content->asXML();
        echo $content."\n";
    }

This will print:

    <content>Some long content with <special>tags</special> inside</content>
    <content>Some long content with <special>tags</special> inside</content>

It's a little ugly, but you could then clip off the <content> and </content> tags (9 and 10 characters long) with substr:

$content = substr($content,9,-10); 
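
The substr offsets only hold while the opening tag is exactly <content> with no attributes. If that can't be relied on, one alternative sketch is to keep loading with SimpleXML but serialize the child nodes through dom_import_simplexml(). The helper name innerXml() is made up for this example, and the sample XML is the one from the question.

```php
<?php
// Hypothetical helper: returns the serialized children of a SimpleXML element.
function innerXml(SimpleXMLElement $element)
{
    // Get the DOM node that backs this SimpleXML element.
    $node = dom_import_simplexml($element);
    $inner = '';
    foreach ($node->childNodes as $child) {
        // saveXML() with a node argument serializes just that node.
        $inner .= $node->ownerDocument->saveXML($child);
    }
    return $inner;
}

$xml = simplexml_load_string(<<<XML
<?xml version="1.0" encoding="utf-8" ?>
<root>
  <list>
    <item no="1">
      <title>Item's 1 title</title>
      <content>Some long content with <special>tags</special> inside</content>
    </item>
    <item no="2">
      <title>Item's 2 title</title>
      <content>Some long content with <special>tags</special> inside</content>
    </item>
  </list>
</root>
XML
);

$contents = [];
foreach ($xml->list->item as $item) {
    $contents[] = innerXml($item->content);
}

print_r($contents);
```

This avoids hard-coding the tag length, at the cost of pulling in the DOM extension alongside SimpleXML.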
