Hello Everyone,

I'm trying to access data in an XML file:

    <OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xmlns:dc="http://dublincore.org/documents/dcmi-namespace/"
             xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
      <responseDate>2013-04-15T12:14:31Z</responseDate>
      <ListRecords>
        <record>
          <header>
            <identifier>
              a1b31ab2-9efe-11df-9922-efbb156aa6c1:01442b82-59a4-627e-800f-c63de74fc109
            </identifier>
            <datestamp>2012-08-16T14:42:52Z</datestamp>
          </header>
          <metadata>
            <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                       xmlns:dc="http://purl.org/dc/elements/1.1/"
                       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                       xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
              <dc:description>...</dc:description>
              <dc:date>1921</dc:date>
              <dc:identifier>K11510</dc:identifier>
              <dc:source>Waterschap Vallei & Eem</dc:source>
              <dc:source>...</dc:source>
              <dc:source>610</dc:source>
              <dc:coverage>Bunschoten</dc:coverage>
              <dc:coverage>Veendijk</dc:coverage>
              <dc:coverage>Spakenburg</dc:coverage>
            </oai_dc:dc>
          </metadata>
          <about>...</about>
        </record>

This is an example of the XML.

I need to access elements such as dc:date and dc:source.

Does anyone have any ideas?

Best regards, Tim

-- UPDATE --

I'm now trying this:

    foreach( $xml->ListRecords as $records )
    {
        foreach( $records AS $record )
        {
            $data = $record->children( 'http://www.openarchives.org/OAI/2.0/oai_dc/' );
            $rows = $data->children( 'http://purl.org/dc/elements/1.1/' );
            echo $rows->date;
            break;
        }
        break;
    }

4 Answers

3

You have nested elements that are in different XML namespaces. Specifically, two additional namespaces are involved:

    $nsUriOaiDc = 'http://www.openarchives.org/OAI/2.0/oai_dc/';
    $nsUriDc    = 'http://purl.org/dc/elements/1.1/';

The first one is for the <oai_dc:dc> element, which contains the elements of the second namespace: the <dc:*> elements such as <dc:description>. Those are the elements you're looking for.

Your code shows that you already have the right idea of how this works:

    $data = $record->children( 'http://www.openarchives.org/OAI/2.0/oai_dc/' );
    $rows = $data->children( 'http://purl.org/dc/elements/1.1/' );

However, there is one small mistake: the $data children are not children of $record but of $record->metadata.

You also do not need to nest two foreach loops inside each other. A working example:

    $nsUriOaiDc = 'http://www.openarchives.org/OAI/2.0/oai_dc/';
    $nsUriDc    = 'http://purl.org/dc/elements/1.1/';

    $records = $xml->ListRecords->record;

    foreach ($records as $record)
    {
        $data = $record->metadata->children($nsUriOaiDc);
        $rows = $data->children($nsUriDc);
        echo $rows->date;
        break;
    }

    /** output: 1921 **/

If you run into problems like these, you can use $record->asXML('php://output'); to print the element you have currently traversed to.
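The question also mentions dc:source, which occurs several times per record. Here is a minimal sketch along the same lines. It assumes a shortened copy of the sample document held in a string; the name $xmlString and the trimmed XML are made up for this illustration. It loops over the repeated elements and also keeps the record's markup in a string, since asXML() called without a filename returns the XML instead of writing it out.

```php
<?php
// Shortened copy of the sample response; $xmlString is a hypothetical name.
$xmlString = <<<'XML'
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:date>1921</dc:date>
          <dc:source>Waterschap Vallei &amp; Eem</dc:source>
          <dc:source>610</dc:source>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
XML;

$nsUriOaiDc = 'http://www.openarchives.org/OAI/2.0/oai_dc/';
$nsUriDc    = 'http://purl.org/dc/elements/1.1/';

$xml = simplexml_load_string($xmlString);

$sources = array();
$debug   = '';

foreach ($xml->ListRecords->record as $record) {
    // <oai_dc:dc> is a child of <metadata>, in the oai_dc namespace
    $data = $record->metadata->children($nsUriOaiDc);
    // the <dc:*> elements are children of <oai_dc:dc>, in the dc namespace
    $rows = $data->children($nsUriDc);

    // a repeated element can be iterated; cast each one to get its text
    foreach ($rows->source as $source) {
        $sources[] = (string) $source;
    }

    // asXML() without a filename returns the element's markup as a string
    $debug = $record->asXML();
}

print_r($sources);
```

Casting to string matters here: without it, the array would hold SimpleXMLElement objects rather than plain text.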


2 Comments

I had the same problem, thanks so much for posting this solution. Saved me a great deal of time! :)
My problem is that I have to extract the namespace URLs from attributes of the header, but this is not working. I tried an example from here, but that didn't help: php.net/manual/en/simplexmlelement.attributes.php
0

I think this is what you're looking for. Hope it helps ;)

3 Comments

Hey Julio, I tried that, but I think it doesn't work like that because it's a namespace inside a namespace.
@TimHanssen: No, that should not cause you any problems. You just need to do it again, once for each namespace.
So I tried using foreach( $xml->ListRecords as $records ) { foreach( $records AS $record ) { $data = $record->children( 'openarchives.org/OAI/2.0/oai_dc' ); $rows = $data->children( 'purl.org/dc/elements/1.1' ); echo $rows->date; break; } break; } and I got the error: Warning: main(): Node no longer exists
0

Use DOMDocument for this, for example to access dc:date:

    $STR = '
    <OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xmlns:dc="http://dublincore.org/documents/dcmi-namespace/"
             xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
      <responseDate>2013-04-15T12:14:31Z</responseDate>
      <ListRecords>
        <record>
          <header>
            <identifier>
              a1b31ab2-9efe-11df-9922-efbb156aa6c1:01442b82-59a4-627e-800f-c63de74fc109
            </identifier>
            <datestamp>2012-08-16T14:42:52Z</datestamp>
          </header>
          <metadata>
            <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                       xmlns:dc="http://purl.org/dc/elements/1.1/"
                       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                       xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
              <dc:description>...</dc:description>
              <dc:date>1921</dc:date>
              <dc:identifier>K11510</dc:identifier>
              <dc:source>Waterschap Vallei & Eem</dc:source>
              <dc:source>...</dc:source>
              <dc:source>610</dc:source>
              <dc:coverage>Bunschoten</dc:coverage>
              <dc:coverage>Veendijk</dc:coverage>
              <dc:coverage>Spakenburg</dc:coverage>
            </oai_dc:dc>
          </metadata>
          <about>...</about>
        </record>';

    // one variable name throughout: the snippets further down use $doc
    $doc = new DOMDocument;
    $STR = str_replace("&", "&amp;", $STR); // disguise &s going IN to loadXML()
    // $doc->substituteEntities = true;     // collapse &s going OUT to transformToXML()
    $doc->recover = TRUE;
    @$doc->loadHTML('<?xml encoding="UTF-8">' . $STR); // dirty fix
    foreach ($doc->childNodes as $item)
        if ($item->nodeType == XML_PI_NODE)
            $doc->removeChild($item); // remove hack
    $doc->encoding = 'UTF-8'; // insert proper

    print_r($doc->getElementsByTagName('dc')->item(0)->getElementsByTagName('date')->item(0)->textContent);

output:

 1921 

Or, to access dc:source:

    $source = $doc->getElementsByTagName('dc')->item(0)->getElementsByTagName('source');
    foreach ($source as $value) {
        echo $value->textContent . "\n";
    }

output:

    Waterschap Vallei & Eem
    ...
    610

Or collect everything into an array:

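    $array = array();
    // "*" matches every element below <oai_dc:dc>
    $source = $doc->getElementsByTagName('dc')->item(0)->getElementsByTagName("*");
    foreach ($source as $value) {
        // group the values by element name without the prefix
        $array[$value->localName][] = $value->textContent;
    }
    print_r($array);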

output:

    Array
    (
        [description] => Array
            (
                [0] => ...
            )

        [date] => Array
            (
                [0] => 1921
            )

        [identifier] => Array
            (
                [0] => K11510
            )

        [source] => Array
            (
                [0] => Waterschap Vallei & Eem
                [1] => ...
                [2] => 610
            )

        [coverage] => Array
            (
                [0] => Bunschoten
                [1] => Veendijk
                [2] => Spakenburg
            )

    )


0

Using XPath makes dealing with namespaces more straightforward:

    <?php

    // load the XML into a DOM document
    $doc = new DOMDocument;
    $doc->load('oai-response.xml'); // or use $doc->loadXML($xml) for an XML string

    // bind the DOM document to an XPath object
    $xpath = new DOMXPath($doc);

    // map all the XML namespaces to prefixes, for use in XPath queries
    $xpath->registerNamespace('oai', 'http://www.openarchives.org/OAI/2.0/');
    $xpath->registerNamespace('oai_dc', 'http://www.openarchives.org/OAI/2.0/oai_dc/');
    $xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');

    // identify each record using an XPath query
    // collect data as either strings or arrays of strings
    foreach ($xpath->query('oai:ListRecords/oai:record/oai:metadata/oai_dc:dc') as $item) {
        $data = array(
            'date' => $xpath->evaluate('string(dc:date)', $item), // $item is the context for this query
            'source' => array(),
        );

        foreach ($xpath->query('dc:source', $item) as $source) {
            $data['source'][] = $source->textContent;
        }

        print_r($data);
    }
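If you would rather stay with SimpleXML, its xpath() method can be used in much the same way. This is only a minimal sketch, assuming a shortened copy of the sample held in a string; the name $xmlString and the trimmed XML are made up for this illustration.

```php
<?php
// Shortened copy of the sample response; $xmlString is a hypothetical name.
$xmlString = <<<'XML'
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:date>1921</dc:date>
          <dc:source>Waterschap Vallei &amp; Eem</dc:source>
          <dc:source>610</dc:source>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
XML;

$xml = simplexml_load_string($xmlString);

// bind a prefix for use in the queries; it is matched by URI,
// so it does not have to be the prefix the document itself uses
$xml->registerXPathNamespace('dc', 'http://purl.org/dc/elements/1.1/');

// xpath() returns an array of SimpleXMLElement objects
$sources = array();
foreach ($xml->xpath('//dc:source') as $source) {
    $sources[] = (string) $source;
}

$dates = $xml->xpath('//dc:date');
$date  = (string) $dates[0];

print_r(array('date' => $date, 'source' => $sources));
```

Note that `//dc:source` collects matches from every record in the document at once; to keep values grouped per record, query relative to each record as in the DOMXPath example.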
