
I'm using SimpleXML to get pieces of data from an XML web-service response. We need to create database records with the pieces. Here's my issue: this XML is structured (in my mind, anyway) very strangely, and I'm not sure how to get all the pieces that should comprise a single record together.

This is data returned by the National Weather Service's forecast web service. We are passing in multiple latitude/longitude pairs, a start date, and an end date, and asking it to return three pieces of data: wind speed, wind direction, and wave height. What it sends back is two separate location elements for each lat/long pair - one for the wind information, one for the water (wave) information. Then, outside of the location elements, there are parameters elements that are listed as "applicable" to a specific location. Here's a sample (sorry it's so long; it's kind of necessary to show how the file is structured):

    <data>
      <location>
        <location-key>point1</location-key>
        <point latitude="38.99" longitude="-77.02"/>
      </location>
      <location>
        <location-key>point2</location-key>
        <point latitude="39.70" longitude="-104.80"/>
      </location>
      <location>
        <location-key>point3</location-key>
        <point latitude="47.60" longitude="-122.30"/>
      </location>
      <time-layout time-coordinate="local" summarization="none">
        <layout-key>k-p3h-n34-1</layout-key>
        <start-valid-time>2011-02-22T16:00:00-05:00</start-valid-time>
        <start-valid-time>2011-02-22T19:00:00-05:00</start-valid-time>
      </time-layout>
      <time-layout time-coordinate="local" summarization="none">
        <layout-key>k-p6h-n17-2</layout-key>
        <start-valid-time>2011-02-22T19:00:00-05:00</start-valid-time>
        <start-valid-time>2011-02-23T01:00:00-05:00</start-valid-time>
      </time-layout>
      <time-layout time-coordinate="local" summarization="none">
        <layout-key>k-p3h-n34-3</layout-key>
        <start-valid-time>2011-02-22T14:00:00-07:00</start-valid-time>
        <start-valid-time>2011-02-22T17:00:00-07:00</start-valid-time>
      </time-layout>
      <time-layout time-coordinate="local" summarization="none">
        <layout-key>k-p6h-n17-4</layout-key>
        <start-valid-time>2011-02-22T17:00:00-07:00</start-valid-time>
        <start-valid-time>2011-02-22T23:00:00-07:00</start-valid-time>
      </time-layout>
      <time-layout time-coordinate="local" summarization="none">
        <layout-key>k-p3h-n34-5</layout-key>
        <start-valid-time>2011-02-22T13:00:00-08:00</start-valid-time>
        <start-valid-time>2011-02-22T16:00:00-08:00</start-valid-time>
      </time-layout>
      <time-layout time-coordinate="local" summarization="none">
        <layout-key>k-p6h-n17-6</layout-key>
        <start-valid-time>2011-02-22T16:00:00-08:00</start-valid-time>
        <start-valid-time>2011-02-22T22:00:00-08:00</start-valid-time>
      </time-layout>
      <parameters applicable-location="point1">
        <wind-speed type="sustained" units="knots" time-layout="k-p3h-n34-1">
          <name>Wind Speed</name>
          <value>5</value>
          <value>5</value>
        </wind-speed>
        <direction type="wind" units="degrees true" time-layout="k-p3h-n34-1">
          <name>Wind Direction</name>
          <value>340</value>
          <value>350</value>
        </direction>
        <water-state time-layout="k-p6h-n17-2">
          <waves type="significant" units="feet">
            <name>Wave Height</name>
            <value xsi:nil="true"/>
            <value xsi:nil="true"/>
          </waves>
        </water-state>
      </parameters>
      <parameters applicable-location="point2">
        <wind-speed type="sustained" units="knots" time-layout="k-p3h-n34-3">
          <name>Wind Speed</name>
          <value>4</value>
          <value>2</value>
        </wind-speed>
        <direction type="wind" units="degrees true" time-layout="k-p3h-n34-3">
          <name>Wind Direction</name>
          <value>180</value>
          <value>200</value>
        </direction>
        <water-state time-layout="k-p6h-n17-4">
          <waves type="significant" units="feet">
            <name>Wave Height</name>
            <value xsi:nil="true"/>
          </waves>
        </water-state>
      </parameters>
      <parameters applicable-location="point3">
        <wind-speed type="sustained" units="knots" time-layout="k-p3h-n34-5">
          <name>Wind Speed</name>
          <value>7</value>
          <value>8</value>
        </wind-speed>
        <direction type="wind" units="degrees true" time-layout="k-p3h-n34-5">
          <name>Wind Direction</name>
          <value>290</value>
        </direction>
        <water-state time-layout="k-p6h-n17-6">
          <waves type="significant" units="feet">
            <name>Wave Height</name>
            <value xsi:nil="true"/>
          </waves>
        </water-state>
      </parameters>
    </data>

What I need to end up with is three SQL insert statements for each location that look something like this:

    INSERT into gl_weather_data (weather_lat, weather_long, weather_time, weather_type, weather_value, weather_unit)
        values ("46.72", "-91.82", "2011-02-22T12:00:00-06:00", "Wind Speed", "9", "knots")

    INSERT into gl_weather_data (weather_lat, weather_long, weather_time, weather_type, weather_value, weather_unit)
        values ("46.72", "-91.82", "2011-02-22T12:00:00-06:00", "Wind Direction", "120", "degrees true")

    INSERT into gl_weather_data (weather_lat, weather_long, weather_time, weather_type, weather_value, weather_unit)
        values ("46.72", "-91.82", "2011-02-22T12:00:00-06:00", "Wave Height", "2", "feet")

The data for the lat/long/time would come from these elements and are common to all three inserts:

    <location>
      <location-key>point1</location-key>
      <point latitude="46.72" longitude="-91.82" />
    </location>
    <time-layout time-coordinate="local" summarization="none">
      <layout-key>k-p3h-n34-1</layout-key>
      <!-- note there can be more than one start-valid-time elements; we only want the first one -->
      <start-valid-time>2011-02-22T16:00:00-05:00</start-valid-time>
      <start-valid-time>2011-02-22T19:00:00-05:00</start-valid-time>
    </time-layout>

The data for type, value, and unit come from different elements depending on what type of information we're getting:

    <parameters applicable-location="point1">
      <!-- wind speed; need to get "knots" out of the wind-speed element's unit param,
           "Wind Speed" out of the name element, and "5" out of the value element
           (note there can be more than one value element; we only want the first) -->
      <wind-speed type="sustained" units="knots" time-layout="k-p3h-n34-1">
        <name>Wind Speed</name>
        <value>5</value>
        <value>5</value>
      </wind-speed>
      <!-- wind direction; need to get "degrees true" out of the direction element's unit param,
           "Wind Direction" out of the name element, and "340" out of the value element
           (note there can be more than one value element; we only want the first) -->
      <direction type="wind" units="degrees true" time-layout="k-p3h-n34-1">
        <name>Wind Direction</name>
        <value>340</value>
        <value>350</value>
      </direction>
      <!-- wave height; need to get "feet" out of the waves element's unit param,
           "Wave Height" out of the name element, and "13" out of the value element
           (note there can be more than one value element; we only want the first) -->
      <water-state time-layout="k-p6h-n17-2">
        <waves type="significant" units="feet">
          <name>Wave Height</name>
          <value>13</value>
          <value xsi:nil="true"/>
        </waves>
      </water-state>

My main confusion is how to associate the location elements with their corresponding parameters elements; if I had written the schema I probably would have made parameters a child of location, but that's not how it's coming to us, and obviously we can't easily change it.

I'm guessing probably I need to do a for-each on location elements, get the location key of the first one, then somehow use that location key to select the correct parameters element by matching it to the parameters applicable-location, but I have no idea how to do this with SimpleXML. Can anyone help me out here?

EDITED TO ADD WORKING/NON-WORKING CODE

This works - it's nowhere near to what I need to do, but at least I get a result:

    $dwml = simplexml_load_string($result);

    foreach ($dwml->data->parameters as $r) {
        $locName = $r->direction['type'];
        echo "Name: $locName<br />";
    }

    foreach ($dwml->data->location as $r) {
        echo "location key: " . $r->{'location-key'} . "<br />";
    }

This doesn't work:

    $data = simplexml_load_string($result);
    $all_locations = $data->xpath('location');
    foreach( $all_locations as $location ) {
        list($location_key) = $location->xpath('location-key[1]');
        $params = $data->xpath("parameters[@applicable-location='{$location_key}']/*");
        foreach( $params as $param ) {
            list($time) = $data->xpath("time-layout[layout-key='{$param['time-layout']}']/start-valid-time[1]");
            if( $param->getName() == 'water-state' ) {
                $param = $param->waves;
            }
            $sql = "INSERT into gl_weather_data values ('{$location->point['latitude']}', '{$location->point['longitude']}', '{$time}', '{$param->name}', '{$param->value[0]}', '{$param['units']}')";
            echo "{$sql}\n\n";
        }
    }

EDITED AGAIN

OK, I think I got it - here's what actually works for me (dwml is actually the root element, not data):

    $dwml = simplexml_load_string($result);
    $all_locations = $dwml->data->xpath('location');
    foreach( $all_locations as $location ) {
        list($location_key) = $location->xpath('location-key[1]');
        $params = $dwml->data->xpath("parameters[@applicable-location='{$location_key}']/*");
        foreach( $params as $param ) {
            list($time) = $dwml->data->xpath("time-layout[layout-key='{$param['time-layout']}']/start-valid-time[1]");
            if( $param->getName() == 'water-state' ) {
                $param = $param->waves;
            }
            $sql = "INSERT into gl_weather_data values ('{$location->point['latitude']}', '{$location->point['longitude']}', '{$time}', '{$param->name}', '{$param->value[0]}', '{$param['units']}')";
            echo "{$sql}\n\n";
        }
    }
  • That is not pretty XML. You have my sympathies. Commented Feb 23, 2011 at 1:28
  • Yeah, a co-worker commented that it's pretty much what you can expect from a free service created by a government agency! Commented Feb 23, 2011 at 15:56
  • Actually, it's a fully normalized data structure. This is generally considered a good thing--in xml design as well as database design. Normalization exposes the relationships among the various entities in a way that makes arbitrary queries and/or transformations on those relationships simple and efficient. You just need to use the right tools. For XML in PHP, the right tool is XPath. Commented Feb 23, 2011 at 17:24

1 Answer

I would suggest that you use SimpleXML's XPath functionality.

You'll end up with something like this:

    $data = new SimpleXMLElement($string);

    // Walk every <location>, then pull in the <parameters> block that names it.
    $all_locations = $data->xpath('location');
    foreach( $all_locations as $location ) {
        list($location_key) = $location->xpath('location-key[1]');
        $params = $data->xpath("parameters[@applicable-location='{$location_key}']/*");

        foreach( $params as $param ) {
            // Each parameter names its time-layout by key; take the first start time.
            list($time) = $data->xpath("time-layout[layout-key='{$param['time-layout']}']/start-valid-time[1]");

            // Wave data sits one level deeper, inside <water-state>.
            if( $param->getName() == 'water-state' ) {
                $param = $param->waves;
            }

            $sql = "INSERT into gl_weather_data values ('{$location->point['latitude']}', '{$location->point['longitude']}', '{$time}', '{$param->name}', '{$param->value[0]}', '{$param['units']}')";
            echo "{$sql}\n\n";
        }
    }
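
If you want to try the idea without calling the web service, here is a minimal self-contained sketch. It assumes `<data>` is the root element, uses a trimmed copy of the sample from the question, and declares the `xsi` namespace on the root so that `xsi:nil` parses cleanly. The function name `collect_weather_rows` and the array keys are made up for illustration. It returns plain arrays rather than SQL strings, on the assumption that you would rather bind the values to a prepared statement than interpolate them.

```php
<?php
// Trimmed sample; <data> is assumed to be the root element here.
$xml = <<<'XML'
<data xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <location>
    <location-key>point1</location-key>
    <point latitude="38.99" longitude="-77.02"/>
  </location>
  <time-layout time-coordinate="local" summarization="none">
    <layout-key>k-p3h-n34-1</layout-key>
    <start-valid-time>2011-02-22T16:00:00-05:00</start-valid-time>
    <start-valid-time>2011-02-22T19:00:00-05:00</start-valid-time>
  </time-layout>
  <time-layout time-coordinate="local" summarization="none">
    <layout-key>k-p6h-n17-2</layout-key>
    <start-valid-time>2011-02-22T19:00:00-05:00</start-valid-time>
  </time-layout>
  <parameters applicable-location="point1">
    <wind-speed type="sustained" units="knots" time-layout="k-p3h-n34-1">
      <name>Wind Speed</name>
      <value>5</value>
      <value>5</value>
    </wind-speed>
    <water-state time-layout="k-p6h-n17-2">
      <waves type="significant" units="feet">
        <name>Wave Height</name>
        <value xsi:nil="true"/>
      </waves>
    </water-state>
  </parameters>
</data>
XML;

// Hypothetical helper: one row per parameter, first start time and first value only.
function collect_weather_rows(SimpleXMLElement $data): array
{
    $rows = [];
    foreach ($data->xpath('location') as $location) {
        $key = (string) $location->{'location-key'};
        foreach ($data->xpath("parameters[@applicable-location='{$key}']/*") as $param) {
            // time-layout lives on the outer element, so read it before descending.
            $layout = (string) $param['time-layout'];
            $times  = $data->xpath("time-layout[layout-key='{$layout}']/start-valid-time[1]");
            if ($param->getName() === 'water-state') {
                $param = $param->waves;
            }
            $rows[] = [
                'lat'   => (string) $location->point['latitude'],
                'long'  => (string) $location->point['longitude'],
                'time'  => isset($times[0]) ? (string) $times[0] : null,
                'type'  => (string) $param->name,
                'value' => (string) $param->value[0], // empty string for a nil value
                'unit'  => (string) $param['units'],
            ];
        }
    }
    return $rows;
}

$rows = collect_weather_rows(new SimpleXMLElement($xml));
print_r($rows);
```

Casting every piece to `string` matters here: SimpleXML hands back objects, and they only turn into text when cast or interpolated.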

edit

The example above queries for all the <location> elements, then finds the <parameters> section that goes with each location. It occurs to me that you might prefer to approach it the other way -- find all the <parameters> elements, then look up the related <location>.

Using XPath, this change is fairly simple:

    $data = new SimpleXMLElement($string);

    // Walk every <parameters> block, then look up the <location> it applies to.
    $all_params = $data->xpath("parameters");
    foreach( $all_params as $paramblock ) {
        list($location) = $data->xpath("location[location-key='{$paramblock['applicable-location']}']");

        foreach( $paramblock->children() as $item ) {
            // Each item names its time-layout by key; take the first start time.
            list($time) = $data->xpath("time-layout[layout-key='{$item['time-layout']}']/start-valid-time[1]");

            // Wave data sits one level deeper, inside <water-state>.
            if( $item->getName() == 'water-state' ) {
                $item = $item->waves;
            }

            $sql = "INSERT into gl_weather_data values ('{$location->point['latitude']}', '{$location->point['longitude']}', '{$time}', '{$item->name}', '{$item->value[0]}', '{$item['units']}')";
            echo "{$sql}\n\n";
        }
    }
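
One thing to watch for in either version: the wave-height values in your sample are `<value xsi:nil="true"/>`, so `value[0]` comes back as an empty string. If you would rather store NULL (or skip the row), a check along these lines might help. This is only a sketch; `is_nil` is a hypothetical helper name, and it assumes the `xsi` prefix is bound to the standard XML Schema instance namespace in the real response.

```php
<?php
// Hypothetical helper: true when the element carries xsi:nil="true".
// The namespace URI is passed explicitly so the check does not depend on the prefix.
function is_nil(
    SimpleXMLElement $el,
    string $ns = 'http://www.w3.org/2001/XMLSchema-instance'
): bool {
    return (string) $el->attributes($ns)->nil === 'true';
}

$waves = new SimpleXMLElement(
    '<waves xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" type="significant" units="feet">'
    . '<name>Wave Height</name><value xsi:nil="true"/><value>13</value></waves>'
);

// First value only, as in the question; null stands in for a nil reading.
$first  = $waves->value[0];
$height = is_nil($first) ? null : (string) $first;

var_dump($height);
```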

6 Comments

Thanks, that looks like what I need to do. However, for some reason whenever I use new SimpleXMLElement($result) (as opposed to simplexml_load_string($result)) I end up with a blank page - no visible errors, but nothing displaying in the view-source or actual page.
The function simplexml_load_string() and the SimpleXMLElement constructor both return a SimpleXMLElement object instance, so you can use either approach; the end result is the same. Not sure why it's failing for you though. You may want to be sure that display_startup_errors is enabled on your development server.
OK, so it's not just SimpleXMLElement that's causing the problem. I'm adding code in the OP to show what works so far and what doesn't; I'm very confused.
As you apparently discovered, the structure of your XML is important -- You showed <data> as the root of your xml document, but if <data> is not the root node, then you'll need to traverse down to the <data> node before the above code will work.
Also, I've added a second example (still using xpath) that approaches the problem from a slightly different angle -- not sure which one will be better suited for your needs, so I figured I'd share both.
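
To illustrate the point about the root node raised in the comments above: a small helper can return the `<data>` element whether it is the document root or a direct child of it, and the relative XPath queries can then run from whatever it returns. This is a sketch only; `find_data_node` is a hypothetical name, and it assumes `<data>` sits at most one level below the root, as with the `dwml` root mentioned in the question.

```php
<?php
// Hypothetical helper: return the <data> element whether it is the root or a child of it.
function find_data_node(SimpleXMLElement $root): ?SimpleXMLElement
{
    if ($root->getName() === 'data') {
        return $root;
    }
    // isset() is false when the root has no <data> child.
    return isset($root->data) ? $root->data : null;
}

$wrapped = new SimpleXMLElement(
    '<dwml><head/><data><location><location-key>point1</location-key></location></data></dwml>'
);
$bare = new SimpleXMLElement(
    '<data><location><location-key>point2</location-key></location></data>'
);

// Relative XPath expressions now work from the returned node in both cases.
$fromWrapped = find_data_node($wrapped)->xpath('location/location-key');
$fromBare    = find_data_node($bare)->xpath('location/location-key');

echo $fromWrapped[0], "\n"; // point1
echo $fromBare[0], "\n";    // point2
```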