$\begingroup$

Consider an XML document containing these elements:

...
<example>An example</example>
<project>A project</project>
<projectName>A project name</projectName>
<projectDate>A project date</projectDate>
...

To pick out one of them, this code suffices:

Cases[dataXML, XMLElement["project", __, __], Infinity] 

But what if I need all elements whose name starts with "project"?

None of these works:

Cases[dataXML, XMLElement["project" ~~ _, __, __], Infinity]
Cases[dataXML, XMLElement["project" ~~ __, __, __], Infinity]
Cases[dataXML, XMLElement["project" ~~ ___, __, __], Infinity]

and the same goes for regular expressions.

An obvious, if somewhat hacky, workaround is to convert the expression to a string and search that instead:

data = ToString @ dataXML;
ptr = Shortest @ RegularExpression["XMLElement\\[project[^\\]]*\\]"];
StringCases[data, ptr]

Nevertheless, I would like to understand why the earlier attempts fail and whether the failure teaches a broader lesson.

$\endgroup$

2 Answers

$\begingroup$

You should use Condition with StringMatchQ:

Cases[dataXML, XMLElement[tag_String /; StringMatchQ[tag, "project*"], __], Infinity] 

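because Cases doesn't support string patterns. Inside Cases, "project" ~~ __ is just an inert StringExpression that is compared structurally with the tag, so it never matches an actual string such as "projectName".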
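As a minimal, self-contained sketch of the approach above: the sample string `xml` and the names `byCondition` and `byTest` are illustrative assumptions, not part of the original question, and the `PatternTest` variant assumes a version that has `StringStartsQ` (10.1 or later).

```mathematica
(* Sample data mirroring the elements listed in the question (assumed) *)
xml = "<test><example>An example</example><project>A project</project><projectName>A project name</projectName><projectDate>A project date</projectDate></test>";
dataXML = ImportString[xml, "XML"];

(* Condition form: bind the tag, then test it with a string function *)
byCondition = Cases[dataXML,
   XMLElement[tag_String /; StringMatchQ[tag, "project" ~~ ___], __],
   Infinity];

(* PatternTest form: same idea, using the operator form of StringStartsQ *)
byTest = Cases[dataXML,
   XMLElement[_String?(StringStartsQ["project"]), __],
   Infinity];

(* Tag names of the matched elements, in document order *)
byCondition[[All, 1]]
(* {"project", "projectName", "projectDate"} *)
```

Both forms hand the tag to a string function, so the expression matcher itself never has to interpret the string pattern.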

As to why it is designed in such a way, I would cite Leonid Shifrin:

I would say that the reason is dead simple <…>. Cases and DeleteCases work on parsed expressions, while string functions work on strings. These are just so different that mixing them together would be a very wrong design decision IMO.

A more detailed discussion can be found in this answer by WReach and in the comments under the answer by R. M.
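The distinction between the two matchers can be checked directly. The following is a small sketch; the names `tag` and `patt` are chosen here for illustration only.

```mathematica
tag = "projectName";
patt = "project" ~~ __;

(* To the expression matcher, the string pattern is an ordinary expression *)
Head[patt]
(* StringExpression *)

(* Structural matching: an atomic String is not a StringExpression[...] *)
MatchQ[tag, patt]
(* False *)

(* String matching: StringMatchQ interprets the same object as a string pattern *)
StringMatchQ[tag, patt]
(* True *)

(* The same holds for regular expressions *)
MatchQ[tag, RegularExpression["project.*"]]
(* False *)
StringMatchQ[tag, RegularExpression["project.*"]]
(* True *)
```

This is why the attempts with `Cases` return an empty list rather than raising an error: the pattern is valid, it just describes an expression that never occurs in the XML.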

$\endgroup$
  • We posted at the same moment: great! Thanks for the reply! Commented May 22, 2016 at 16:51
$\begingroup$

This is not really an answer, since the main point remains unexplained, but it is a useful workaround.

Suppose the XML data are:

<test>
  <example>An example</example>
  <project>A project</project>
  <projectName>A project name</projectName>
  <projectDate>A project date</projectDate>
</test>

This code:

tagsList = Import[fileIn, {"XML", "Tags"}]
requestedTags = Select[tagsList, StringMatchQ[#, RegularExpression["project.*"]] &];
Cases[dataXML, XMLElement[#, __, __], Infinity] & /@ requestedTags

accomplishes the goal:

{{XMLElement["project", {}, {"A project"}]},
 {XMLElement["projectDate", {}, {"A project date"}]},
 {XMLElement["projectName", {}, {"A project name"}]}}
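The per-tag mapping above returns one sublist per tag. As a hedged variant, the selected tags can be folded into a single pattern with `Alternatives`, so one pass of `Cases` yields a flat list. In this sketch the symbolic XML is written out by hand and the tag list is collected from the expression itself, both as stand-ins for the imported file; `found` is an illustrative name.

```mathematica
(* Hand-written symbolic XML standing in for the imported document (assumed) *)
dataXML = XMLElement["test", {}, {
    XMLElement["example", {}, {"An example"}],
    XMLElement["project", {}, {"A project"}],
    XMLElement["projectName", {}, {"A project name"}],
    XMLElement["projectDate", {}, {"A project date"}]}];

(* Collect the distinct tag names from the expression, root included *)
tagsList = Union @ Cases[dataXML, XMLElement[t_String, __] :> t, {0, Infinity}];

(* Filtering happens on plain strings, where string patterns are valid *)
requestedTags = Select[tagsList, StringMatchQ[#, RegularExpression["project.*"]] &];

(* One structural pattern covering every selected tag *)
found = Cases[dataXML, XMLElement[Alternatives @@ requestedTags, __], Infinity];

found[[All, 1]]
(* {"project", "projectName", "projectDate"} *)
```

The result follows document order rather than the alphabetical order of the tag list, because `Cases` traverses the expression once instead of once per tag.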
$\endgroup$
