I have been struggling for a few days with a fairly complex XPath expression that I am not able to formulate. I have a syntax tree from a parser for a C++-like language, and I would like an XPath query that selects all names that are not part of a function name.
To be specific, I have an XML document like the one outlined below.
(The whole XML document is at the end of the question. It is quite large, so I paste only a simple overview of its structure here.) There are four element types:
a - this element contains one node
b - contains information of the node (e.g. "CALL_EXPRESSION")
c - contains actual text (e.g. "printf", variable names...)
d - contains the children of the current node (a elements)
CALL_EXPRESSION
    DOT_EXPRESSION
        NAME_EXPRESSION
            NAME
        NAME_EXPRESSION
            NAME
    PARAMS
        NAME_EXPRESSION
            NAME
CALL_EXPRESSION
    NAME_EXPRESSION
        NAME
    PARAMS
        NAME_EXPRESSION
            NAME
ASSIGNMENT_EXPRESSION
    NAME_EXPRESSION
        NAME
    NAME_EXPRESSION
        NAME
I would like to formulate an XPath query that selects all NAMEs that are not descendants of CALL_EXPRESSION/*[1]. (That is, I would like to select all variables but not the function names.)
To select all the function names I can use an XPath expression like this:
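To make the target concrete, here is a small Python sketch that computes the set I expect by walking the tree directly, with no XPath involved. It uses only the standard-library `xml.etree.ElementTree`. The `<root>` wrapper, the compacted copy of the sample, and the helper name `variable_names` are my own assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Compacted copy of the sample; <root> is added only so it parses as one document.
SAMPLE = """<root>
<a><b>CALL_EXPRESSION</b><c>object.method(a)</c><d>
  <a><b>DOT_EXPRESSION</b><c>object.method</c><d>
    <a><b>NAME_EXPRESSION</b><c>object</c><d><a><b>NAME</b><c>object</c><d/></a></d></a>
    <a><b>NAME_EXPRESSION</b><c>method</c><d><a><b>NAME</b><c>method</c><d/></a></d></a>
  </d></a>
  <a><b>PARAMS</b><c>(a)</c><d>
    <a><b>NAME_EXPRESSION</b><c>a</c><d><a><b>NAME</b><c>a</c><d/></a></d></a>
  </d></a>
</d></a>
<a><b>CALL_EXPRESSION</b><c>puts(b)</c><d>
  <a><b>NAME_EXPRESSION</b><c>puts</c><d><a><b>NAME</b><c>puts</c><d/></a></d></a>
  <a><b>PARAMS</b><c>(b)</c><d>
    <a><b>NAME_EXPRESSION</b><c>b</c><d><a><b>NAME</b><c>b</c><d/></a></d></a>
  </d></a>
</d></a>
<a><b>ASSIGNMENT_EXPRESSION</b><c>c=d;</c><d>
  <a><b>NAME_EXPRESSION</b><c>c</c><d><a><b>NAME</b><c>c</c><d/></a></d></a>
  <a><b>NAME_EXPRESSION</b><c>d</c><d><a><b>NAME</b><c>d</c><d/></a></d></a>
</d></a>
</root>"""


def variable_names(node, in_function_name=False):
    """Yield the <c> text of every NAME that is not inside a call's first child."""
    kind = node.findtext("b")
    if kind == "NAME" and not in_function_name:
        yield node.findtext("c")
    for index, child in enumerate(node.findall("d/a")):
        # The first child of a CALL_EXPRESSION is the function-name subtree.
        is_callee = kind == "CALL_EXPRESSION" and index == 0
        yield from variable_names(child, in_function_name or is_callee)


root = ET.fromstring(SAMPLE)
expected = [name for top in root.findall("a") for name in variable_names(top)]
print(expected)  # ['a', 'b', 'c', 'd']
```

So for the sample, the query I am looking for should return the NAMEs a, b, c and d, and skip object, method and puts.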
//a[b="CALL_EXPRESSION"]/d/a[1]
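As a cross-check outside my XPath tool, the same selection can be tried with Python's standard-library `xml.etree.ElementTree`, which supports this subset of XPath (though not the `ancestor` axis). This is only a sketch: the `node` and `name` helpers and the `<root>` wrapper are hypothetical, and the tree is a shortened version of the sample with only the two calls.

```python
import xml.etree.ElementTree as ET


def node(kind, text, *children):
    """Build one <a> element with its <b>, <c> and <d> children (hypothetical helper)."""
    a = ET.Element("a")
    ET.SubElement(a, "b").text = kind
    ET.SubElement(a, "c").text = text
    ET.SubElement(a, "d").extend(children)
    return a


def name(text):
    """Build a NAME_EXPRESSION wrapping a single NAME (hypothetical helper)."""
    return node("NAME_EXPRESSION", text, node("NAME", text))


root = ET.Element("root")
root.append(node(
    "CALL_EXPRESSION", "object.method(a)",
    node("DOT_EXPRESSION", "object.method", name("object"), name("method")),
    node("PARAMS", "(a)", name("a")),
))
root.append(node(
    "CALL_EXPRESSION", "puts(b)",
    name("puts"),
    node("PARAMS", "(b)", name("b")),
))

# ElementTree paths are relative, so the query needs a leading "." before "//".
callees = root.findall(".//a[b='CALL_EXPRESSION']/d/a[1]")
print([c.findtext("c") for c in callees])  # ['object.method', 'puts']
```

Each match is the root of a function-name subtree, which is exactly the set of nodes whose descendants I want to exclude.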
No problem here. Now, to select all nodes that are not descendants of these nodes, I would use not(ancestor::X).
But here is the problem: if I formulate the XPath expression like this:
//*[b="NAME"][not(ancestor::a[b="CALL_EXPRESSION"]/d/a[1])]
it selects only the nodes that have no ancestor a with a child b="CALL_EXPRESSION" at all. In our example, it selects only the NAMEs from the ASSIGNMENT_EXPRESSION subtree.
I suspected that the problem is that ancestor:: applies only to the first step (in our case a[b="CALL_EXPRESSION"]) and filters by its predicate, while the steps after the following / are discarded. So I modified the XPath query like this:
//*[b="NAME"][not(ancestor::a[../../b="CALL_EXPRESSION" and position()=1])]
This seems to work only for the simpler CALL_EXPRESSION (the one without the DOT_EXPRESSION). I suspected that the path inside [] might be relative only to the current node, not to the candidate ancestors. But when I used the query
//*[b="NAME"][not(ancestor::a[b="CALL_EXPRESSION"])]
it worked as one would expect (all NAMEs that don't have a CALL_EXPRESSION ancestor were selected).
Is there any way to formulate the query I need? And why don't the queries above work?
Thanks in advance :)
The XML
<a>
  <b>CALL_EXPRESSION</b>
  <c>object.method(a)</c>
  <d>
    <a>
      <b>DOT_EXPRESSION</b>
      <c>object.method</c>
      <d>
        <a>
          <b>NAME_EXPRESSION</b>
          <c>object</c>
          <d>
            <a>
              <b>NAME</b>
              <c>object</c>
              <d></d>
            </a>
          </d>
        </a>
        <a>
          <b>NAME_EXPRESSION</b>
          <c>method</c>
          <d>
            <a>
              <b>NAME</b>
              <c>method</c>
              <d></d>
            </a>
          </d>
        </a>
      </d>
    </a>
    <a>
      <b>PARAMS</b>
      <c>(a)</c>
      <d>
        <a>
          <b>NAME_EXPRESSION</b>
          <c>a</c>
          <d>
            <a>
              <b>NAME</b>
              <c>a</c>
              <d></d>
            </a>
          </d>
        </a>
      </d>
    </a>
  </d>
</a>
<a>
  <b>CALL_EXPRESSION</b>
  <c>puts(b)</c>
  <d>
    <a>
      <b>NAME_EXPRESSION</b>
      <c>puts</c>
      <d>
        <a>
          <b>NAME</b>
          <c>puts</c>
          <d></d>
        </a>
      </d>
    </a>
    <a>
      <b>PARAMS</b>
      <c>(b)</c>
      <d>
        <a>
          <b>NAME_EXPRESSION</b>
          <c>b</c>
          <d>
            <a>
              <b>NAME</b>
              <c>b</c>
              <d></d>
            </a>
          </d>
        </a>
      </d>
    </a>
  </d>
</a>
<a>
  <b>ASSIGNMENT_EXPRESSION</b>
  <c>c=d;</c>
  <d>
    <a>
      <b>NAME_EXPRESSION</b>
      <c>c</c>
      <d>
        <a>
          <b>NAME</b>
          <c>c</c>
          <d></d>
        </a>
      </d>
    </a>
    <a>
      <b>NAME_EXPRESSION</b>
      <c>d</c>
      <d>
        <a>
          <b>NAME</b>
          <c>d</c>
          <d></d>
        </a>
      </d>
    </a>
  </d>
</a>