7

I like the options offered by the _ast module; it's really powerful. Is there a way of getting the full AST from it?

For example, if I get the AST of the following code:

import os
os.listdir(".")

by using:

ast = compile(source_string,"<string>","exec",_ast.PyCF_ONLY_AST) 

The body of the ast object will have two elements: an Import object and an Expr object. However, I'd like to go further and obtain the AST of import and listdir; in other words, I'd like to make _ast descend to the lowest level possible.

I think it's logical that this sort of thing should be possible. The question is how?

EDIT: By the lowest level possible, I didn't mean accessing what's "visible". I'd like to get the AST for the implementation of listdir as well, such as stat and other function calls that may be executed for it.

3
  • Keep in mind that unless your Python code follows a couple of conventions, you have to actually execute the code in order to find out which module/function is used at a certain point in your code. Commented Jul 14, 2009 at 22:04
  • So how would I go about doing this? Commented Jul 14, 2009 at 22:30
  • You can't, in general. You could have something like: import os; import random; if random.random() > .5: os.listdir = lambda *args: None; os.listdir("."); and then it's going to be a bit tricky to figure out which code is executed. But even method calls are tricky, since you have to rebuild the class hierarchy and the mro statically. Commented Jul 14, 2009 at 23:15
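The late-binding point in the last comment can be illustrated with a small sketch. It assumes a current Python 3 and the standard ast module (the thread itself is about 2.5/2.6), and the variable names are only illustrative. The AST of the call is identical whichever function os.listdir happens to be bound to when the code runs:

```python
import ast
import os

CALL_SOURCE = 'os.listdir(".")'

# The AST only records "attribute listdir of name os"; it says nothing
# about which function object that attribute refers to at run time.
before = ast.dump(ast.parse(CALL_SOURCE).body[0].value)

original = os.listdir
os.listdir = lambda *args: None  # rebind, as in the comment above
try:
    after = ast.dump(ast.parse(CALL_SOURCE).body[0].value)
    result = os.listdir(".")     # now runs the lambda, not the real listdir
finally:
    os.listdir = original        # always restore the real function
```

Here before and after are equal even though the call executes different code, which is why a static tool cannot decide from the AST alone which implementation to descend into.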

2 Answers

8

You do get the whole tree this way, all the way to the bottom, but it is held as a tree, so at each level you have to explicitly visit the relevant attributes to get the children. For example (I'm naming the compile result cf rather than ast because that would hide the standard library ast module; I assume you only have 2.5 rather than 2.6, which is why you're using the lower-level _ast module instead?):

>>> cf.body[0].names[0].name
'os'

This is what tells you that the import statement is importing the name os (and only that one, because the .names field of .body[0], which is the import, has length 1).

In Python 2.6's module ast you also get helpers that let you navigate the tree more easily (e.g. by the Visitor design pattern), but the whole tree is there in either 2.5 (with _ast) or 2.6 (with ast), and in either case it is represented in exactly the same way.

To handily visit all the nodes in the tree, in 2.6, use module ast (no leading underscore) and subclass ast.NodeVisitor as appropriate (or equivalently use ast.iter_child_nodes recursively and ast.iter_fields as needed). Of course these helpers can be implemented in pure Python on top of _ast if you're stuck in 2.5 for some reason.
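As a minimal sketch of the NodeVisitor approach, assuming a current Python 3 rather than the 2.6 discussed here: NameCollector is a hypothetical class name, and ast.parse is used instead of compile with PyCF_ONLY_AST. It records the imported module names and the names of called functions for the two-line example from the question.

```python
import ast

SOURCE = 'import os\nos.listdir(".")\n'


class NameCollector(ast.NodeVisitor):
    """Hypothetical visitor: collects imported names and called names."""

    def __init__(self):
        self.imports = []
        self.calls = []

    def visit_Import(self, node):
        # Each entry in node.names is an alias node with a .name attribute.
        for alias in node.names:
            self.imports.append(alias.name)
        self.generic_visit(node)

    def visit_Call(self, node):
        func = node.func
        if isinstance(func, ast.Attribute):   # e.g. os.listdir
            self.calls.append(func.attr)
        elif isinstance(func, ast.Name):      # e.g. a bare name like open
            self.calls.append(func.id)
        self.generic_visit(node)              # keep descending into children


tree = ast.parse(SOURCE)
collector = NameCollector()
collector.visit(tree)
```

After the visit, collector.imports is ['os'] and collector.calls is ['listdir']. Calling generic_visit in each handler is what keeps the traversal going below the node that was just handled.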


3 Comments

Quick question: What does cf stand for? I've seen that abbreviation a few places in the documentation and you also use it, but I have no idea what it stands for. Compiled file? Code Formatted? Code file? Code field? Compile field? Compile format?
@ArtOfWarfare, I believe I had in mind an anodyne abbreviation for "compiled form".
Thanks, that makes sense. I don't like having uncommon abbreviations in my code - I prefer variable names that are short and descriptive. So I'll call it compiledForm instead of cf.
5
py> ast._fields
('body',)
py> ast.body
[<_ast.Import object at 0xb7978e8c>, <_ast.Expr object at 0xb7978f0c>]
py> ast.body[1]
<_ast.Expr object at 0xb7978f0c>
py> ast.body[1]._fields
('value',)
py> ast.body[1].value
<_ast.Call object at 0xb7978f2c>
py> ast.body[1].value._fields
('func', 'args', 'keywords', 'starargs', 'kwargs')
py> ast.body[1].value.args
[<_ast.Str object at 0xb7978fac>]
py> ast.body[1].value.args[0]
<_ast.Str object at 0xb7978fac>
py> ast.body[1].value.args[0]._fields
('s',)
py> ast.body[1].value.args[0].s
'.'
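The manual drill-down through _fields can also be done generically. The sketch below is an illustration rather than part of the answer: it assumes a current Python 3, where node class names and fields differ somewhat from the 2.5 session shown here, and dump_tree is a hypothetical helper name. It follows _fields recursively and builds an indented outline of node types.

```python
import ast


def dump_tree(node, depth=0, lines=None):
    """Return an indented outline of node type names, found via _fields."""
    if lines is None:
        lines = []
    lines.append("  " * depth + type(node).__name__)
    for field in node._fields:
        value = getattr(node, field, None)
        # A field holds a single node, a list of nodes, or a plain value.
        children = value if isinstance(value, list) else [value]
        for child in children:
            if isinstance(child, ast.AST):
                dump_tree(child, depth + 1, lines)
    return lines


tree = ast.parse('import os\nos.listdir(".")\n')
outline = dump_tree(tree)
print("\n".join(outline))
```

Plain values such as strings and numbers are skipped, so the outline shows only the node structure.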

HTH

6 Comments

I know how to get that. The thing is, how do I get to listdir's AST? Not the function argument, but the implementation beneath.
There is no AST for listdir - it is implemented in C.
Then how would I go about obtaining all there is to obtain? How could I make _ast recurse into each Python-implemented function?
@Geo, as I mentioned in my answer: upgrade to 2.6, use module ast without leading underscore, and subclass ast.NodeVisitor as appropriate (or equivalently use ast.iter_child_nodes recursively and ast.iter_fields as needed).
You can't "recurse" into Python-implemented functions, either. First, if you have "foo.bar()", you can't even know which bar is being called, because of late binding: you actually have to run that code. Even for module-level functions: when a call to them is merely compiled, the actual function being called is not considered at all. It may not exist, and even if it does exist, you may have only the byte code of the function. If you want an AST of the function, you need to locate the module source and compile it yourself. Traversing from your code to the module "automatically" is just not possible.
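The "locate the module source and compile it yourself" step can be sketched as follows. This assumes a current Python 3 and an already-imported function object, so it sidesteps the late-binding problem rather than solving it. ast_of is a hypothetical helper name, and glob.glob is used only as an example of a function written in Python.

```python
import ast
import glob
import inspect
import os
import textwrap


def ast_of(obj):
    """Return the AST of obj's Python source, or None if there is none."""
    try:
        source = inspect.getsource(obj)
    except (TypeError, OSError):
        # TypeError: built-in/C function; OSError: source file not found.
        return None
    # dedent so that methods and nested functions parse on their own
    return ast.parse(textwrap.dedent(source))


glob_tree = ast_of(glob.glob)      # implemented in Python, so source exists
listdir_tree = ast_of(os.listdir)  # implemented in C, so no AST is available
```

With a standard CPython installation that ships its library sources, glob_tree is a Module whose first statement is the function definition, while listdir_tree is None, matching the point that there is no AST for listdir.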