4

I want to understand how Python works at a base level, and this will hopefully help me understand a bit more about the inner workings of other compiled/interpreted languages. Unfortunately, the compilers class is a bit away for now. From what I read on this site and elsewhere, people answering "What base language is Python written in" seem to convey that there's a difference between talking about the "rules" of a language versus how the language rules are implemented for usage. So, is it correct to say that Python (and other high-level languages) are all essentially just sets of rules "written" in any natural language? And then the matter of how they're actually used (where used means compiled/interpreted to actually create things) can vary, with various languages being used to implement compilers? So in this case, CPython, IronPython, and Jython would be syntactically equal languages which all follow the same set of rules, just that those rules are implemented themselves in their respective languages.

Please let me know if my understanding of this is correct, if you have anything to add that might further solidify my understanding, or if I'm blatantly wrong.

1
  • You could easily implement a Python compiler in Python itself (non-trivially, i.e. without using eval). Most C/C++ compilers are actually implemented in C/C++. For more "duuuude" stuff, see schneier.com/blog/archives/2006/01/countering_trus.html Commented Jan 17, 2018 at 4:22

3 Answers

7

Code written in Python should be able to run on any Python interpreter. Python is essentially a specification for a programming language with a reference implementation (CPython). Whenever the Python specifications and PEPs are ambiguous, the other interpreters usually choose to implement the same behavior, unless they have reason not to.
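As a small illustration of "one specification, several implementations", a script can ask the standard library which implementation it is running on. This is only a sketch: the variable names are mine, and the example values in the comments are what I would expect on common interpreters, not guaranteed output.

```python
import platform
import sys

# Human-readable name of the implementation, e.g. "CPython" or "PyPy".
implementation = platform.python_implementation()

# Short identifier of the implementation (sys.implementation needs Python 3.3+).
name = sys.implementation.name

# Version of the Python language this interpreter implements.
language_version = platform.python_version()

print(implementation, name, language_version)
```

The same script gives different answers on CPython, PyPy, or IronPython, even though the source code is identical.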

That being said, it's entirely possible that a program written in Python will behave differently on different implementations. This is because many programmers rely on behavior that the language specification leaves undefined and that is really an implementation detail. For example, CPython has a "Global Interpreter Lock" that means only one thread is actually executing Python bytecode at a time (modulo some conditions), but some other interpreters do not have that behavior. As a result, atomicity guarantees differ between interpreters (e.g., each bytecode instruction is atomic in CPython, which is not necessarily true elsewhere).
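To make the atomicity point a bit more concrete, here is a rough sketch using CPython's dis module. It shows that one line of source such as `counter += 1` becomes several bytecode instructions, so a thread switch can happen between them. The function name `increment` is just for illustration, and the exact opcode names differ between CPython versions, so I don't list them here.

```python
import dis

counter = 0


def increment():
    # One line of Python source that modifies shared state.
    global counter
    counter += 1


# Collect the names of the bytecode instructions CPython generated for the function.
opnames = [instruction.opname for instruction in dis.get_instructions(increment)]
print(opnames)
```

On CPython the list contains a separate load and store for `counter`. Code that relies on `counter += 1` being atomic is therefore relying on more than any single instruction provides, and other implementations may not compile to this kind of bytecode at all.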

You can think of it like C. C is a language specification, but there are many compilers implementing it: GCC, Clang (LLVM), Borland, MSVC, ICC, etc. There are programming languages, and there are implementations of those programming languages.

1 Comment

This was extremely helpful. I didn't even know PEP existed before. Now I understand as well why professors have students upload code to their server to ensure it's running on the same specification. Had you not explained this, I would have gone forward assuming Python would be interpreted the same way by every compiler. Thanks so much for this informative response, definitely something to watch out for in the future.
4

You are correct when you make the distinction between what a language means and how it does what it means.

What it means

The first step in compiling a language is to parse its code to generate an Abstract Syntax Tree (AST). That is a tree that defines what the code you wrote means, what it is supposed to do. For example, if you have the following code:

    a = 1
    if a:
        print('not zero')

It would generate a tree that looks more or less like this.

                   code
             ________|________
            |                 |
       declaration            if
          __|__            ___|____
         |     |          |        |
         a     1          a      print
                                   |
                              'not zero'

This represents what the code means, but tells us nothing about how it executes it.

Edit: of course, the above is far from what Python's parser would actually generate; I oversimplified a lot for the sake of readability. Luckily, if you are curious about what is actually generated, you can import the ast module, which gives you access to Python's parser.

    import ast

    code = """
    a = 1
    if a:
        print('not zero')
    """
    my_ast = ast.parse(code)

Enjoy inspecting my_ast.
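If you want a starting point for that inspection, here is a minimal sketch. The variable names are mine; the node types in the comments are what the standard ast module produces for this snippet.

```python
import ast

source = """
a = 1
if a:
    print('not zero')
"""

tree = ast.parse(source)

# The module body holds one node per top-level statement.
top_level = [type(node).__name__ for node in tree.body]
print(top_level)  # ['Assign', 'If']

# The If node keeps its condition and its body as separate children.
if_node = tree.body[1]
print(type(if_node.test).__name__, if_node.test.id)    # Name a
print([type(stmt).__name__ for stmt in if_node.body])  # ['Expr']

# Full textual dump of the whole tree.
print(ast.dump(tree))
```

Note that the condition (`test`) and the statements to run (`body`) are siblings under the `If` node. The tree describes structure; the order of evaluation is decided by whatever later walks the tree.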

What it does

Once you have an AST, you can convert it to whatever you want. It can be C, it can be machine code, you can even convert it back to Python if you wish. The most widely used implementation of Python is CPython, which is written in C; it compiles the AST to bytecode, which its virtual machine then executes.
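Both directions can be tried from Python itself. The following is a sketch, not a description of how CPython is organised internally: the built-in compile() accepts an AST and returns a code object, and ast.unparse (Python 3.9+, hence the hasattr guard) turns a tree back into source text.

```python
import ast

source = """
a = 1
if a:
    print('not zero')
"""

tree = ast.parse(source)

# AST -> code object (bytecode on CPython), then execute it in a fresh namespace.
code_object = compile(tree, "<ast>", "exec")
namespace = {}
exec(code_object, namespace)  # prints: not zero

# AST -> Python source again (only available on Python 3.9 and later).
if hasattr(ast, "unparse"):
    regenerated = ast.unparse(tree)
    print(regenerated)
```

The regenerated source may differ from the original in whitespace or quoting, but parsing it again yields an equivalent tree.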

What is going on under the hood is thus pretty close to your understanding. First, a language is a set of rules that defines a behaviour; only then comes an implementation of that language, which defines how that behaviour is carried out. And yes, of course, you can have different implementations of the same language with slight differences in behaviour.

3 Comments

This was so helpful. Thanks for spending the time on making that tree. One thing: why is the check for a on the same level as print there, wouldn't it sequentially execute print after it checks for a? Why is print not below a on the tree?
This AST is far from what the actual AST generated for Python code would look like. Its only purpose is to give a rough idea of the shape. If you are interested in seeing the real thing, you can do import ast and use ast.parse. Let me add an example of that to my answer.
thank you again. Very helpful, and helps me visualize code execution in the future. +1
-2

Basically it's a bunch of dictionary data structures implementing functions, modules, etc. The global variables and their values live in a per-module dictionary. Variables within a class are another dictionary. Those within an object are yet another dictionary and so are those within a function. Even a function call has its own dictionary so that different calls have different copies of the local variables.
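Several of these dictionaries can be looked at directly. A rough sketch follows; the `Point` class and `local_names` function are made up for the example. One caveat, as far as I know: CPython stores a function's locals in a faster internal structure and only builds a dictionary when locals() is called, so "a dictionary per call" is a simplification.

```python
class Point:
    kind = "point"  # class-level variable

    def __init__(self, x, y):
        self.x = x  # instance-level variables
        self.y = y


def local_names(value):
    doubled = value * 2
    # Snapshot of this call's local variables as a dictionary.
    return dict(locals())


p = Point(1, 2)

print(type(globals()))         # module-level names live in a dict
print(Point.__dict__["kind"])  # class namespace (a read-only mapping view)
print(p.__dict__)              # {'x': 1, 'y': 2}
print(local_names(3))          # {'value': 3, 'doubled': 6}
```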

It has no lexical scope unlike most other languages and, in my opinion, was designed to be implemented as simply as possible by 1 coder using dictionaries.

7 Comments

Do you have any citations for that last claim? That seems to be a rather remarkable statement.
@BryanOakley The lack of lexical scope might be explicitly documented in the language reference, but it is rather trivial to observe with some simple code (happy to construct an example). For the 1 coder bit, that's more of an observation and inference, since languages are born and evolve rather chaotically, i.e. feel free to cite me :-)
I don't understand why I should cite you. You made a claim about python that doesn't seem to be backed up by anything more than a personal observation. If that's the case, you might want to say so in your answer.
I disagree wholeheartedly with this "observation and inference". Furthermore, I don't see how this answers OP's question. -1.
@BryanOakley I believe the custom in writing is to cite other people's opinion, and the "default" is understood to be the author's own. However, since it bothers you I have added "in my opinion"