A Python Interpreter Written in Python (aosabook.org)

by xk3 54 comments 157 points


[−] BoppreH 28d ago

> Byterun is a Python interpreter written in Python. This may strike you as odd, but it's no more odd than writing a C compiler in C.

I'm not so sure. The difference between a self-hosted compiler and a circular interpreter is that the compiler has a binary artifact that you can store.

With an interpreter, you still need some binary to run your interpreter, which will probably be CPython, making the new interpreter redundant. And if you add a language feature to the custom interpreter, and you want to use that feature in the interpreter itself, you need to run the whole chain at runtime: CPython -> Old Interpreter That Understands New Feature -> New Interpreter That Uses New Feature -> Target Program. And the chain only gets longer, each iteration exponentially slower.

Meanwhile, with a self-hosted compiler, each iteration is "cached" in the form of a compiled binary. The chain is only in the history of the binary, not part of the runtime.

---

Edit since this is now a top comment: I'm not complaining about the project! Interpreters are cool, and this is genuinely useful for learning and experimentation. It's also nice to demystify our tools.

[−] gwerbin 28d ago
PyPy handled this by implementing PyPy in a restricted minimal subset of Python that they called RPython, and that seemed to work out well for them.
[−] mikepurvis 27d ago
I was never a user of PyPy but I really appreciated the (successful) effort to cleanly extract from Python a layer of essential primitives upon which the rest of the language's features and sugar could be implemented.

It's more than just what is syntax or a language feature; for example, RPython provides classes, but only very limited multiple inheritance; all the MRO stuff is implemented using RPython for PyPy itself.

[−] paulddraper 27d ago
The key difference is that RPython is actually a compiled language.

I.e. PyPy DOESN'T have an interpreter written in an interpreted language.

[−] SJC_Hacker 28d ago
This is the case only if the new interpreter does not simply include the layer that the old interpreter has for translating bytecode to native instructions. Once you have that, you can simply bootstrap any new interpreters from previous ones. Even in the case of supporting new architectures, you can still work at the Python level to produce the necessary binary, although the initial build would have to be done on an already supported architecture.
[−] direwolf20 27d ago
Interpreters don't translate bytecode to native instructions.
[−] SJC_Hacker 27d ago
The usual understanding of "interpreter" in a CS context is a program that executes source code directly without a compilation step. However, the binary that translates an intermediate bytecode to native machine code is at least sometimes called a "bytecode interpreter".

https://doc.pypy.org/en/latest/interpreter.html

[−] ghusbands 27d ago
This is still incorrect. A bytecode interpreter, as its name indicates, interprets a bytecode. Typically, compiling a bytecode to native machine code is the work of a JIT compiler.
[−] genxy 27d ago
[−] ghusbands 27d ago
That's a partial evaluator, not an interpreter, and it converts an interpreter into a compiler, which are different things.
[−] genxy 27d ago

> Interpreters don't translate bytecode to native instructions.

> That's a partial evaluator, not an interpreter, and it converts an interpreter into a compiler, which are different things.

https://old.reddit.com/r/Compilers/comments/1sm90x5/retrofit...

[−] ghusbands 27d ago
Yes, that's another great example of the same kind of thing - creating a JIT from an interpreter. It remains true that interpreters do not directly generate machine code.
[−] genxy 27d ago
The author of weval is the top comment.

Reading the comments and understanding that transitively, weval turns interpreters into compilers, allowing interpreters to generate machine code.

[−] direwolf20 27d ago
If you turn milk into cheese it isn't milk any more, and it doesn't prove that milk is a yellow solid.
[−] genxy 23d ago
We lost the plot here.

What are your goals? To let everyone know that interpreters, definitionally, don't generate code? This isn't debate club.

I dropped a cool link that shows we have a machine that turns interpreters into compilers. I am talking about the machine. You are talking about the definition. We aren't talking about the same thing.

[−] anitil 28d ago
Oooh it's a bytecode interpreter! I was wondering how they'd fit a parser/tokenizer in 500 lines unless the first was import tokenizer, parser. And it looks like 1500ish lines according to tokei.

I think because Python is a stack-based interpreter, this is a really great way to get some exposure to how it works if you're not too familiar with C. A nice project!
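
For anyone who wants to see the "stack-based" part in isolation, here is a minimal sketch of a stack machine. The instruction names (`PUSH`, `ADD`, `MUL`) and the `run` function are made up for illustration; they are not CPython's real opcodes and this is not Byterun's code.

```python
# Toy stack machine. Instruction names are hypothetical, not CPython opcodes.
def run(program):
    stack = []
    for op, arg in program:
        if op == "PUSH":
            # Put a constant on top of the stack.
            stack.append(arg)
        elif op == "ADD":
            # Pop two operands, push their sum.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            # Pop two operands, push their product.
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {op}")
    # Whatever is left on top is the result of the expression.
    return stack.pop()


# (2 + 3) * 4 written as stack instructions
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result = run(program)
print(result)  # 20
```

To compare with the real thing, the standard-library `dis` module will print the instructions CPython generates, e.g. `dis.dis("(a + b) * c")`; the exact opcode names vary between Python versions.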

[−] cestith 27d ago
The article contrasts Python to Perl, saying Perl is purely interpreted while Python has compilation. This is factually incorrect.

Perl is transformed into an AST. Then that is decorated into an opcode tree. The thing runs code nearly as fast as C in many instances, once the startup has completed and the code is actually running.
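
For comparison, the compile step on the Python side can be observed from the standard library. This is a small sketch assuming CPython; the variable names and the `"<example>"` filename are only illustrative, and it says nothing about how Perl does it internally.

```python
import ast

source = "total = 2 + 3 * 4"

# Step 1: source text -> abstract syntax tree.
tree = ast.parse(source)

# Step 2: AST -> code object, which holds the bytecode.
code = compile(tree, "<example>", "exec")

# Step 3: the interpreter loop executes the code object.
namespace = {}
exec(code, namespace)

print(type(tree).__name__)   # Module
print(namespace["total"])    # 14
```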

[−] throwpoaster 28d ago
[−] jgbuddy 28d ago
one liner:

eval(str)

[−] PhunkyPhil 27d ago
I can do you one better:

```python3
from openai import OpenAI
import sys

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": f"generate valid python byte code this program compiles to: {sys.argv[1]}",
    }],
)

print(response.choices[0].message.content)
```

Actually, probably not better.

[−] nagaiaida 26d ago
and as soon as one tries to meaningfully add features to this sort of metainterpreter, the usefulness of homoiconic syntax becomes abundantly clear
[−] nasretdinov 28d ago
Went into comments looking for this exact comment. Wasn't disappointed
[−] _blk 27d ago
Great minds think alike ;)
[−] vachanmn123 28d ago
Very well written! Everyone used to tell me during Uni that stacks are used for running programs, never ACTUALLY understood where or how.
[−] woadwarrior01 28d ago
aka A Metacircular Interpreter
[−] blueybingo 28d ago
the article glosses over something worth pausing on: the getattr trick for dispatching instructions (replacing the big if-elif chain) is actually a really elegant pattern that shows up in a lot of real interpreters and command dispatchers, not just toy ones -- worth studying that bit specifically if you're building anything with extensible command sets.
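
A minimal sketch of that getattr dispatch pattern, for reference. The `TinyVM` class, the `op_` method prefix, and the instruction names are all invented for this example and are not taken from the article's code.

```python
class TinyVM:
    """Toy VM; class name, `op_` prefix and instructions are hypothetical."""

    def __init__(self):
        self.stack = []

    def op_PUSH(self, arg):
        self.stack.append(arg)

    def op_ADD(self, arg):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left + right)

    def dispatch(self, name, arg=None):
        # Look the handler up by name instead of walking an if/elif chain.
        handler = getattr(self, f"op_{name}", None)
        if handler is None:
            raise ValueError(f"unknown instruction: {name}")
        handler(arg)

    def run(self, program):
        for name, arg in program:
            self.dispatch(name, arg)
        return self.stack.pop()


class TinyVMWithMul(TinyVM):
    # Adding an instruction is just adding a method; dispatch() is untouched.
    def op_MUL(self, arg):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left * right)


result = TinyVM().run([("PUSH", 2), ("PUSH", 3), ("ADD", None)])
print(result)  # 5

product = TinyVMWithMul().run([("PUSH", 6), ("PUSH", 7), ("MUL", None)])
print(product)  # 42
```

The trade-off is that the set of supported instructions becomes implicit in the method names, so a typo in a handler name only shows up at runtime as an "unknown instruction" error.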