blog content
mireille-raad committed Mar 21, 2020
commit 4faeb85f37dbe927f6016360fd6d9cc310148594
8 changes: 8 additions & 0 deletions _drafts/2020-03-13-how-does-your-code-run.markdown
@@ -0,0 +1,8 @@
---
layout: post
title: "How RustPython works"
date: 2020-03-13 11:34:01 -0400
categories: featured
---

This post walks through the user experience of installing and using RustPython and the different crates.
7 changes: 7 additions & 0 deletions _drafts/2020-03-21-how-to-write-an-interpreter.markdown
@@ -0,0 +1,7 @@
---
layout: post
title: "How to write an interpreter"
date: 2020-03-13
---

Writing an interpreter sounds fancy. This post lists all the milestones.
107 changes: 103 additions & 4 deletions _posts/2020-03-12-thing-explainer.markdown
@@ -1,11 +1,110 @@
---
layout: post
title: "Thing explainer: interpreters, compilers, bytecode and other things you need to know."
title: "How the RustPython interpreter works - part 1"
date: 2020-03-11 11:34:01 -0400
categories: featured
---
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
When you write your code in Python, the machine cannot run it. A lot needs to happen behind the scenes first. This post will:
- Walk you through all the steps and demystify them with screenshots.
- Map the steps to the RustPython interpreter, so you can see things in action.
- Provide links for you to read up on key concepts. Trying to explain everything would require a book. The goal is to show by example.

Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
This is intended for someone who has written Python code and has a vague idea of what an interpreter or compiler is, but wants to see the parts of an interpreter in action.

---

If you have written Python code, you know that to "run your Python code", first you have to "install Python". If you download and install Python from the official website, you are actually installing an interpreter written in the C programming language 😲 and a Python virtual machine (among other things).

The C interpreter is referred to as CPython or the "default implementation", since that's what a lot of people use. But there are several other options, for example:

- IPython? (to check: IPython is an interactive shell that runs on top of CPython, not a separate interpreter)
- Jython (written in Java)
- IronPython (.NET)
- PyPy (written in RPython, a restricted subset of Python)
- Stackless

Different interpreters can have different workflows. For example, [PyPy](https://www.pypy.org/) uses a just-in-time compiler to turn your code into machine code! The rest of the post will discuss CPython and RustPython.

The CPython interpreter has the job of turning your human code into bytecode that a Python virtual machine can run. That description is not very useful on its own. What is bytecode to begin with? What is a Python virtual machine? How does this interpretation happen?

# Bytecode
To avoid "re-inventing the wheel", here are two links to some good explanations:

- [Wikipedia article on bytecode](https://en.wikipedia.org/wiki/Bytecode) (🙄 lol, is it legit to link to wikipedia?)
- [Medium post by Rahul Saini](https://medium.com/@rahul77349/machine-code-vs-byte-code-vs-object-code-vs-source-code-vs-assembly-code-812c9780f24c). In ~4mins reading time Rahul explains "Machine code vs. Byte code vs. Object code vs. Source code vs. Assembly code vs. Executable code"... knowing the differences will come in handy at some point.

In a nutshell, bytecode is an intermediate code between source code and machine code. This makes it portable across different hardware and operating systems. Bytecode "works" as long as you implement a virtual machine that can run it. There is a performance penalty for this flexibility.

Seeing is believing. To see what bytecode looks like, you can use a Python module called [`dis`](https://docs.python.org/3/library/dis.html). `dis` stands for disassembler. You can write source code and then see what its bytecode looks like. Here is an example:

![bytecode](/assets/media/bytecode.jpg)
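
If you would rather try it than squint at a screenshot, here is a minimal sketch using CPython's standard `dis` module. The `add` function is just the small example used throughout this post; the exact instruction names you see depend on your Python version, so no output is shown here.

```python
import dis


def add(x, y):
    return x + y


# Print a human-readable listing of the bytecode for `add`.
dis.dis(add)

# The same instructions are also available as objects,
# which is handy if you want to inspect them in code.
opnames = [instr.opname for instr in dis.get_instructions(add)]
print(opnames)
```

Running this on two different Python versions is a quick way to see that bytecode is an implementation detail and not part of the language itself.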

Now that you know that we need bytecode and what it looks like, it is time to show you how RustPython builds it.

## How RustPython turns your code into bytecode
You can find the RustPython source code in [this repo](https://github.com/RustPython/RustPython).

Here are the main steps that RustPython currently does:
- parse the line of source code into tokens
- determine if the tokens are valid syntax
- create an Abstract Syntax Tree (AST)
- ? is there a bytecode optimization step?
- compile the AST into bytecode
- ? any other steps implemented in RustPython?

This list of steps introduces some new concepts, such as tokens and abstract syntax trees. If you want to read up on the details, check out:
- [Inside The Python Virtual Machine](https://leanpub.com/insidethepythonvirtualmachine/read#leanpub-auto-introduction) ebook.
- [Python in a nutshell / Lexical structure](https://learning.oreilly.com/library/view/python-in-a/9781491913833/ch03.html#python_language-id00003)


### Step 1: parsing source code into tokens

Assuming that you don't want to read two books, the fastest way to understand what tokens are is to see them. Conveniently, Python comes with a [tokenizer](https://docs.python.org/3/library/tokenize.html). Here is what happens if I run the tokenizer on the function I created earlier.
`$ python -m tokenize file.py`

`file.py` contains the add function that I used in the previous examples. It returns x+y.

![tokenizing](/assets/media/tokenizing.jpg)
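
The same thing can be done from inside Python. This is a small sketch with CPython's standard `tokenize` module (not the RustPython lexer); the source string stands in for the contents of `file.py`.

```python
import io
import tokenize

# Stand-in for the contents of file.py.
source = "def add(x, y):\n    return x + y\n"

# generate_tokens expects a readline-style callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    # tok.type is a number; tok_name maps it to a readable name.
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

Each token carries its type (`NAME`, `OP`, `NEWLINE`, `INDENT` and so on) and the exact text it was built from.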


A picture IS worth a thousand words 😛. As you can guess, Python tokens are the basic lexical components of the language.

If you want to be fancy, this parsing process is called "Lexical Analysis".

Here are the links to the RustPython and CPython tokenizers:

**`RustPython/parser/lexer.rs`** >>
[source code](https://github.com/RustPython/RustPython/blob/master/parser/src/lexer.rs)

**`cpython/Lib/tokenize.py`** >>
[source code](https://github.com/python/cpython/blob/3.8/Lib/tokenize.py)

### Step 2: determine if the tokens are valid syntax

In the previous step, if you add random stuff to your function and tokenize it, it will still tokenize.

![tokenizing](/assets/media/tokenizing-with-errors.jpg)
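
Here is a hedged sketch of the same experiment in code, again using CPython's standard library and not RustPython. The broken source is a made-up example: every piece of it is a legal token, but the sequence is not a legal statement. The tokenizer goes through it without complaint, and only the parser (reached here through `ast.parse`) rejects it.

```python
import ast
import io
import tokenize

# A stray `def` at the end of the return statement: valid tokens, invalid syntax.
source = "def add(x, y):\n    return x + y def\n"

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
print(len(tokens), "tokens produced")

try:
    ast.parse(source)
    parsed = True
except SyntaxError as err:
    parsed = False
    print("the parser rejected it:", err.msg)
```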

So don't hate on the whole interpreter when you get error messages! Or at least don't hate on the tokenizer!

To determine if the tokens are valid syntax, you first need a definition of what valid syntax is. Python has a defined "grammar", or set of rules. The official reference is at [this link](https://docs.python.org/3/reference/grammar.html). There, you will find a machine-readable file. You may read a book to learn the rules of Python, but words are too "fluffy"; the machine needs a very strict set of rules encoded in a file. [This video](https://www.youtube.com/watch?v=KGMFvy2d5OI) explains the notation and the Python grammar.
As the presenter puts it, this is the spirit of the beast (Python) and it is only ~10KB 😭 (compare that to the size of the Python books you had to read!)

So, we have the rules of a programming language in a machine-readable format... now we need to write something that verifies that those rules were followed... This sounds like something that should exist as an open source project!
Sure enough, there is a whole Rust parser generator framework called `LALRPOP`. It takes the tokens generated by the lexer, verifies the syntax and turns the tokens into an AST (Abstract Syntax Tree). More information and a tutorial can be found in the [LALRPOP book](https://lalrpop.github.io/lalrpop/README.html).
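
To get a feel for what an AST looks like without setting up a Rust project, you can ask CPython for one. This sketch uses CPython's standard `ast` module, not `LALRPOP`, so the node names are CPython's and may not match RustPython's internal types one to one.

```python
import ast

source = "def add(x, y):\n    return x + y\n"

# Parse the source into a tree of nodes.
tree = ast.parse(source)

# Dump the whole tree as one (long) line of text.
print(ast.dump(tree))

# Walk into it by hand: the module body holds one function definition.
func = tree.body[0]
print(type(func).__name__, func.name, [a.arg for a in func.args.args])
```

The function becomes a `FunctionDef` node whose body holds a `Return` node, which in turn wraps a `BinOp` for `x + y`.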

RustPython masks the errors from `LALRPOP` and provides you with safer, nicer errors. You can see the code for this in `RustPython/parser/src/error.rs`.

## Recap 🥴 🥵

As a recap, when you write a line of python code and "run it", here is what happens:

**your code** (in `file.py` or interactive shell)
⭣ parse the line of source code into tokens
⭣ determine if the tokens are valid syntax
⭣ create an Abstract Syntax Tree (AST)
→ up next: compile the AST into bytecode
**bytecode** (in `__pycache__/file.pyc` or in memory)
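
As a rough preview, the whole chain can be walked by hand with CPython's built-ins. This is only a sketch of the idea in CPython, assuming the same `add` example; it says nothing about how the RustPython compiler is organised internally.

```python
import ast

source = "def add(x, y):\n    return x + y\n"

# Source -> AST (lexing and syntax checking happen inside ast.parse).
tree = ast.parse(source, filename="file.py")

# AST -> code object holding the bytecode.
code = compile(tree, filename="file.py", mode="exec")

# Code object -> the virtual machine runs it.
namespace = {}
exec(code, namespace)
print(namespace["add"](2, 3))  # prints 5
```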

The compiler is under **`RustPython/compiler`**; we'll dive into the details in the next post. In the meantime, check out the parser folder in `RustPython`.
35 changes: 31 additions & 4 deletions _posts/2020-03-12-use-cases.markdown
@@ -1,7 +1,34 @@
---
layout: post
title: "What can I use RustPython for?"
date: 2020-03-11 11:34:01 -0400
categories: featured
title: "Use case: RustPython in the web can be one of its best use cases"
date: 2020-03-12 11:34:01 -0400
categories: ["featured", "usecase"]
---
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Use case posts are curated from gitter chats and from github issues. You can also submit your use case as a post via Pull request. Tell us about what you are using RustPython for!

---
Submitted by: [Joetib](https://github.com/Joetib)
## RustPython in the web can be one of its best use cases

## Summary
RustPython in the web can be one of its best use cases.

The Python community has been pushing to have Python run in the browser. Implementations of this idea include Brython, Skulpt and Batavia. Inspired by Batavia's approach of running Python bytecode in the browser, I have been wondering whether RustPython could ship only its virtual machine to do a similar task.

## Detailed Explanation
RustPython's wasm module is currently about 4.5 MB. This is too big to be included in most projects.
However, I believe that if we could remove the parser, lexer and other compiler-related parts and compile only the virtual machine, the size would drop drastically.
This approach would also require that apps get compiled ahead of time (for example on the developers' machines) before the bytecode is shipped.
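
To make the "compile first, ship only bytecode" idea concrete, here is a minimal sketch of what it looks like in CPython today, using the standard `marshal` module. The file name `app.py` is a placeholder, and whether a RustPython VM could load this exact payload is an open question and not something this sketch shows.

```python
import marshal

source = "def add(x, y):\n    return x + y\n"

# "Build" side: compile to a code object and serialize it to bytes.
code = compile(source, "app.py", "exec")
payload = marshal.dumps(code)
print(len(payload), "bytes to ship")

# "Client" side: no parser or compiler needed, only the VM.
loaded = marshal.loads(payload)
namespace = {}
exec(loaded, namespace)
print(namespace["add"](2, 3))  # prints 5
```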

## Drawbacks, Rationale, and Alternative Rationale:
It will take a long time for RustPython to compete with CPython natively (where CPython already works well), but the web is one place where there is less competition.
If this works, we can have web apps written fully in Python (little to no JavaScript) and we will be able to run existing software on the web.
This could also lead to a wave of new front-end libraries, such as Python alternatives to React, Angular, Vue etc.

Currently, the BeeWare team has been working hard to get Toga (BeeWare's GUI library) working in browsers, but they are limited by the size of Batavia.
I am certain everyone loved Anvil.works for their idea of creating web apps completely in python.

## Possible Drawback:
Since apps will need to be compiled before they are shipped and most people use CPython, RustPython's VM must be able to run CPython-generated bytecode.
This may not even be an issue but I raise this because I'm unaware of the bytecode implementation.
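
One thing worth knowing here: CPython's bytecode is not a stable format. It can change between CPython versions, and compiled `.pyc` files start with a version-specific "magic number" so that the interpreter can refuse files it does not understand. A quick, hedged way to see the value for your own interpreter:

```python
import importlib.util
import sys

# The magic number identifies the bytecode format of this interpreter version.
magic = importlib.util.MAGIC_NUMBER
print(sys.version_info[:2], magic)
```

So "run CPython-generated bytecode" would in practice mean picking one or more specific CPython versions to be compatible with.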