Some say that “everything boils down to ones and zeros”—but do we really understand how our programs get translated into those bits? Compilers a

How to Approach Writing an Interpreter From Scratch

submited by

Style Pass

2021-06-23 15:00:16

Some say that “everything boils down to ones and zeros”—but do we really understand how our programs get translated into those bits?

Compilers and interpreters both take a raw string representing a program, parse it, and make sense of it. Though interpreters are the simpler of the two, writing even a very simple interpreter (that only does addition and multiplication) will be instructive. We’ll focus on what compilers and interpreters have in common: lexing and parsing the input.

Readers may wonder What’s wrong with a regex? Regular expressions are powerful, but source code grammars aren’t simple enough to be parsed by them. Neither are domain-specific languages (DSLs), and a client might need a custom DSL for authorization expressions, for example. But even without applying this skill directly, writing an interpreter makes it much easier to appreciate the effort behind many programming languages, file formats, and DSLs.

Writing parsers correctly by hand can be challenging with all the edge cases involved. That’s why there are popular tools, like ANTLR, that can generate parsers for many popular programming languages. There are also libraries called parser combinators, that enable developers to write parsers directly in their preferred programming languages. Examples include FastParse for Scala and Parsec for Python.