Simple Lisp interpreter made in C with no dependencies.
Some good resources for learning about Lisp:
- Structure and Interpretation of Computer Programs by Hal Abelson and Gerald Jay Sussman.
- ANSI Common Lisp by Paul Graham.
- The Official Scheme website.
- The MIT Scheme Reference.
- The GNU Guile manual.
- The Scheme R5RS Specification.
- An Introduction to Programming in Emacs Lisp by Robert J. Chassell.
- Build Your Own Lisp by Daniel Holden, although this project doesn’t follow it.
$ git clone https://github.com/8dcc/sl
$ cd sl
$ make
...
The full manual can be found in the doc folder. It’s meant to be exported from
.org
to .texi
, and then from .texi
to .pdf
, .html
, .info
, etc.
The test folder also contains the Lisp code used for testing the interpreter.
$ ./sl
sl> (+ 1 2 (- 5 4) (* 3 4) 10)
26
sl> (define my-addition +)
<primitive 0x55d31f51ba21>
sl> (define my-global 10.0)
10.000000
sl> (define my-lambda
(lambda (x y)
(* my-global (my-addition x y))))
<lambda>
sl> (my-lambda 3 4)
70.000000
sl> (define defmacro
(macro (name formals body)
(list 'define name (list 'macro formals body))))
<macro>
sl> (defmacro inc (var)
(list 'define var (list '+ var 1)))
<macro>
sl> (inc my-global)
11.000000
sl> my-global
11.000000
This Lisp has a few components that are responsible for their own data. This is the basic REPL process:
- The user input is read using
read_expr()
, defined in read.c. This function will read a single Lisp expression across lines, and save the data in an allocated string. - The raw user input is converted into an array of
Token
structures using thetokenize()
function, which calls the static functionget_token()
. This step might be redundant for a simple language as Lisp, but I decided to do it anyway. After this step theToken
array could be printed usingtoken_print()
, if needed. All these functions are defined in lexer.c. - The
Token
array is converted into a linked list ofExpr
structures using theparse()
function from parser.c. This linked list is essentially the Abstract Syntax Tree (AST). At this point, nothing has been evaluated; it should just be a different representation of the user input. All the values allocated bytokenize()
have been copied into the AST instead of reused, so they can be freed safely. - We evaluate the expression using
eval()
, defined in eval.c. This function will return another linked list ofExpr
structures but, just likeparse()
, it will not reuse any data in the heap, so the oldExpr*
can be freed safely. This is the rough evaluation process:- Before evaluating the expression, it checks if the current expression is a
Special Form, which are evaluated differently from procedure calls. For
example
quote
,lambda
orif
. The arguments of these special forms are passed to the C primitives (defined in primitives.c) un-evaluated. - If it wasn’t a special form, the expression type is checked. Some expressions like numbers evaluate to themselves, while some others don’t.
- If the expression was a symbol, the environment is searched to find the
expression bound to that symbol. This is done with the
env_get()
function, defined in env.c. - If the expression was a list, it is evaluated as a procedure or macro
call. This is the evaluation process of a call:
- The first element of the list is evaluated. It should return either a lambda, a macro or a C primitive.
- If the function was a macro, the arguments are not evaluated and they
are passed to
macro_call()
, defined in lambda.c. From there, the macro is expanded withmacro_expand()
and the expanded expression is evaluated and returned. For more information on how macros behave in this Lisp, see Emacs Lisp manual. - If the function was not a macro, each argument should be evaluated
before calling the function. This is done with the
eval_list()
static function. Then the function is applied to the arguments withapply()
, also defined in eval.c. - The function type is checked inside
apply()
. If it’s a C primitive, the function pointer stored in theExpr
is called with the arguments we got fromeval()
. If it’s a lambda, it is called usinglambda_call()
, defined in lambda.c. - The
lambda_call()
function operates on theLambdaCtx
structure of theExpr
. It binds each formal argument to the lambda’s environment; sets the parent environment (so the body can access globals); and evaluates each expression in the body in order, returning the last one.
- Before evaluating the expression, it checks if the current expression is a
Special Form, which are evaluated differently from procedure calls. For
example
- After that, we print the evaluated expression using
expr_print()
, defined in expr.c.
These are some things that need to be done. Feel free to make a PR if you want to contribute.
In the following example:
(define get-inner
(lambda (a)
(lambda (b)
(+ a b))))
(define inner (get-inner 10))
(inner 20) ; Error: Unbound symbol `a'
The function inner
can’t access a
, even though it was defined when get-inner
returned the lambda. This happens because the parent environment of inner
is set
whenever it gets called, not when it’s defined. This means that it can access
global symbols, and even a
as long as inner
was called within the body of
get-inner
.
The following code defines a recursive procedure that performs an iterative process.
(defun sum-iter (i end total)
(if (> i end)
total
(sum-iter (+ i 1)
end
(+ total i))))
(sum-iter 1 5 0) ; 15
Even though that procedure is recursive, since it calls itself, the process is iterative, because it has all the necessary information for continuing the computation in its parameters. The interpreter doesn’t need to keep track of where it was called from, it can just jump to the start of the function with the new parameters and no information will be lost. This jump optimization is called tail-call optimization, and an interpreter with this feature is called tail-recursive. For more information, see section 1.2.1 of SICP.