An AOT compiler for COOL (Classroom Object Oriented Language), targeting the MIPS 32-bit Architecture and written entirely in Python 3.
COOL is a small statically-typed object-oriented language that is type-safe and garbage collected. It has three primitive data types: Integers, Strings and Booleans (`true`, `false`). It supports conditional and iterative control flow in addition to pattern matching. Everything in COOL is an expression! Many example COOL programs can be found under the /examples directory.
A BNF-based specification of COOL's Context-Free Grammar can be found at /docs/Grammar.md.
PyCOOLC follows the classical compiler architecture and consists of two main logical components: Frontend and Backend.
The flow of compilation goes from Frontend to Backend, passing through the stages in every component.
Compiler Frontend consists of the following three stages:
- Lexical Analysis (see: `lexer.py`): regex-based tokenizer.
- Syntax Analysis (see: `parser.py`): an LALR(1) parser.
- Semantic Analysis (see: `semanalyser.py`).
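To make "regex-based tokenizer" concrete, here is a minimal sketch of the technique using only Python's standard `re` module. It is not the code in `lexer.py`: the `TOKEN_SPEC` table, the `Token` tuple and the `tokenize` function are hypothetical names, and the token set covers only a small subset of COOL (for example, keywords are matched case-sensitively here, which is a simplification).

```python
import re
from collections import namedtuple

# Hypothetical token record, for illustration only.
Token = namedtuple("Token", ["kind", "text", "line"])

# Order matters: earlier alternatives win, so keywords come before
# identifiers and "<-" comes before the single-character operators.
TOKEN_SPEC = [
    ("NEWLINE", r"\n"),
    ("SKIP", r"[ \t\r]+"),
    ("COMMENT", r"--[^\n]*"),
    ("INTEGER", r"\d+"),
    ("STRING", r'"(?:\\.|[^"\\\n])*"'),
    ("KEYWORD", r"\b(?:class|inherits|if|then|else|fi|while|loop|pool"
                r"|let|in|case|of|esac|new|isvoid|not)\b"),
    ("BOOLEAN", r"\b(?:true|false)\b"),
    ("TYPE", r"[A-Z][A-Za-z0-9_]*"),
    ("ID", r"[a-z][A-Za-z0-9_]*"),
    ("ASSIGN", r"<-"),
    ("OP", r"<=|=>|[-+*/<=~.@:;,(){}]"),
    ("MISMATCH", r"."),
]

# One master pattern built from named groups, one group per token kind.
MASTER = re.compile(
    "|".join("(?P<{}>{})".format(name, pattern) for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Yield Token tuples for `source`, tracking line numbers."""
    line = 1
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "NEWLINE":
            line += 1
        elif kind in ("SKIP", "COMMENT"):
            continue
        elif kind == "MISMATCH":
            raise SyntaxError(
                "line {}: unexpected character {!r}".format(line, text)
            )
        else:
            yield Token(kind, text, line)
```

Calling `list(tokenize("x <- 42;"))` yields one token per lexeme, each tagged with its kind and source line. A parser can then consume this stream on demand, which matches the description of lexical analysis being driven by the parser.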
Compiler Backend consists of the following two stages:
- Code Optimization.
- Code Generation:
  - Targets the MIPS 32-bit architecture.
  - Models an SRSM (Single-Register Stack Machine).
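The single-register stack machine model can be illustrated with a short sketch. The idea is that every expression leaves its result in one accumulator register (assumed here to be `$a0`), while intermediate values are pushed onto the stack. The tuple-based AST and the `cgen` function below are hypothetical and only handle integer literals, addition and subtraction; a real COOL backend would also have to deal with objects, dispatch and boxed integers.

```python
def cgen(expr):
    """Return MIPS instructions that leave the value of `expr` in $a0.

    `expr` is a hypothetical tuple-based AST node:
    ("int", n), ("add", left, right) or ("sub", left, right).
    """
    kind = expr[0]
    if kind == "int":
        return ["li $a0, {}".format(expr[1])]
    if kind in ("add", "sub"):
        _, left, right = expr
        return (
            cgen(left)                        # left operand -> $a0
            + ["sw $a0, 0($sp)",              # push $a0 onto the stack
               "addiu $sp, $sp, -4"]
            + cgen(right)                     # right operand -> $a0
            + ["lw $t1, 4($sp)",              # reload the saved left operand
               "{} $a0, $t1, $a0".format(kind),
               "addiu $sp, $sp, 4"]           # pop, restoring the stack
        )
    raise ValueError("unsupported expression kind: {!r}".format(kind))


if __name__ == "__main__":
    # Emit code for 7 + (5 - 2).
    print("\n".join(cgen(("add", ("int", 7), ("sub", ("int", 5), ("int", 2))))))
```

Because every push is matched by a pop within the same expression, the stack pointer is unchanged after evaluating any expression. That invariant is what lets the generated code for subexpressions be composed without any register allocation.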
A typical compilation scenario starts with the user calling the compiler driver (see: `pycoolc.py`) and passing it one or more COOL program files. The compiler starts by parsing the source code of all program files; lexical analysis, as a stage, is driven by the parser. If parsing finishes successfully, the parser returns an Abstract Syntax Tree (see: `ast.py`) representation of the program(s); otherwise the compilation process is terminated and the errors are reported back to the user. The compiler driver then initiates the Semantic Analysis stage, which further modifies the AST representation. If any errors are found during this stage, the compilation process is terminated and all errors are reported back. Otherwise the driver enters the Code Optimization stage, where the AST is optimized and dead code is eliminated, followed by the Code Generation stage, which emits executable MIPS 32-bit assembly code.
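The stage-by-stage flow described above, where the first failing stage aborts compilation and reports its errors, can be sketched as follows. This is not the code in `pycoolc.py`: `CompilationError`, `run_pipeline` and the `toy_*` stages are hypothetical stand-ins, and the assumption that each stage returns a `(result, errors)` pair is made only for this illustration.

```python
class CompilationError(Exception):
    """Raised when a stage reports one or more errors (hypothetical)."""

    def __init__(self, stage, errors):
        super().__init__("{} failed with {} error(s)".format(stage, len(errors)))
        self.stage = stage
        self.errors = errors


def run_pipeline(source, stages):
    """Thread `source` through `stages`, a list of (name, function) pairs.

    Each stage function takes the previous stage's result and returns
    (new_result, errors). The first stage that reports errors aborts.
    """
    result = source
    for name, stage in stages:
        result, errors = stage(result)
        if errors:
            raise CompilationError(name, errors)
    return result


# Stand-in stages for illustration only; they do not reflect PyCOOLC's code.
def toy_parse(source):
    tokens = source.split()
    errors = [] if tokens else ["syntax error: empty program"]
    return tokens, errors


def toy_check(tokens):
    errors = ["semantic error: unknown name " + t for t in tokens if t == "undefined"]
    return tokens, errors


def toy_codegen(tokens):
    return "\n".join("# " + t for t in tokens), []


STAGES = [
    ("parsing", toy_parse),
    ("semantic analysis", toy_check),
    ("code generation", toy_codegen),
]
```

With these stand-ins, `run_pipeline("a b", STAGES)` runs all three stages, while `run_pipeline("a undefined", STAGES)` raises a `CompilationError` whose `stage` attribute is `"semantic analysis"`, so later stages never see an invalid program.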
Each compiler stage and runtime feature is designed as a separate component that can be used standalone or as a Python module. The development status of each one is as follows:
Compiler Stage | Python Module | Issue(s) | Status |
---|---|---|---|
Lexical Analysis | `lexer.py` | #2 | ✅ done |
Parsing | `parser.py` | #3 | ✅ done |
Semantic Analysis | `semanalyser.py` | #4 | in progress |
Optimization | - | #5, #11 | - |
Code Generation | - | #6 | - |
Garbage Collection | - | #8 | - |
- Python >= 3.5
- SPIM - MIPS 32-bit Assembly Simulator: @Homepage, @SourceForge.
- All Python packages listed in: `requirements.txt`.
python3 setup.py install
Coming soon...
Help and usage information:
pycoolc --help
Compile a COOL program:
pycoolc hello_world.cl
Specify a custom name for the compiled output program:
pycoolc hello_world.cl --outfile helloWorldAsm.s
Run the compiled program (MIPS assembly) with the SPIM simulator:
spim helloWorldAsm.s
from pycoolc.lexer import make_lexer
from pycoolc.parser import make_parser
lexer = make_lexer()
lexer.input(a_cool_program_source_code_str)
for token in lexer:
    print(token)
parser = make_parser()
parsing_result = parser.parse(a_cool_program_source_code_str)
print(parsing_result)
- Primitive Data Types:
  - Integers.
  - Strings.
  - Booleans (`true`, `false`).
- Object Oriented:
  - Class Declaration.
  - Object Instantiation.
  - Inheritance.
  - Class Attributes.
  - Class Methods.
- Strong Static Typing.
- Pattern Matching.
- Control Flow:
  - Switch Case.
  - If/Then/Else.
  - While Loops.
- Automatic Memory Management:
  - Garbage Collection.
- Engineering a Compiler, Cooper and Torczon - Amazon
- Modern Compiler Implementation in ML, Appel - www, Amazon
- Stanford's Compiler Theory Course - www12, www16, YouTube
This project is licensed under the MIT License.
All copyrights of the files and documents under the /docs directory belong to their original owners.