FCC (Forth Compiler) is a minimal and pedagogical Forth compiler written in C that generates assembly code for the FASM assembler. It supports a Forth-like syntax and serves as both a learning tool for compiler construction and a functional compiler for simple programs.
- Author: Chris Curl
- License: MIT License (c) 2025
- Language: C
- Target: 32-bit x86 Assembly (FASM format)
- fcl: Forth compiler for Linux systems
- fcw: Forth compiler for Windows systems
The compiler follows a streamlined three-phase approach:
- IRL Generation - Parse source and generate Intermediate Representation Language (IRL)
- Optimization - Perform peephole optimizations on the IRL
- Code Generation - Output assembly code for Linux x86
#define VARS_SZ 500 // Maximum number of variables/symbols
#define STRS_SZ 500 // Maximum number of string literals
#define CODE_SZ 5000 // Maximum number of IRL instructions
#define HEAP_SZ 5000 // Maximum number of characters in the HEAP
typedef struct {
    char type;       // 'I'=Integer, 'F'=Function, 'T'=Target
    char name[23];   // Symbol name
    char asmName[8]; // Generated assembly name
    int sz;          // Size in bytes
} SYM_T;
typedef struct {
    char name[32];   // Generated string name (S1, S2, etc.)
    char *val;       // String value (heap allocated)
} STR_T;
- next_ch() - Advances to the next character, handles line reading and EOF
- next_line() - Reads the next line from the input file
- next_token() - Extracts the next token, handles comments (//) and numbers
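As a rough illustration of the tokenizing step described above, here is a minimal sketch of a whitespace-delimited scanner that stops at a `//` comment. It is not FCC's `next_token()`: the name `scan_token`, its signature, and the rule that `//` must stand alone as a token are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not FCC's next_token): copies the next
   whitespace-delimited token from *src into tok and advances *src.
   Returns the token length, or 0 at end of line or at a // comment. */
int scan_token(const char **src, char *tok, int tokSz) {
    assert(src && *src && tok && tokSz > 1);
    const char *p = *src;

    /* Skip leading blanks */
    while (*p == ' ' || *p == '\t') p++;

    /* Collect characters up to the next blank or end of line;
       characters beyond the buffer size are dropped */
    int len = 0;
    while (*p && *p != ' ' && *p != '\t' && *p != '\n') {
        if (len < tokSz - 1) tok[len++] = *p;
        p++;
    }
    tok[len] = '\0';

    /* Assumption: a lone "//" token starts a comment to end of line */
    if (strcmp(tok, "//") == 0) {
        while (*p && *p != '\n') p++;
        len = 0;
        tok[0] = '\0';
    }

    *src = p;
    return len;
}
```

Called repeatedly on the line `42 myVar ! // Store 42`, this sketch yields `42`, `myVar`, and `!`, then returns 0 when it reaches the comment.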
- checkNumber(char *w, int base) - Parses numbers in multiple bases:
  - Binary: %1010 (prefix %)
  - Decimal: #123 or 123 (prefix # or none)
  - Hexadecimal: $FF (prefix $)
  - Character literals: 'Y' (single quotes)
  - Supports negative numbers with a - prefix
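The number formats listed above can be sketched as follows. This is not FCC's `checkNumber()`: the name `parse_number`, the output parameter, and the choice to accept the `-` sign after the base prefix and in any base are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not FCC's checkNumber): parses one token.
   Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_number(const char *w, int base, long *out) {
    assert(w != NULL && out != NULL);

    /* Character literal: exactly a quote, one character, a quote */
    if (w[0] == '\'' && w[1] != '\0' && w[2] == '\'' && w[3] == '\0') {
        *out = (unsigned char)w[1];
        return 1;
    }

    /* A base prefix overrides the default base */
    if (*w == '%')      { base = 2;  w++; }
    else if (*w == '#') { base = 10; w++; }
    else if (*w == '$') { base = 16; w++; }

    /* Assumption: the sign follows the prefix and is allowed in any base */
    int neg = 0;
    if (*w == '-') { neg = 1; w++; }
    if (*w == '\0') return 0;           /* at least one digit is required */

    long n = 0;
    for (; *w; w++) {
        int d;
        if (*w >= '0' && *w <= '9')      d = *w - '0';
        else if (*w >= 'a' && *w <= 'f') d = *w - 'a' + 10;
        else if (*w >= 'A' && *w <= 'F') d = *w - 'A' + 10;
        else return 0;                  /* not a digit in any base */
        if (d >= base) return 0;        /* digit too large for this base */
        n = n * base + d;
    }
    *out = neg ? -n : n;
    return 1;
}
```

A token that fails this check would then be looked up as a word or symbol instead of being treated as a literal.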
- findSymbol(char *name, char type) - Locates symbol by name and type
- addSymbol(char *name, char type) - Adds new symbol to table
- genTargetSymbol() - Generates unique target labels (Tgt1, Tgt2, etc.)
- addString(char *str) - Adds string literal to string table
- dumpSymbols() - Outputs symbol declarations in assembly format
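To make the symbol-table functions above more concrete, here is a minimal sketch built on the `SYM_T` layout and `VARS_SZ` limit shown earlier. The function names (`find_symbol`, `add_symbol`, `gen_target_symbol`), the linear search, the `asmName` numbering scheme, and the `sz` values are assumptions for this example, not the actual FCC implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define VARS_SZ 500

typedef struct {
    char type;        /* 'I'=Integer, 'F'=Function, 'T'=Target */
    char name[23];    /* Symbol name */
    char asmName[8];  /* Generated assembly name */
    int  sz;          /* Size in bytes */
} SYM_T;

static SYM_T syms[VARS_SZ];
static int numSyms = 0;
static int numTargets = 0;

/* Linear search by name and type; returns the index or -1 */
int find_symbol(const char *name, char type) {
    for (int i = 0; i < numSyms; i++) {
        if (syms[i].type == type && strcmp(syms[i].name, name) == 0) return i;
    }
    return -1;
}

/* Returns the index of the existing or newly added entry,
   or -1 when the fixed-size table is full */
int add_symbol(const char *name, char type) {
    assert(name != NULL);
    char key[23];
    snprintf(key, sizeof key, "%s", name);   /* truncate to the field size */

    int i = find_symbol(key, type);
    if (i >= 0) return i;
    if (numSyms >= VARS_SZ) return -1;

    SYM_T *s = &syms[numSyms];
    s->type = type;
    memcpy(s->name, key, sizeof key);
    /* Assumed naming scheme: type letter plus table index, e.g. I0, F1 */
    snprintf(s->asmName, sizeof s->asmName, "%c%d", type, numSyms);
    s->sz = (type == 'I') ? 4 : 0;            /* assumed: 32-bit integers */
    return numSyms++;
}

/* Adds a uniquely named jump target: Tgt1, Tgt2, ... */
int gen_target_symbol(void) {
    char name[23];
    snprintf(name, sizeof name, "Tgt%d", ++numTargets);
    return add_symbol(name, 'T');
}
```

Because the table is a fixed array, a full table is reported to the caller rather than grown, which matches the fixed-size design noted elsewhere in this document.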
The compiler uses an internal instruction set:
Stack Operations:
- PUSHA, POPA - Push/pop accumulator
- SWAP, SP4 - Stack manipulation
- POPB - Pop to second register
Memory Operations:
- STORE, FETCH - 32-bit memory store/load
- CSTORE, CFETCH - 8-bit (byte) memory store/load
- LOADSTR - Load string address
Arithmetic:
- ADD, SUB, MULT, DIVIDE - Basic arithmetic
- DIVMOD - Division with both quotient and remainder
- LT, GT, EQ, NEQ - Comparisons
- AND, OR, XOR - Bitwise operations
Control Flow:
- TESTA - Test accumulator against zero
- JMP, JMPZ, JMPNZ - Conditional/unconditional jumps
- TARGET - Jump target labels
- DEF, CALL, RETURN - Function definition and calls
Register Operations:
- MOVAB, MOVAC, MOVAD - Copy accumulator to EBX, ECX, EDX
- SYS - System call interrupt
Special:
- LIT - Literal values
- PLEQ - Plus-equals operation (+!)
- INCTOS, DECTOS - Increment/decrement top of stack
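A peephole pass over an instruction set like the one above scans adjacent IRL instructions and rewrites short patterns. The sketch below shows the general technique only. The `IRL_T` layout, the function name `peephole`, and both rewrite rules are assumptions for illustration; the rules FCC actually applies are not documented here. Rule 2 further assumes that LIT pushes the old top of stack and loads the literal into the accumulator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the IRL opcodes */
enum { PUSHA, POPA, LIT, ADD, INCTOS };

/* Assumed instruction layout: opcode plus one integer argument */
typedef struct { int op; int arg; } IRL_T;

/* Rewrites code[0..n-1] in place and returns the new instruction count.
   Only adjacent instructions are matched, so no TARGET label can sit
   between the two halves of a pattern. */
int peephole(IRL_T *code, int n) {
    assert(code != NULL && n >= 0);
    int out = 0;
    for (int i = 0; i < n; i++) {
        int hasNext = (i + 1 < n);

        /* Rule 1 (assumed): PUSHA followed by POPA has no net effect */
        if (hasNext && code[i].op == PUSHA && code[i + 1].op == POPA) {
            i++;                       /* drop both instructions */
            continue;
        }

        /* Rule 2 (assumed): LIT 1 followed by ADD becomes INCTOS */
        if (hasNext && code[i].op == LIT && code[i].arg == 1 &&
            code[i + 1].op == ADD) {
            code[out].op = INCTOS;
            code[out].arg = 0;
            out++;
            i++;                       /* consume the ADD as well */
            continue;
        }

        code[out++] = code[i];         /* keep everything else unchanged */
    }
    return out;
}
```

Running the pass on `LIT 5, PUSHA, POPA, LIT 1, ADD` leaves `LIT 5, INCTOS`.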
Variables:
var myVar // Declare integer variable
Functions:
: myFunc // Function definition
42 myVar ! // Store 42 in myVar
; // End function
Control Structures:
condition if // Conditional execution
// code
then
begin // Loops
// code
condition
while // While loop
again // Infinite loop
until // Until loop
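One common way to compile control structures like these is to keep a compile-time stack of jump-target ids and emit TESTA, JMPZ/JMPNZ/JMP, and TARGET instructions as each word is encountered. The sketch below illustrates that approach. Every name in it (`emit`, `compile_if`, and so on), the table sizes, and the exact instruction sequences are assumptions, not a description of how FCC lowers these words.

```c
#include <assert.h>

/* Hypothetical subset of the IRL opcodes */
enum { TESTA, JMP, JMPZ, JMPNZ, TARGET };

typedef struct { int op; int arg; } IRL_T;

static IRL_T code[64];      /* emitted instructions */
static int here = 0;
static int cstack[16];      /* compile-time stack of pending target ids */
static int csp = 0;
static int nextTarget = 0;

static void emit(int op, int arg) {
    assert(here < 64);
    code[here].op = op;
    code[here].arg = arg;
    here++;
}

static void cpush(int t) { assert(csp < 16); cstack[csp++] = t; }
static int  cpop(void)   { assert(csp > 0);  return cstack[--csp]; }

/* if: skip forward to the matching then when the condition is zero */
void compile_if(void)   { int t = ++nextTarget; emit(TESTA, 0); emit(JMPZ, t); cpush(t); }
/* then: place the forward target */
void compile_then(void) { emit(TARGET, cpop()); }
/* begin: place a backward target at the top of the loop */
void compile_begin(void){ int t = ++nextTarget; emit(TARGET, t); cpush(t); }
/* until: loop back while the condition is zero */
void compile_until(void){ emit(TESTA, 0); emit(JMPZ, cpop()); }
/* while: loop back while the condition is non-zero (assumed semantics) */
void compile_while(void){ emit(TESTA, 0); emit(JMPNZ, cpop()); }
/* again: loop back unconditionally */
void compile_again(void){ emit(JMP, cpop()); }
```

With this scheme, nesting works naturally because each closing word pops the most recently opened target.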
Stack Operations:
42 // Push literal
dup // Duplicate top
drop // Remove top
swap // Swap top two
over // Copy second to top
Memory Operations:
@ // Fetch 32-bit value from address
! // Store 32-bit value to address
c@ // Fetch 8-bit (byte) value from address
c! // Store 8-bit (byte) value to address
+! // Add to memory location
1+ 1- // Increment/decrement TOS
Register and System Operations:
->reg1 // Copy TOS to EAX (no-op, already in EAX)
->reg2 // Copy TOS to EBX
->reg3 // Copy TOS to ECX
->reg4 // Copy TOS to EDX
sys // Execute system call (INT 0x80)
Arithmetic and Logic:
+ - * / // Basic arithmetic
/mod // Division with quotient and remainder
< = <> > // Comparisons
AND OR XOR // Bitwise operations
Source Code Comments:
// // Comment until the end of the line
( ... ) // In-line comment
- Direct system calls via sys command
- ELF executable format
- No external library dependencies
- Custom function call convention using EBP stack
cd fcl # Change directory
make fcl # Compile the fcl program
./fcl > output.asm # Compile fcl.fth to assembly code
./fcl myfile.fth > output.asm # Compile specific file to assembly code
fasm output.asm program # Assemble to executable using FASM
chmod +x program # Make the program executable
./program # Run the program
- Syntax errors show line number, column, and source context
- Fatal errors terminate compilation
- Warnings are displayed as comments in output
var counter
var limit
: increment counter @ 1+ counter ! ;
: mil ( n--m ) 1000 dup * * ;
: main
    0 counter !
    1 mil limit !
    begin
        counter @ .d " " puts
        increment
        counter @ limit @ =
    until
    "Done!" puts
;
- Input Processing - Read source file (defaults to fcl.fth if no argument provided)
- IRL Generation - Parse declarations and generate intermediate representation
- Optimization - Perform peephole optimizations on IRL
- Code Generation - Output ELF assembly with startup code and runtime support
- Symbol Output - Generate variable and string declarations in data section
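The code generation step in the list above can be pictured as a mapping from each IRL opcode to a line of FASM text. The sketch below covers only the register and stack opcodes whose meaning this document states (accumulator in EAX, copies to EBX/ECX/EDX, `INT 0x80` for `sys`). The function name `gen_asm`, the use of `push`/`pop` on the hardware stack, and treating EBX as the "second register" are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the IRL opcodes */
enum { PUSHA, POPA, POPB, MOVAB, MOVAC, MOVAD, SYS, OP_COUNT };

/* Returns the assembly text for one opcode, or NULL if it is unknown */
const char *gen_asm(int op) {
    assert(op >= 0);
    switch (op) {
        case PUSHA: return "push eax";      /* assumed: data on the hardware stack */
        case POPA:  return "pop eax";
        case POPB:  return "pop ebx";       /* assumed: EBX is the second register */
        case MOVAB: return "mov ebx, eax";  /* copy accumulator to EBX */
        case MOVAC: return "mov ecx, eax";  /* copy accumulator to ECX */
        case MOVAD: return "mov edx, eax";  /* copy accumulator to EDX */
        case SYS:   return "int 0x80";      /* Linux system call interrupt */
        default:    return NULL;
    }
}
```

A code generator along these lines would walk the optimized IRL array and write one such line per instruction to the output stream.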
- No built-in I/O functions (must use system calls via sys)
- Limited error checking and recovery
- No floating-point support
- Fixed-size tables and heap
- Basic optimization only (peephole)
- else clause not yet implemented
- Byte and word memory access (c@, c!, @, !)
- Direct system call support via register operations
- Multi-base number literals (binary, decimal, hex, character)
- Integrated optimization pass
- Compact, self-contained compiler
- Clean separation of IRL generation and code emission
- Stack-based execution model with register access
This compiler serves as an example of a minimal but functional compiler implementation, demonstrating core compiler concepts in a clear and understandable way.