Merge pull request #46 from zirco-lang/docs
add documentation to most compiler functions
thetayloredman authored Oct 12, 2023
2 parents a9f8a4a + 26e2471 commit 6f673b5
Showing 10 changed files with 169 additions and 76 deletions.
9 changes: 7 additions & 2 deletions README.md
@@ -1,4 +1,9 @@
# zrc
# The Zirco Programming Language

This is the main source code repository for `zrc`, the compiler of the Zirco programming language.

The compiler is written solely in Rust.

To compile a Zirco file, you can use:

`cargo run -- ./FILE.zr`
to llvm ir
6 changes: 6 additions & 0 deletions compiler/zrc/README.md
@@ -0,0 +1,6 @@
# `zrc` --- the official Zirco compiler

This crate serves as the frontend and binary for `zrc`, the official compiler for the Zirco
programming language.

When running `zrc`, you simply invoke it with one argument, which is the file to compile.
3 changes: 1 addition & 2 deletions compiler/zrc/src/main.rs
@@ -1,5 +1,4 @@
//! The Zirco compiler
#![doc=include_str!("../README.md")]
#![warn(
clippy::cargo,
clippy::nursery,
11 changes: 11 additions & 0 deletions compiler/zrc_parser/README.md
@@ -0,0 +1,11 @@
# Parser for the Zirco programming language

This crate consists of a few modules that all wrap around the first step in the compilation process:
Converting the input source file to something machine-processable.

It contains the lexer, parser, and abstract syntax tree representation for the compiler.

The majority of your interaction with this crate will be through the [`parser::parse_program`] function and the structures present within the [`ast`] module.

For more information on how to use the parser, [read the parser documentation](parser).
If you're looking to work with the AST, [read the AST's documentation](ast).
1 change: 1 addition & 0 deletions compiler/zrc_parser/build.rs
@@ -1,5 +1,6 @@
extern crate lalrpop;

fn main() {
// Generate the LR parser from our grammar
lalrpop::process_root().unwrap();
}
3 changes: 1 addition & 2 deletions compiler/zrc_parser/src/ast/expr.rs
@@ -1,7 +1,6 @@
//! Expression representation for the Zirco AST
//!
//! The main thing within this module you will need is the [`Expr`] struct. It
//! contains all the different expression kinds in Zirco.
//! The main thing within this module you will need is the [`Expr`] struct.
use std::fmt::Display;

2 changes: 1 addition & 1 deletion compiler/zrc_parser/src/internal_parser.lalrpop
@@ -37,7 +37,7 @@ pub Program: Vec<Spanned<Declaration>> = Spanned<Declaration>*;
// See https://en.wikipedia.org/wiki/Dangling_else#Avoiding_the_conflict_in_LR_parsers for why this
// is necessary. The ClosedStmt rule REQUIRES that an 'else' clause is used, meaning that it will
// attach to the inner 'if' statement.
pub Stmt: Stmt = {
Stmt: Stmt = {
SpannedStmt<OpenStmt>,
SpannedStmt<ClosedStmt>
}
78 changes: 53 additions & 25 deletions compiler/zrc_parser/src/lexer.rs
@@ -1,20 +1,48 @@
//! Lexer for the Zirco programming language
//! Lexer and lexical errors

Check failure on line 1 in compiler/zrc_parser/src/lexer.rs (GitHub Actions / fmt): Diff in /home/runner/work/zrc/zrc/compiler/zrc_parser/src/lexer.rs
//!
//! This module contains a wrapper around a [logos] lexer that splits an input Zirco text into its individual tokens, which can be then passed into the internal Zirco parser.
//!
//! You do not usually need to use this crate, as the [parser](super::parser) already creates [`ZircoLexer`] instances for you before passing them to the internal parser. However, there are some cases where it may be helpful, so it is kept public.
//!
//! # Example
//! ```
//! use zrc_parser::lexer::{ZircoLexer, Tok};
//! let mut lex = ZircoLexer::new("2 + 2");
//! assert_eq!(lex.next(), Some(Ok((0, Tok::NumberLiteral("2".to_string()), 1))));
//! assert_eq!(lex.next(), Some(Ok((2, Tok::Plus, 3))));
//! assert_eq!(lex.next(), Some(Ok((4, Tok::NumberLiteral("2".to_string()), 5))));
//! assert_eq!(lex.next(), None);
//! ```
//!
//! For more information, read the documentation of [`ZircoLexer`].
use std::{error::Error, fmt::Display};
use std::fmt::Display;

use logos::{Lexer, Logos};

/// Represents some token within a certain span
/// Represents a lexer token within a certain span, or an error.

Check failure on line 23 in compiler/zrc_parser/src/lexer.rs (GitHub Actions / fmt): Diff in /home/runner/work/zrc/zrc/compiler/zrc_parser/src/lexer.rs
pub type Spanned<Tok, Loc, Error> = Result<(Loc, Tok, Loc), Error>;

/// An error possibly encountered during lexing
/// The error enum passed to the internal logos [`Lexer`]. Will be converted to a [`LexicalError`] by [`ZircoLexer`].
///
/// Do not use publicly.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub enum LexicalError {
/// A generic lexing error. Other more specialized errors are more common.
pub enum InternalLexicalError {

Check failure on line 30 in compiler/zrc_parser/src/lexer.rs (GitHub Actions / fmt): Diff in /home/runner/work/zrc/zrc/compiler/zrc_parser/src/lexer.rs
/// A generic lexing error. This is later converted to [`LexicalError::UnknownToken`].
#[default]
NoMatchingRule,
/// A string literal was left unterminated.
UnterminatedStringLiteral(usize),
/// A block comment ran to the end of the file. Remind the user that block
/// comments nest.
UnterminatedBlockComment,
}

/// An error encountered during lexing.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum LexicalError {
/// An unknown token was encountered.
UnknownToken((usize, char), String),
UnknownToken(usize, char),
/// A string literal was left unterminated.
UnterminatedStringLiteral(usize),
/// A block comment ran to the end of the file. Remind the user that block
@@ -24,20 +52,14 @@ pub enum LexicalError {
impl Display for LexicalError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Self::NoMatchingRule => write!(f, "No matching rule"),
Self::UnknownToken((pos, char), msg) => {
write!(f, "Unknown token '{char}' at position {pos}: {msg}")
}
Self::UnknownToken(pos, char) => write!(f, "Unknown token '{char}' at position {pos}"),
Self::UnterminatedStringLiteral(pos) => {
write!(f, "Unterminated string literal at position {pos}")
}
Self::UnterminatedBlockComment => {
write!(f, "Unterminated block comment at end of file")
}
Self::UnterminatedBlockComment => write!(f, "Unterminated block comment"),
}
}
}
impl Error for LexicalError {}

/// A lexer callback helper to obtain the currently matched token slice.
fn string_slice(lex: &Lexer<'_, Tok>) -> String {
@@ -49,7 +71,9 @@ fn string_slice(lex: &Lexer<'_, Tok>) -> String {
/// the lexer and basically consumes characters in our input until we reach the
/// end of the comment. See also: [logos#307](https://github.com/maciejhirsz/logos/issues/307)
/// See also: [zrc#14](https://github.com/zirco-lang/zrc/pull/14)
fn handle_block_comment_start(lex: &mut Lexer<'_, Tok>) -> logos::FilterResult<(), LexicalError> {
fn handle_block_comment_start(
lex: &mut Lexer<'_, Tok>,
) -> logos::FilterResult<(), InternalLexicalError> {
let mut depth = 1;
// This contains all of the remaining tokens in our input except for the opening
// to this comment -- that's already been consumed.
@@ -94,14 +118,16 @@ fn handle_block_comment_start(lex: &mut Lexer<'_, Tok>) -> logos::FilterResult<(
} else {
// This means we've reached the end of our input still in a comment.
// We can throw an error here.
logos::FilterResult::Error(LexicalError::UnterminatedBlockComment)
logos::FilterResult::Error(InternalLexicalError::UnterminatedBlockComment)
}
}
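
The body of `handle_block_comment_start` is collapsed in this view, so the following is only a minimal stand-alone sketch of depth-counted scanning for nested block comments, not the code from this commit. It assumes `/*` and `*/` as the delimiters, and the helper name `skip_nested_comment` is hypothetical. It takes the text after the opening delimiter and returns the byte offset just past the matching close, or `None` when the input ends inside the comment (the case the lexer reports as `UnterminatedBlockComment`).

```rust
/// Hypothetical helper (not part of zrc): scans `rest`, the input that follows
/// an already-consumed `/*`, and returns the byte offset just past the `*/`
/// that closes the outermost comment. Returns `None` if the input ends first.
fn skip_nested_comment(rest: &str) -> Option<usize> {
    // Scanning bytes is safe here: `/` and `*` are ASCII, so they can never
    // appear inside a multi-byte UTF-8 sequence.
    let bytes = rest.as_bytes();
    // The opening delimiter was consumed by the caller, so we start one level deep.
    let mut depth = 1usize;
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i..].starts_with(b"/*") {
            // A nested comment opens: go one level deeper.
            depth += 1;
            i += 2;
        } else if bytes[i..].starts_with(b"*/") {
            // A comment closes: come back up one level.
            depth -= 1;
            i += 2;
            if depth == 0 {
                return Some(i);
            }
        } else {
            i += 1;
        }
    }
    // End of input while still inside a comment.
    None
}
```

A caller would map `None` to the unterminated-comment error and otherwise advance the lexer by the returned offset.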

/// Enum representing all of the result tokens in the Zirco lexer
///
/// Do not use `Tok::lexer` publicly. Use [`ZircoLexer`] instead.
#[derive(Logos, Debug, Clone, PartialEq, Eq)]
#[logos(
error = LexicalError,
error = InternalLexicalError,
skip r"[ \t\r\n\f]+", // whitespace
skip r"//[^\r\n]*(\r\n|\n)?", // single-line comments
// multi-line comments are handled by a callback: see handle_block_comment_start.
@@ -307,7 +333,7 @@ pub enum Tok {
/// Any string literal
#[regex(r#""([^"\\]|\\.)*""#, string_slice)]
#[regex(r#""([^"\\]|\\.)*"#, |lex| {
Err(LexicalError::UnterminatedStringLiteral(lex.span().start))
Err(InternalLexicalError::UnterminatedStringLiteral(lex.span().start))
})]
StringLiteral(String),
/// Any number literal
@@ -416,14 +442,16 @@ impl<'input> Iterator for ZircoLexer<'input> {
let span = self.lex.span();
let slice = self.lex.slice().to_string();
match token {
Err(LexicalError::NoMatchingRule) => {
Err(InternalLexicalError::NoMatchingRule) => {
let char = slice.chars().next().unwrap();
Some(Err(LexicalError::UnknownToken(
(span.start, char),
format!("Internal error: Unknown token '{char}'"),
)))
Some(Err(LexicalError::UnknownToken(span.start, char)))
}
Err(InternalLexicalError::UnterminatedBlockComment) => {
Some(Err(LexicalError::UnterminatedBlockComment))
}
Err(InternalLexicalError::UnterminatedStringLiteral(p)) => {
Some(Err(LexicalError::UnterminatedStringLiteral(p)))
}
Err(e) => Some(Err(e)),
Ok(token) => Some(Ok((span.start, token, span.end))),
}
}
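
The split between `InternalLexicalError` and the public `LexicalError` can be illustrated without logos. The sketch below copies the two enum shapes from the diff and adds a hypothetical free function, `to_public`, that performs the same variant-by-variant mapping as the `match` in `ZircoLexer::next`. One deliberate difference, labeled here as an assumption: it falls back to U+FFFD for an empty slice instead of calling `unwrap()`.

```rust
// Stand-alone sketch: the enum shapes mirror the diff, but nothing here uses logos.
#[derive(Debug, Clone, PartialEq, Eq, Default)]
enum InternalLexicalError {
    /// The default variant, produced when no lexer rule matches.
    #[default]
    NoMatchingRule,
    UnterminatedStringLiteral(usize),
    UnterminatedBlockComment,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum LexicalError {
    UnknownToken(usize, char),
    UnterminatedStringLiteral(usize),
    UnterminatedBlockComment,
}

/// Hypothetical helper (not in the commit): converts an internal error into
/// the public one, given the start offset and text of the offending slice.
fn to_public(err: InternalLexicalError, start: usize, slice: &str) -> LexicalError {
    match err {
        InternalLexicalError::NoMatchingRule => {
            // Assumption: use U+FFFD if the slice is empty rather than panicking.
            let ch = slice.chars().next().unwrap_or('\u{FFFD}');
            LexicalError::UnknownToken(start, ch)
        }
        InternalLexicalError::UnterminatedStringLiteral(pos) => {
            LexicalError::UnterminatedStringLiteral(pos)
        }
        InternalLexicalError::UnterminatedBlockComment => {
            LexicalError::UnterminatedBlockComment
        }
    }
}
```

Matching every internal variant explicitly, as the new code in the diff does, means the compiler flags any variant added later that lacks a public counterpart.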
7 changes: 1 addition & 6 deletions compiler/zrc_parser/src/lib.rs
@@ -1,9 +1,4 @@
//! Parser for the Zirco programming language
//!
//! This crate contains the lexer, parser, and abstract syntax tree
//! representation for the Zirco programming language. It is used by the
//! compiler to parse Zirco source code into an AST.
#![doc=include_str!("../README.md")]
#![warn(
clippy::cargo,
clippy::nursery,
125 changes: 87 additions & 38 deletions compiler/zrc_parser/src/parser.rs
@@ -1,4 +1,21 @@
//! Functions to parse Zirco into an Abstract Syntax Tree
//! Parsing and parser errors

Check failure on line 1 in compiler/zrc_parser/src/parser.rs (GitHub Actions / fmt): Diff in /home/runner/work/zrc/zrc/compiler/zrc_parser/src/parser.rs
//!
//! This module contains thin wrappers around the generated parser for the Zirco programming language,
//! along with some additional models for error handling.
//!
//! In most cases, you will be using the [`parse_program`] function to parse some input code. In some
//! more specific situations, you may need to use [`parse_expr`] to parse a singular expression.
//!
//! # Error handling
//! The parser returns a [`Result`] that either yields the parsed [AST](super::ast) or a
//! [`ZircoParserError`]. For more information, read the documentation of [`ZircoParserError`].
//!
//! # Example
//! For more examples, read the documentation for the corresponding parser function.
//! ```
//! use zrc_parser::parser::parse_program;
//! let ast = parse_program("fn main() {}");
//! ```
use std::{
error::Error,
@@ -8,27 +25,40 @@ use std::{
use lalrpop_util::{ErrorRecovery, ParseError};

use super::{
ast::{
expr::Expr,
stmt::{Declaration, Stmt},
Spanned,
},
ast::{expr::Expr, stmt::Declaration, Spanned},
lexer,
};
use crate::internal_parser;

Check failure on line 32 in compiler/zrc_parser/src/parser.rs (GitHub Actions / fmt): Diff in /home/runner/work/zrc/zrc/compiler/zrc_parser/src/parser.rs
/// An error returned from one of the Zirco parsing functions, like
/// [`parse_program`].
/// Representation of a parser error that may have returned a partial AST.
///
/// In the Zirco parser, we are capable of recovering from some errors. This means that the parser
/// may still return an AST in some cases even if a syntax error was encountered. This is useful
/// for IDEs and other tools that want to provide syntax highlighting or other features that
/// require at least a partial AST.
///
/// When you receive the [`ZircoParserError::Recoverable`] variant, you can access the
/// [`partial`](ZircoParserError::Recoverable::partial) field to obtain the partial AST. This AST
/// is "partial" in the way that it may contain some `Error` tokens, such as [`ExprKind::Error`](super::ast::expr::ExprKind::Error).
///
/// In the case of the [`ZircoParserError::Fatal`] variant, you cannot access a partial AST as none
/// was able to be recovered, and your application must just handle the corresponding LALRPOP [`ParseError`].
///
/// This type is often found wrapped in a [`Result`] type, where the [`Ok`] variant contains the
/// full AST and the [`Err`] variant contains a [`ZircoParserError`].
#[derive(Debug, PartialEq, Eq)]
pub enum ZircoParserError<T> {
/// An error we were able to recover from
/// The parser encountered an error, but was still able to produce a partial AST. This AST may
/// contain some `Error` tokens, such as [`ExprKind::Error`](super::ast::expr::ExprKind::Error).
Recoverable {
/// The list of recoverable errors
/// The list of [`ErrorRecovery`] instances corresponding with the errors that were encountered
/// during parsing.
errors: Vec<ErrorRecovery<usize, lexer::Tok, lexer::LexicalError>>,
/// The partial AST
/// The partial AST that was produced by the parser. This AST may contain some `Error` tokens,
/// such as [`ExprKind::Error`](super::ast::expr::ExprKind::Error).
partial: T,
},
/// An error that stopped the parser.
/// The parser encountered an error, and was unable to produce a partial AST.

Check failure on line 61 in compiler/zrc_parser/src/parser.rs (GitHub Actions / fmt): Diff in /home/runner/work/zrc/zrc/compiler/zrc_parser/src/parser.rs
Fatal(ParseError<usize, lexer::Tok, lexer::LexicalError>),
}
impl<T: Debug> Error for ZircoParserError<T> {}
@@ -47,41 +77,60 @@ impl<T: Debug> Display for ZircoParserError<T> {
}
}

Check failure on line 79 in compiler/zrc_parser/src/parser.rs (GitHub Actions / fmt): Diff in /home/runner/work/zrc/zrc/compiler/zrc_parser/src/parser.rs
/// More generic macro for [`parse_expr`] and [`parse_stmt`]
macro_rules! parse_internal {
($parser: ty, $input: expr) => {{
let mut errors = Vec::new();
let result = <$parser>::new().parse(&mut errors, lexer::ZircoLexer::new($input));
match result {
Err(e) => Err(ZircoParserError::Fatal(e)),
Ok(v) if errors.is_empty() => Ok(v),
Ok(v) => Err(ZircoParserError::Recoverable { errors, partial: v }),
}
}};
}

/// Parse a program with the Zirco parser.
/// Parses a Zirco program, yielding a list of [`Declaration`]s.
///
/// This function runs an **entire program** through the Zirco parser and returns either a complete
/// [AST](super::ast) consisting of root [`Declaration`] nodes, or a list of [`ZircoParserError`]s
/// in the case of a syntax error.
///
/// # Example
/// Obtaining the AST of a program:
/// ```
/// use zrc_parser::parser::parse_program;
/// let ast = parse_program("fn main() {}");
/// ```

Check failure on line 91 in compiler/zrc_parser/src/parser.rs (GitHub Actions / fmt): Diff in /home/runner/work/zrc/zrc/compiler/zrc_parser/src/parser.rs
///
/// # Errors
/// Returns a valid [`ZircoParserError`] if parsing fails.
/// This function returns [`Err`] with a [`ZircoParserError`] if any error was encountered while
/// parsing the input program.
pub fn parse_program(
input: &str,
) -> Result<Vec<Spanned<Declaration>>, ZircoParserError<Vec<Spanned<Declaration>>>> {
parse_internal!(internal_parser::ProgramParser, input)
let mut errors = Vec::new();
let result =
internal_parser::ProgramParser::new().parse(&mut errors, lexer::ZircoLexer::new(input));

match result {
Err(e) => Err(ZircoParserError::Fatal(e)),
Ok(v) if errors.is_empty() => Ok(v),
Ok(v) => Err(ZircoParserError::Recoverable { errors, partial: v }),
}
}

Check failure on line 109 in compiler/zrc_parser/src/parser.rs (GitHub Actions / fmt): Diff in /home/runner/work/zrc/zrc/compiler/zrc_parser/src/parser.rs
/// Parse a string as an expression with the Zirco parser.
/// Parses a singular Zirco expression, yielding an AST [`Expr`] node.
///
/// This function only parses a single Zirco [expression](Expr), and not an entire program. Unless
/// you are trying to do some special integration with partial programs, you probably want to use the
/// [`parse_program`] function instead.
///
/// # Example
/// Obtaining the AST of an expression:
/// ```
/// use zrc_parser::parser::parse_expr;
/// let ast = parse_expr("1 + 2");
/// ```

Check failure on line 121 in compiler/zrc_parser/src/parser.rs (GitHub Actions / fmt): Diff in /home/runner/work/zrc/zrc/compiler/zrc_parser/src/parser.rs
///
/// # Errors
/// Returns a valid [`ZircoParserError`] if parsing fails.
/// This function returns [`Err`] with a [`ZircoParserError`] if any error was encountered while
/// parsing the input expression.
pub fn parse_expr(input: &str) -> Result<Expr, ZircoParserError<Expr>> {
parse_internal!(internal_parser::ExprParser, input)
}
let mut errors = Vec::new();
let result =
internal_parser::ExprParser::new().parse(&mut errors, lexer::ZircoLexer::new(input));

/// Parse a string as a statement with the Zirco parser.
///
/// # Errors
/// Returns a valid [`ZircoParserError`] if parsing fails.
pub fn parse_stmt(input: &str) -> Result<Stmt, ZircoParserError<Stmt>> {
parse_internal!(internal_parser::StmtParser, input)
match result {
Err(e) => Err(ZircoParserError::Fatal(e)),
Ok(v) if errors.is_empty() => Ok(v),
Ok(v) => Err(ZircoParserError::Recoverable { errors, partial: v }),
}
}
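
For readers consuming these functions, the three possible outcomes (clean AST, partial AST with recoverable errors, fatal error) can be sketched in isolation. The block below is a simplified stand-in, not the crate's real types: it uses `String` in place of the LALRPOP `ErrorRecovery` and `ParseError` types, and the names `finish` and `best_effort` are hypothetical. `finish` reproduces the `match` shown in `parse_program` and `parse_expr`, and `best_effort` shows how a tool such as an IDE might keep a partial AST.

```rust
// Simplified stand-in for the enum in the diff: `String` replaces the
// LALRPOP error types so that the sketch compiles on its own.
#[derive(Debug, PartialEq, Eq)]
enum ZircoParserError<T> {
    Recoverable { errors: Vec<String>, partial: T },
    Fatal(String),
}

/// Hypothetical helper mirroring the `match` in `parse_program`/`parse_expr`:
/// a parse that succeeded but logged recoverable errors is still an `Err`,
/// carrying the partial result alongside those errors.
fn finish<T>(result: Result<T, String>, errors: Vec<String>) -> Result<T, ZircoParserError<T>> {
    match result {
        Err(e) => Err(ZircoParserError::Fatal(e)),
        Ok(v) if errors.is_empty() => Ok(v),
        Ok(v) => Err(ZircoParserError::Recoverable { errors, partial: v }),
    }
}

/// Hypothetical consumer: returns whatever AST is available together with
/// the number of errors encountered (a fatal error counts as one).
fn best_effort<T>(outcome: Result<T, ZircoParserError<T>>) -> (Option<T>, usize) {
    match outcome {
        Ok(ast) => (Some(ast), 0),
        Err(ZircoParserError::Recoverable { errors, partial }) => (Some(partial), errors.len()),
        Err(ZircoParserError::Fatal(_)) => (None, 1),
    }
}
```

A compiler driver would typically treat both `Err` variants as failures, while an editor integration could use the partial AST from `Recoverable` for highlighting.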
