Developer Guide

Yorlang is an interpreted programming language built on NodeJS.

When you run yorl <file>.yl in your terminal, the yorl command is handled by the cli.js file. It uses commander to start the Yorlang interpreter, which then performs the following steps:

InputStream

The file path is passed into the InputStream instance, which reads the file's contents as a string.

An InputStream contains methods such as next, peek and isEndOfFile.

- next()

It maintains a cursor position that starts at the beginning of the file. Every time next() is called, the character at the cursor is returned and the cursor moves one character forward.


How Yorlang's InputStream's next() works

This repl shows how InputStream's next() works.

- peek()

Unlike next(), the peek() function returns the character at the cursor without moving the cursor.

- isEndOfFile()

This returns a boolean value indicating whether the cursor has reached the end of the input file.


How Yorlang's InputStream's peek() and isEndOfFile() work

This repl shows how InputStream's peek() and isEndOfFile() work.
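The three methods above can be sketched in a few lines. This is a minimal illustration rather than Yorlang's actual source: the class name SketchInputStream is hypothetical, and it assumes the program text is passed in directly instead of being read from a file path.

```javascript
// Minimal sketch of a character stream; not Yorlang's actual source.
// Assumption: takes the program text directly instead of a file path.
class SketchInputStream {
    constructor(text) {
        this.text = text;
        this.position = 0; // cursor starts at the beginning of the input
    }

    // Return the character under the cursor, then move the cursor forward.
    next() {
        return this.text.charAt(this.position++);
    }

    // Return the character under the cursor without moving it.
    peek() {
        return this.text.charAt(this.position);
    }

    // True once the cursor has moved past the last character.
    isEndOfFile() {
        return this.peek() === "";
    }
}

const stream = new SketchInputStream("ab");
const seen = [];
seen.push(stream.peek()); // "a", cursor unchanged
seen.push(stream.next()); // "a", cursor now on "b"
seen.push(stream.next()); // "b", cursor now past the end
console.log(seen, stream.isEndOfFile());
```

Calling peek() before next() yields the same character twice, which is what lets a lexer decide how to handle a character before consuming it.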

Lexical Analyser

The Lexer instance accepts an InputStream as an argument, reads it character by character, and recognises tokens such as identifiers, operators, keywords, and numbers.

A Lexer contains methods such as:

- readWhile(predicate)

This reads characters from the InputStream until it finds one that does not satisfy the condition given by the predicate function.

readWhile((c) => c != ";") will read characters until either one of them matches ";", or the end of the file is reached.
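A rough sketch of this behaviour follows. It assumes a stream object with next, peek and isEndOfFile as described earlier; makeStream is a hypothetical stand-in, and readWhile is written here as a free function taking the stream explicitly, whereas in Yorlang it is a Lexer method.

```javascript
// Hypothetical stand-in for an InputStream, built from a plain string.
function makeStream(text) {
    let position = 0;
    return {
        next: () => text.charAt(position++),
        peek: () => text.charAt(position),
        isEndOfFile: () => position >= text.length,
    };
}

// Collect characters while the predicate holds and input remains.
function readWhile(stream, predicate) {
    let result = "";
    while (!stream.isEndOfFile() && predicate(stream.peek())) {
        result += stream.next();
    }
    return result;
}

const input = makeStream("jeki a = 1; jeki b = 2;");
const firstStatement = readWhile(input, (c) => c !== ";");
console.log(firstStatement); // jeki a = 1
console.log(input.peek());   // ;
```

Note that the character that fails the predicate is left in the stream, so the caller still has to consume it.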

- readNext()

This skips whitespace and comments, and then, depending on the character it reads, calls and returns the value of other methods such as:

- readString()

This is called by readNext() when a quote character is found. It will attempt to read the characters until a closing quote is found. It returns a token object like:

{ "type": "string", "value": "example string" }

- readNumber()

This is called by readNext() when a digit is found. It will attempt to read a valid number (including a decimal point), until no more digits can be found. It returns a token object like:

{ "type": "number", "value": 0.123 }

- readIdentifier()

This is called by readNext() when a character is found that matches the constants.REGEX.IDENTIFIER regex pattern. It keeps reading subsequent characters that match the pattern, until either a character does not match or the end of the file is reached. It returns a token object like:

{ "type": "keyword", "value": "jeki" }

or

{ "type": "variable", "value": "foo" }

- isPunctuation()

This determines whether a character read is a punctuation character. The list of punctuation characters can be found in constants.LIST.PUNCTUATIONS.

When readNext() detects a punctuation, it returns a token object like:

{ "type": "punctuation", "value": "{" }

- isOperator()

This determines whether a character read is an operator. The list of operators can be found in constants.LIST.OPERATORS.

When readNext() detects an operator, it returns a token object like:

{ "type": "operator", "value": "+" }
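To tie the pieces together, here is a compact sketch of how a readNext() dispatch could produce the token shapes shown above. Everything named here is an assumption made for illustration: tokenize is a hypothetical wrapper, and KEYWORDS, PUNCTUATIONS, OPERATORS and IDENTIFIER are small stand-ins for the real values in Yorlang's constants. Comment skipping is left out.

```javascript
// Stand-in lists; the real ones live in Yorlang's constants module.
const KEYWORDS = ["jeki", "sope"];
const PUNCTUATIONS = ["{", "}", "(", ")", ";", ","];
const OPERATORS = ["+", "-", "*", "/", "="];
const IDENTIFIER = /[a-z_]/i; // assumed pattern, not constants.REGEX.IDENTIFIER

function tokenize(text) {
    let position = 0;
    const peek = () => text.charAt(position);
    const next = () => text.charAt(position++);
    const atEnd = () => position >= text.length;
    const readWhile = (predicate) => {
        let result = "";
        while (!atEnd() && predicate(peek())) result += next();
        return result;
    };

    const readString = () => {
        next(); // consume the opening quote
        const value = readWhile((c) => c !== '"');
        next(); // consume the closing quote
        return { type: "string", value };
    };

    const readNumber = () => {
        let hasDot = false;
        const digits = readWhile((c) => {
            if (c === ".") {
                if (hasDot) return false; // a second dot ends the number
                hasDot = true;
                return true;
            }
            return /[0-9]/.test(c);
        });
        return { type: "number", value: parseFloat(digits) };
    };

    const readIdentifier = () => {
        const value = readWhile((c) => IDENTIFIER.test(c));
        return { type: KEYWORDS.includes(value) ? "keyword" : "variable", value };
    };

    // Skip whitespace, then dispatch on the first character of the token.
    const readNext = () => {
        readWhile((c) => /\s/.test(c));
        if (atEnd()) return null;
        const c = peek();
        if (c === '"') return readString();
        if (/[0-9]/.test(c)) return readNumber();
        if (IDENTIFIER.test(c)) return readIdentifier();
        if (PUNCTUATIONS.includes(c)) return { type: "punctuation", value: next() };
        if (OPERATORS.includes(c)) return { type: "operator", value: next() };
        throw new Error(`Unexpected character: ${c}`);
    };

    const tokens = [];
    for (let token = readNext(); token !== null; token = readNext()) {
        tokens.push(token);
    }
    return tokens;
}

console.log(tokenize('jeki name = "Yorlang";'));
```

For the input above, the sketch yields five tokens: a keyword, a variable, an operator, a string and a punctuation token.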

- next()

This returns the token returned by the readNext() function.

- peek()

This returns the current token, while preventing the InputStream from advancing.
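One common way to get this behaviour is to buffer a single token. The sketch below assumes that approach; SketchLexer is a hypothetical class, and its readNext() simply hands out tokens from a fixed list rather than reading an InputStream.

```javascript
// Sketch of one-token lookahead; readNext is faked with a fixed token list.
class SketchLexer {
    constructor(tokens) {
        this.pending = [...tokens]; // stand-in for reading from an InputStream
        this.current = null;        // token buffered by peek()
    }

    // Stand-in for the real readNext(): hands out the next token, or null.
    readNext() {
        return this.pending.length > 0 ? this.pending.shift() : null;
    }

    // Return the buffered token if there is one, otherwise read a new one.
    next() {
        const token = this.current;
        this.current = null;
        return token || this.readNext();
    }

    // Buffer the upcoming token so repeated calls return the same one.
    peek() {
        return this.current || (this.current = this.readNext());
    }
}

const lexer = new SketchLexer([
    { type: "keyword", value: "jeki" },
    { type: "variable", value: "name" },
]);
const peekedTwice = [lexer.peek().value, lexer.peek().value]; // both "jeki"
const consumed = [lexer.next().value, lexer.next().value];    // "jeki", then "name"
console.log(peekedTwice, consumed, lexer.next());
```

Because peek() stores the token it reads, the underlying stream advances at most once no matter how many times peek() is called before next().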


The Output Tokens of Yorlang's Lexer

This repl shows the output tokens of the Lexer.

Parser

The Parser instance accepts a Lexer as an argument, and uses recursive descent parsing with backtracking to read each token, handle operator precedence, and build an abstract syntax tree.

To do this, the Parser makes use of Parser Nodes. Each Parser Node contains logic for ensuring grammar expectations for a particular Yorlang construct are met.

A Yorlang construct can be either a keyword such as jeki and sope, or a node literal such as an array or bracket expression.

- getNode()

Each Parser Node instance has a getNode() method, which contains logic for ensuring the keyword's or literal's grammar is correct.

Here's an example of the grammar for jeki:

jeki<whitespace><identifier>=<expression>;

such as

jeki name = "Yorlang";

For the code above, the getNode() function returns:

{
  "operation": "=",
  "left": "name",
  "right": {
    "value": "Yorlang",
    "left": null,
    "right": null,
    "operation": null
  }
}

Notice how the right and left properties indicate a tree structure? This is an Abstract Syntax Tree.
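As a rough illustration of how a parser node can enforce the jeki grammar, the sketch below consumes tokens in the expected order and fails on the first mismatch. The names getJekiNode, expectToken and leaf are hypothetical, the tokens are supplied as a plain array instead of coming from a Lexer, and the right-hand side is limited to a single string literal rather than a full expression.

```javascript
// Take the next token and check it against the expected type (and value).
function expectToken(tokens, type, value) {
    const token = tokens.shift();
    const matches = token && token.type === type &&
        (value === undefined || token.value === value);
    if (!matches) {
        throw new Error(`Expected ${type}${value === undefined ? "" : " " + value}`);
    }
    return token;
}

// Wrap a literal token as a leaf: no children and no operation.
function leaf(token) {
    return { value: token.value, left: null, right: null, operation: null };
}

// Hypothetical parser node for: jeki <identifier> = <string literal>;
function getJekiNode(tokens) {
    expectToken(tokens, "keyword", "jeki");
    const name = expectToken(tokens, "variable").value;
    expectToken(tokens, "operator", "=");
    const right = leaf(expectToken(tokens, "string")); // string literals only in this sketch
    expectToken(tokens, "punctuation", ";");
    return { operation: "=", left: name, right };
}

const jekiNode = getJekiNode([
    { type: "keyword", value: "jeki" },
    { type: "variable", value: "name" },
    { type: "operator", value: "=" },
    { type: "string", value: "Yorlang" },
    { type: "punctuation", value: ";" },
]);
console.log(JSON.stringify(jekiNode, null, 2));
```

Leaving out the closing semicolon, or any other expected token, makes the sketch throw instead of returning a node.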


The AST Output of Yorlang's Parser

This repl shows the Abstract Syntax Tree outputs of the Parser.


See list of other available parser nodes.

The Parser instance contains methods such as:

- parseWhile(list, fn)

This function takes two arguments: a list of operators, and a function that itself wraps another parseWhile call, which creates a recursive flow.

It first evaluates the function argument, and then performs its own check: whether the value of the next token from the lexer can be found in its operator list.

If found, it'll return an object with properties left, operation, right and value.
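The nesting described above is what gives operators their precedence. The sketch below shows one way this could work, under several assumptions: makeParser, parseLeaf and parseArithmetic are hypothetical names, tokens come from a plain array, only two precedence levels are wired up, and the operator check is written as a loop, which may differ from how Yorlang's parseWhile is actually implemented.

```javascript
function makeParser(tokens) {
    let index = 0;
    const peek = () => tokens[index] || null;
    const next = () => tokens[index++] || null;

    // Leaf: a literal token with no children and no operation.
    const parseLeaf = () => {
        const token = next();
        if (!token) throw new Error("Unexpected end of input");
        return { left: null, operation: null, right: null, value: token.value };
    };

    // Parse the higher-precedence level first, then fold in any operators
    // from this level's list, building a left-leaning tree.
    const parseWhile = (operators, parseHigher) => {
        let node = parseHigher();
        while (peek() && peek().type === "operator" && operators.includes(peek().value)) {
            const operation = next().value;
            node = { left: node, operation, right: parseHigher(), value: null };
        }
        return node;
    };

    // Lowest precedence outermost, so * and / bind tighter than + and -.
    const parseArithmetic = () =>
        parseWhile(["+", "-"], () => parseWhile(["*", "/"], parseLeaf));

    return { parseArithmetic };
}

// Tokens for: 1 + 2 * 3
const tree = makeParser([
    { type: "number", value: 1 },
    { type: "operator", value: "+" },
    { type: "number", value: 2 },
    { type: "operator", value: "*" },
    { type: "number", value: 3 },
]).parseArithmetic();
console.log(JSON.stringify(tree, null, 2));
```

In the resulting tree, the root operation is "+" and the multiplication sits in its right branch, so 2 * 3 is grouped before the addition.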

- parseExpression