-
Notifications
You must be signed in to change notification settings - Fork 1
Implementation
A token is a substring within that has an influence on the expression it occurs within. For example, in the expression x + y * 2
, the tokens are x
, +
, y
, *
, and 2
. Whitespace is not considered a token as it has no impact on the expression. This method of composing tokens gives the writer the liberty to format an expression as they desire. Below is a table of all tokens within arithmexp
.
Token Name | Token Symbols |
---|---|
Binary Operator |
+ / ** % * -
|
Numeric Literal |
[0-9] .
|
Parenthesis Left | ( |
Parenthesis Right | ) |
Unary Operator |
+ -
|
Variable |
[A-Z] [a-z] [0-9] _
|
The Scanner
class takes the role of constructing an array of tokens from a given expression string (in other words, the Scanner performs tokenization). The Scanner
consists of an array of token builders such as a numeric literal token builder (which scans for 1
, 2.34
, etc.), a parenthesis token builder (which scans for (
, and )
), etc. Tokenization is the very first stage of parsing.
The parser spends most of its work in converting a given expression from infix notation (x + y
) to postfix notation (x y +
).
In infix notation, binary operators occur in-between their operands (this method of writing expressions is the one most of us are familiar with). For example, in the infix expression 2 + 3
, the operator +
occurs in between the left operand 2
and the right operand 3
. In postfix notation, binary operators occur after their operands. For example, in the expression 2 3 +
, the operator +
is placed after the left operand 2
and the right operand 3
. This notation is commonly used internally in computing for a few reasons.
In the postfix expression 2 3 add
, the operand 2
is evaluated first, followed by the operand 3
, finally followed by the operator add
which executes the operation of addition. In contrast to the infix notation 2 add 3
, or even the prefix notation add(2, 3)
, the postfix notation accurately describes the order of operations being performed, leaving no room for uncertainty. Furthermore, parentheses, operator precedence, and operator associativity are not required during the evaluation of a postfix expression. This shifts the computational cost of performing such operations to the parser rather than the evaluator. The expression 2 + 3 * 4
in postfix notation with *
taking precedence over +
can be written as 2 3 4 * +
. The expression (2 + 3) * 4
would be 2 3 + 4 *
in postfix notation. Postfix converts a binary tree structure such as ((2 * 3) / (4 + 2)) - 7
to a linear array structure 2 3 * 4 2 + / 7 -
, allowing low-memory and high-performant evaluation during runtime using stack operations (see stack-oriented programming).
Try out arithmexp
on the demo site!
1.1. Installation
2.1. Constant
2.2. Function
2.3. Macro
2.4. Operator
2.5. Variable
2.6. Expression Optimization
2.7. Implementation Internals