Rapid prototyping parser generator.
The parser generator provides the following features.
- Generates the lexical and syntactic analyzers of a parser from an input file.
- Lexical symbols are described by regular expressions.
- The language grammar is described by an SLR(1) grammar.
- The generated parser can be tested immediately on a source string.
- Semantic rules of the language can be tested with Lua scripts that are bound to each rule reduction.
- A debugged and fine-tuned parser can be generated as C/C++ code.
Yapgen was motivated by the need for fast parser generation and testing.
Example rule files used to generate parsers are located in the directory:
build/rules
A few simple examples demonstrating yapgen's capabilities follow.
Examples of basic regular expressions.
oct_int {'0'.<07>*}
dec_int {<19>.d*}
hex_int {'0'.[xX].(<09>+<af>+<AF>).(<09>+<af>+<AF>)*}
if {"if"}
else {"else"}
equal {'='}
plus_equal {"+="}
minus_equal {"-="}
comment_sl {"//".!'\n'*.'\n'}
comment_ml {"/*".(!'*'+('*'.!'/'))*."*/"}
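To make the notation above easier to read, here is a minimal Python sketch that restates four of the terminals as Python `re` patterns. The translation is an assumption inferred from the examples, not taken from yapgen's documentation: `.` is read as concatenation, `+` as alternation, `*` as repetition, `<07>` as the character range 0 to 7, `d` as any digit, and `!'\n'` as any character except a newline. The `classify` helper is hypothetical and is not part of yapgen.

```python
import re

# Assumed Python equivalents of the yapgen terminals shown above.
TERMINALS = {
    "oct_int": r"0[0-7]*",            # '0'.<07>*
    "dec_int": r"[1-9][0-9]*",        # <19>.d*
    "hex_int": r"0[xX][0-9a-fA-F]+",  # '0'.[xX].(hex digit).(hex digit)*
    "comment_sl": r"//[^\n]*\n",      # "//".!'\n'*.'\n'
}


def classify(text):
    """Return the names of all terminals whose pattern matches the whole text."""
    return [name for name, pattern in TERMINALS.items()
            if re.fullmatch(pattern, text)]


print(classify("017"))
print(classify("42"))
print(classify("0x1F"))
print(classify("// note\n"))
```

Under these assumptions, a string such as `09` matches none of the integer terminals, because an octal literal only allows the digits 0 to 7 and a decimal literal must not start with 0.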
Regular expressions can be used to recognize binary data.
PACKET_ADDRESS {"/?".(<09>+<az>+<AZ>)*.'!'."\x0d\x0a"}
PACKET_IDENTIFY {'/'.<AZ>.<AZ>.(<AZ>+<az>).<09>.(|/!\x0d|)*."\x0d\x0a"}
PACKET_ACK_COMMAND {'\x06'.<09>.(<09>+<az>).<09>."\x0d\x0a"}
PACKET_ACK {'\x06'}
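The same idea can be sketched in Python with byte-string patterns. This is only an illustration of recognizing binary frames with regular expressions. The patterns are an assumed translation of three of the terminals above, and the `recognize` function is hypothetical and not part of yapgen.

```python
import re

# Assumed byte-pattern equivalents of three of the packet terminals above.
PACKETS = [
    ("PACKET_ADDRESS", rb"/\?[0-9a-zA-Z]*!\r\n"),
    ("PACKET_ACK_COMMAND", rb"\x06[0-9][0-9a-z][0-9]\r\n"),
    ("PACKET_ACK", rb"\x06"),
]


def recognize(data):
    """Return the name of the first packet pattern matching all of data, or None."""
    for name, pattern in PACKETS:
        if re.fullmatch(pattern, data):
            return name
    return None


print(recognize(b"/?12345678!\r\n"))
print(recognize(b"\x06051\r\n"))
print(recognize(b"\x06"))
```

Matching the whole buffer with `re.fullmatch` keeps a bare `\x06` byte distinct from the longer acknowledge-command frame that starts with the same byte.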
Example of basic grammar rules. Identifiers enclosed in angle brackets, e.g. <command>, identify nonterminal symbols of the grammar, and identifiers without brackets, e.g. if, refer to terminal symbols described by regular expressions.
<command> -> if <condition> <if_else> ->> {}
<if_else> -> <command> ->> {}
<if_else> -> <command> else <command> ->> {}
<command> -> <while_begin> <condition> <command> ->> {}
<while_begin> -> while ->> {}
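The structure of a rule line can be illustrated with a short Python sketch that splits a rule into its head, its body symbols, and its semantic code, and classifies body symbols by the angle-bracket convention described above. The `parse_rule` helper is hypothetical. It handles only single-line rules of the form shown here and says nothing about how yapgen itself reads rule files.

```python
def is_nonterminal(symbol):
    """A symbol enclosed in angle brackets is treated as a nonterminal."""
    return symbol.startswith("<") and symbol.endswith(">")


def parse_rule(line):
    """Split 'head -> body ->> {code}' into its parts (illustrative only)."""
    left, _, code = line.partition("->>")   # separate the semantic code first
    head, _, body = left.partition("->")    # then split head from body
    symbols = body.split()
    return {
        "head": head.strip(),
        "nonterminals": [s for s in symbols if is_nonterminal(s)],
        "terminals": [s for s in symbols if not is_nonterminal(s)],
        "code": code.strip(),
    }


rule = parse_rule("<if_else> -> <command> else <command> ->> {}")
print(rule["head"], rule["nonterminals"], rule["terminals"], rule["code"])
```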
Grammar rules can have semantic code bound to them.
<F> -> <F> double_equal <E> ->>
{
if gen_parse_tree == 1 then
this_idx = node_idx;
node_idx = node_idx + 1;
print(" node_"..this_idx.." [label = \"<exp> == <exp>\"]");
print(" node_"..this_idx.." -> node_"..table.remove(node_stack).."");
print(" node_"..this_idx.." -> node_"..table.remove(node_stack).."");
table.insert(node_stack,this_idx);
else
print(table.concat(tabs,"").."operator binary double_equal");
end
}
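The parse-tree branch of the Lua action above follows a common pattern: each reduction allocates a new node index, pops its operands from a node stack, emits Graphviz edges to them, and pushes itself back. The Python sketch below restates that pattern. The `TreeBuilder` class and its `leaf` method are hypothetical names introduced for illustration, and the output is collected in a list instead of being printed.

```python
class TreeBuilder:
    """Collects Graphviz node and edge lines while reductions are applied."""

    def __init__(self):
        self.node_idx = 0      # next free node index
        self.node_stack = []   # indices of subtrees not yet attached
        self.lines = []        # emitted Graphviz statements

    def _new_node(self, label):
        idx = self.node_idx
        self.node_idx += 1
        self.lines.append(f'node_{idx} [label = "{label}"]')
        return idx

    def leaf(self, label):
        """Hypothetical helper: create an operand node and push it."""
        self.node_stack.append(self._new_node(label))

    def binary(self, label):
        """Mirror of the Lua action: pop two operands, link them, push self."""
        idx = self._new_node(label)
        self.lines.append(f"node_{idx} -> node_{self.node_stack.pop()}")
        self.lines.append(f"node_{idx} -> node_{self.node_stack.pop()}")
        self.node_stack.append(idx)


builder = TreeBuilder()
builder.leaf("a")
builder.leaf("b")
builder.binary("<exp> == <exp>")
print("\n".join(builder.lines))
```

As in the Lua code, the right operand is popped first, so its edge is emitted before the edge to the left operand.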
A complete parser rules file follows.
init_code: {s = {\};}
terminals:
oct_int_const {'0'.<07>*}
dec_int_const {<19>.d*}
hex_int_const {'0'.[xX].(<09>+<af>+<AF>).(<09>+<af>+<AF>)*}
lr_br {'('}
rr_br {')'}
plus {'+'}
minus {'-'}
asterisk {'*'}
slash {'/'}
percent {'%'}
_SKIP_ {w.w*}
_END_ {'\0'}
nonterminals:
<start> <exp> <C> <B> <A>
rules:
<start> -> <exp> _END_ ->> {}
<exp> -> <C> ->> {print("result: "..s[#s]);}
<C> -> <C> plus <B> ->> {s[#s-1] = s[#s-1] + table.remove(s);}
<C> -> <C> minus <B> ->> {s[#s-1] = s[#s-1] - table.remove(s);}
<C> -> <B> ->> {}
<B> -> <B> asterisk <A> ->> {s[#s-1] = s[#s-1] * table.remove(s);}
<B> -> <B> slash <A> ->> {s[#s-1] = s[#s-1] / table.remove(s);}
<B> -> <B> percent <A> ->> {s[#s-1] = s[#s-1] % table.remove(s);}
<B> -> <A> ->> {}
<A> -> lr_br <C> rr_br ->> {}
<A> -> oct_int_const ->> {table.insert(s,tonumber(rule_body(0),8));}
<A> -> dec_int_const ->> {table.insert(s,tonumber(rule_body(0),10));}
<A> -> hex_int_const ->> {table.insert(s,tonumber(rule_body(0)));}
The parser generated from the rules above produces the following result for the following input string.
5*(10 + 5) - 0x10
result: 59
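The result can be traced by hand with a small Python sketch of the value stack `s` used in the semantic actions. The reduction sequence below is written out manually for this input, in the order a bottom-up parser would perform the value-affecting reductions. It is not output produced by yapgen, and `run_reductions` is a hypothetical helper.

```python
import operator

# Binary operators named after the terminals in the rules file.
OPS = {
    "plus": operator.add,
    "minus": operator.sub,
    "asterisk": operator.mul,
    "slash": operator.truediv,
    "percent": operator.mod,
}

# Number bases for the three integer terminals.
BASES = {"oct_int_const": 8, "dec_int_const": 10, "hex_int_const": 16}


def run_reductions(reductions):
    """Replay reductions on a value stack and return the final value."""
    s = []
    for kind, text in reductions:
        if kind in BASES:
            s.append(int(text, BASES[kind]))  # <A> -> *_int_const
        else:
            right = s.pop()                   # like table.remove(s)
            s[-1] = OPS[kind](s[-1], right)   # like s[#s-1] = s[#s-1] op right
    return s[-1]


# Hand-written reduction order for: 5*(10 + 5) - 0x10
reductions = [
    ("dec_int_const", "5"),
    ("dec_int_const", "10"),
    ("dec_int_const", "5"),
    ("plus", None),
    ("asterisk", None),
    ("hex_int_const", "0x10"),
    ("minus", None),
]
print("result:", run_reductions(reductions))
```

The stack goes from `[5, 10, 5]` to `[5, 15]` after the addition, to `[75]` after the multiplication, and to `[59]` after subtracting the hexadecimal constant 16.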
Lua version 5.2 or later is required to compile yapgen. The container generator cont is also needed to compile the parser generator.
Enter the build directory build.
cd build
Generate the build files with CMake.
cmake ..
Build yapgen.
make -j$(nproc)
Example parsers are located in the directory build/rules.