THE BNF META-PARSER

Behind this odd name is something really cool. Alphyn can parse a BNF-like grammar and generate a parser from it. The advantage is that alphyn generates all the lexer and parser machinery itself, leaving you only the attributes and the BNF string to write. That's all.

Here is an example (the math expression evaluator, as always):

#include <alphyn.hpp>
#include <bnf.hpp>

The BNF meta-parser is not included by default (it is an optional part of alphyn), so if you plan to use it, you have to include it explicitly.

You then have to declare a structure or a class that will hold the definition of the grammar and the attributes:

struct math_eval_bnf
{

This is not mandatory, but we will need it for the mathematical expression evaluator:

  using return_type = float;
  using token_type = neam::ct::alphyn::bnf::token_type; // the token type generated by the bnf meta-parser

Then come the names of the different attributes (each must be a constexpr string). They will be used in the BNF grammar to refer to the attributes.

  static constexpr neam::string_t add_str = "add";
  static constexpr neam::string_t sub_str = "sub";
  static constexpr neam::string_t mul_str = "mul";
  static constexpr neam::string_t div_str = "div";
  static constexpr neam::string_t atof_str = "atof";
  static constexpr neam::string_t forward0_str = "forward:0";
  static constexpr neam::string_t forward1_str = "forward:1";

As we are using function attributes, we must declare and implement the functions somewhere:

  static constexpr return_type attr_add(return_type n1, const token_type &, return_type n2) { return n1 + n2; }
  static constexpr return_type attr_sub(return_type n1, const token_type &, return_type n2) { return n1 - n2; }
  static constexpr return_type attr_mul(return_type n1, const token_type &, return_type n2) { return n1 * n2; }
  static constexpr return_type attr_div(return_type n1, const token_type &, return_type n2) { return n1 / n2; }
  static constexpr return_type attr_atof(const token_type &tok)
  {
    // a very simple atof
    float value = 0;
    size_t i = tok.start_index;
    for (; i < tok.end_index && tok.s[i] != '.'; ++i) // check the bound before reading
      value = value * 10 + (tok.s[i] - '0');
    if (i < tok.end_index && tok.s[i] == '.')
    {
      float cpow = 10;
      for (size_t j = i + 1; j < tok.end_index; ++j)
      {
        value += float(tok.s[j] - '0') / cpow;
        cpow *= 10;
      }
    }

    return value;
  }
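To try the conversion on its own, here is a minimal standalone sketch of the same digit-by-digit logic. It assumes a hypothetical simple_token struct that models only the three members used above (s, start_index, end_index); it is not alphyn's real token type. This variant checks the index against end_index before reading each character.

```cpp
#include <cstddef>

// Hypothetical stand-in for the token type: only the three members used by
// attr_atof in this document are modelled. Not alphyn's real token type.
struct simple_token
{
  const char *s;
  std::size_t start_index;
  std::size_t end_index; // one past the last character of the token
};

// Same digit-by-digit conversion as attr_atof, with the bound checked
// before each character is read.
constexpr float simple_atof(const simple_token &tok)
{
  float value = 0;
  std::size_t i = tok.start_index;
  for (; i < tok.end_index && tok.s[i] != '.'; ++i)
    value = value * 10 + float(tok.s[i] - '0');
  if (i < tok.end_index && tok.s[i] == '.')
  {
    float cpow = 10;
    for (std::size_t j = i + 1; j < tok.end_index; ++j)
    {
      value += float(tok.s[j] - '0') / cpow;
      cpow *= 10;
    }
  }
  return value;
}

// Compile-time checks on values that are exactly representable as float.
static_assert(simple_atof({"42", 0, 2}) == 42.0f, "integer token");
static_assert(simple_atof({"12.5", 0, 4}) == 12.5f, "fractional token");
static_assert(simple_atof({"x7.y", 1, 3}) == 7.0f, "token inside a larger buffer");
```

Because the function is constexpr, the checks run at compile time, which is the same setting in which the attribute is meant to be usable.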

Next is a construct that allows alphyn to link a name to an attribute type. The attribute type accepts everything a normal grammar would accept, with the exception that neam::ct::alphyn::forward_value_attribute<X> will always forward an int of value 0.

  using attributes = neam::ct::alphyn::bnf::attribute_db<
    neam::ct::alphyn::bnf::attribute_db_entry<add_str, ALPHYN_ATTRIBUTE(&attr_add)>,
    neam::ct::alphyn::bnf::attribute_db_entry<sub_str, ALPHYN_ATTRIBUTE(&attr_sub)>,
    neam::ct::alphyn::bnf::attribute_db_entry<mul_str, ALPHYN_ATTRIBUTE(&attr_mul)>,
    neam::ct::alphyn::bnf::attribute_db_entry<div_str, ALPHYN_ATTRIBUTE(&attr_div)>,
    neam::ct::alphyn::bnf::attribute_db_entry<atof_str, ALPHYN_ATTRIBUTE(&attr_atof)>,
    neam::ct::alphyn::bnf::attribute_db_entry<forward0_str, neam::ct::alphyn::forward_attribute<0>>,
    neam::ct::alphyn::bnf::attribute_db_entry<forward1_str, neam::ct::alphyn::forward_attribute<1>>
  >;
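The idea of linking a name to an attribute can be pictured with a much simpler construct. The sketch below is not how alphyn implements attribute_db (which works on types at compile time); it is a plain constexpr lookup table, and every name in it (named_attribute, attribute_table, find_attribute, the op_* functions) is hypothetical and chosen for illustration only.

```cpp
// Hypothetical illustration of a name -> function mapping.
// This is NOT alphyn's attribute_db, only a simplified analogy.
using binary_fn = float (*)(float, float);

constexpr float op_add(float a, float b) { return a + b; }
constexpr float op_sub(float a, float b) { return a - b; }
constexpr float op_mul(float a, float b) { return a * b; }
constexpr float op_div(float a, float b) { return a / b; }

struct named_attribute
{
  const char *name;
  binary_fn fn;
};

// Compare two NUL-terminated strings character by character.
constexpr bool same_name(const char *a, const char *b)
{
  while (*a != '\0' && *a == *b)
  {
    ++a;
    ++b;
  }
  return *a == *b;
}

constexpr named_attribute attribute_table[] = {
  {"add", &op_add},
  {"sub", &op_sub},
  {"mul", &op_mul},
  {"div", &op_div},
};

// Return the function registered under `name`, or nullptr if there is none.
constexpr binary_fn find_attribute(const char *name)
{
  for (const named_attribute &entry : attribute_table)
    if (same_name(entry.name, name))
      return entry.fn;
  return nullptr;
}

static_assert(find_attribute("mul")(6.0f, 7.0f) == 42.0f, "mul is found and callable");
static_assert(find_attribute("unknown") == nullptr, "unknown names are not found");
```

In the real attribute_db, the names inside the square brackets of the grammar play the role of the lookup key.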

And now the BNF-like grammar. It has been extended to support attributes and regexps.

  • A regexp is defined with regexp: followed by a terminal (a string between ' or ").
  • A terminal is delimited by either ' or "; everything else is a non-terminal.
  • A production is ended by ;
  • The grammar has the same limitations as alphyn itself: the start production must have only one alternative, and each rule must have an attribute.
  • The initial rule is the first one, whatever its name is.
  static constexpr neam::string_t bnf_grammar = R"(
    start ::= sum regexp:'$'              [forward:0];

    sum   ::= prod                        [forward:0]
            | prod '+' sum                [add]
            | prod '-' sum                [sub];

    prod  ::= val                         [forward:0]
            | val '*' prod                [mul]
            | val '/' prod                [div];

    val   ::= regexp:'[0-9]+(\.[0-9]*)?'  [atof]
            | '(' sum ')'                 [forward:1];
  )";
};
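To make it easier to see what this grammar computes, here is a hand-written recursive-descent evaluator that mirrors its rules one function per non-terminal. This is a swapped-in technique for illustration, not the parser alphyn generates, and it does not use alphyn at all. The cursor struct and all function names are hypothetical, and the sketch assumes well-formed input without whitespace and does no error handling.

```cpp
#include <cstddef>

// Hypothetical cursor over the input text; not part of alphyn.
struct cursor
{
  const char *s;
  std::size_t pos;
};

constexpr bool is_digit(char ch) { return ch >= '0' && ch <= '9'; }

constexpr float parse_sum(cursor &c);

// val ::= regexp:'[0-9]+(\.[0-9]*)?' [atof] | '(' sum ')' [forward:1]
constexpr float parse_val(cursor &c)
{
  if (c.s[c.pos] == '(')
  {
    ++c.pos;                          // consume '('
    const float inner = parse_sum(c);
    ++c.pos;                          // consume ')' (assumed to be present)
    return inner;
  }
  float value = 0;
  for (; is_digit(c.s[c.pos]); ++c.pos)
    value = value * 10 + float(c.s[c.pos] - '0');
  if (c.s[c.pos] == '.')
  {
    ++c.pos;
    float cpow = 10;
    for (; is_digit(c.s[c.pos]); ++c.pos)
    {
      value += float(c.s[c.pos] - '0') / cpow;
      cpow *= 10;
    }
  }
  return value;
}

// prod ::= val [forward:0] | val '*' prod [mul] | val '/' prod [div]
constexpr float parse_prod(cursor &c)
{
  const float lhs = parse_val(c);
  if (c.s[c.pos] == '*') { ++c.pos; return lhs * parse_prod(c); }
  if (c.s[c.pos] == '/') { ++c.pos; return lhs / parse_prod(c); }
  return lhs;
}

// sum ::= prod [forward:0] | prod '+' sum [add] | prod '-' sum [sub]
constexpr float parse_sum(cursor &c)
{
  const float lhs = parse_prod(c);
  if (c.s[c.pos] == '+') { ++c.pos; return lhs + parse_sum(c); }
  if (c.s[c.pos] == '-') { ++c.pos; return lhs - parse_sum(c); }
  return lhs;
}

// start ::= sum regexp:'$' [forward:0] (the end-of-input check is omitted)
constexpr float evaluate(const char *text)
{
  cursor c{text, 0};
  return parse_sum(c);
}

static_assert(evaluate("1+2*3") == 7.0f, "prod binds tighter than sum");
static_assert(evaluate("(1+2)*3") == 9.0f, "parentheses forward the inner sum");
static_assert(evaluate("8-2-1") == 7.0f, "right-recursive rule: 8-(2-1)");
```

The last check shows a consequence of the rules being right-recursive: the right operand of '-' and '/' is itself a whole sum or prod, so chained operators group to the right.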

To get your parser from the class above, write:

using math_eval = neam::ct::alphyn::bnf::generate_parser<math_eval_bnf>::parser;

and

float result = math_eval::parse_string<float>("1 + 1");

or (if return_type is not float)

float result = math_eval::ct_parse_string<my_string>;

For more details about attributes and the parser see here

NOTE

Because the BNF parser is a meta-parser (a parser that generates a parser), and because of the way alphyn is implemented, any error message in your code will probably be buried in a trace containing HUGE types (the record is a single type of 40 MB). As you mostly only need to look at the error below the trace (and can ignore the trace itself), you may want to truncate the output. So I put this here, just in case.

make |& cut -c 1-250

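It truncates the output to at most 250 characters per line (|& is the bash and zsh shorthand for 2>&1 |, so compiler errors written to stderr are truncated too).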