style: use 'nonterminal' consistently

* doc/bison.texi: Formatting changes.
* src/gram.h, src/gram.c (nvars): Rename as...
(nnterms): this.
Adjust dependencies.
(section): New.
Use it.
Replace "non terminal" and "non-terminal" by "nonterminal".
akimd · Jun 27, 2020 · 0895858
1 parent 4efb2f7
commit 0895858

Showing 22 changed files with 111 additions and 99 deletions.
diff --git a/README-hacking.md b/README-hacking.md
@@ -37,6 +37,10 @@ Only user visible strings are to be translated: error messages, bits of the
 assert/abort), and all the --trace output which is meant for the maintainers
 only.
 
+## Vocabulary
+Use "nonterminal", not "variable" or "non-terminal" or "non terminal".
+Abbreviated as "nterm".
+
 ## Syntax highlighting
 It's quite nice to be in C++ mode when editing lalr1.cc for instance.
 However tools such as Emacs will be fooled by the fact that braces and

diff --git a/TODO b/TODO
@@ -91,7 +91,7 @@ generates tons of white space in the page, and may contribute to bad page
 breaks.
 
 ** consistency
-token vs terminal, variable vs non terminal.
+token vs terminal.
 
 ** api.token.raw
 The YYUNDEFTOK could be assigned a semantic value so that yyerror could be

diff --git a/doc/bison.texi b/doc/bison.texi
@@ -2834,13 +2834,13 @@ predefined variables such as @code{pi} or @code{e} as well.
 Add some new functions from @file{math.h} to the initialization list.
 
 @item
-Add another array that contains constants and their values.  Then
-modify @code{init_table} to add these constants to the symbol table.
-It will be easiest to give the constants type @code{VAR}.
+Add another array that contains constants and their values.  Then modify
+@code{init_table} to add these constants to the symbol table.  It will be
+easiest to give the constants type @code{VAR}.
 
 @item
-Make the program report an error if the user refers to an
-uninitialized variable in any way except to store a value in it.
+Make the program report an error if the user refers to an uninitialized
+variable in any way except to store a value in it.
 @end enumerate
 
 @node Grammar File
@@ -5513,12 +5513,12 @@ do @{
 yypstate_delete (ps);
 @end example
 
-If the user decided to use an impure push parser, a few things about
-the generated parser will change.  The @code{yychar} variable becomes
-a global variable instead of a variable in the @code{yypush_parse} function.
-For this reason, the signature of the @code{yypush_parse} function is
-changed to remove the token as a parameter.  A nonreentrant push parser
-example would thus look like this:
+If the user decided to use an impure push parser, a few things about the
+generated parser will change.  The @code{yychar} variable becomes a global
+variable instead of a local one in the @code{yypush_parse} function.  For
+this reason, the signature of the @code{yypush_parse} function is changed to
+remove the token as a parameter.  A nonreentrant push parser example would
+thus look like this:
 
 @example
 extern int yychar;
@@ -8104,10 +8104,9 @@ doing so would produce on the stack the sequence of symbols @code{expr
 @vindex yychar
 @vindex yylval
 @vindex yylloc
-The lookahead token is stored in the variable @code{yychar}.
-Its semantic value and location, if any, are stored in the variables
-@code{yylval} and @code{yylloc}.
-@xref{Action Features}.
+The lookahead token is stored in the variable @code{yychar}.  Its semantic
+value and location, if any, are stored in the variables @code{yylval} and
+@code{yylloc}.  @xref{Action Features}.
 
 @node Shift/Reduce
 @section Shift/Reduce Conflicts
@@ -14263,14 +14262,13 @@ start:
 These tokens prevents the introduction of new conflicts.  As far as the
 parser goes, that is all that is needed.
 
-Now the difficult part is ensuring that the scanner will send these
-tokens first.  If your scanner is hand-written, that should be
-straightforward.  If your scanner is generated by Lex, them there is
-simple means to do it: recall that anything between @samp{%@{ ... %@}}
-after the first @code{%%} is copied verbatim in the top of the generated
-@code{yylex} function.  Make sure a variable @code{start_token} is
-available in the scanner (e.g., a global variable or using
-@code{%lex-param} etc.), and use the following:
+Now the difficult part is ensuring that the scanner will send these tokens
+first.  If your scanner is hand-written, that should be straightforward.  If
+your scanner is generated by Lex, them there is simple means to do it:
+recall that anything between @samp{%@{ ... %@}} after the first @code{%%} is
+copied verbatim in the top of the generated @code{yylex} function.  Make
+sure a variable @code{start_token} is available in the scanner (e.g., a
+global variable or using @code{%lex-param} etc.), and use the following:
 
 @example
   /* @r{Prologue.} */

diff --git a/src/closure.c b/src/closure.c
@@ -104,21 +104,21 @@ print_fderives (void)
   fprintf (stderr, "\n\n");
 }
 
-/*------------------------------------------------------------------.
-| Set FIRSTS to be an NVARS array of NVARS bitsets indicating which |
-| items can represent the beginning of the input corresponding to   |
-| which other items.                                                |
-|                                                                   |
-| For example, if some rule expands symbol 5 into the sequence of   |
-| symbols 8 3 20, the symbol 8 can be the beginning of the data for |
-| symbol 5, so the bit [8 - ntokens] in first[5 - ntokens] (= FIRST |
-| (5)) is set.                                                      |
-`------------------------------------------------------------------*/
+/*-------------------------------------------------------------------.
+| Set FIRSTS to be an NNTERMS array of NNTERMS bitsets indicating    |
+| which items can represent the beginning of the input corresponding |
+| to which other items.                                              |
+|                                                                    |
+| For example, if some rule expands symbol 5 into the sequence of    |
+| symbols 8 3 20, the symbol 8 can be the beginning of the data for  |
+| symbol 5, so the bit [8 - ntokens] in first[5 - ntokens] (= FIRST  |
+| (5)) is set.                                                       |
+`-------------------------------------------------------------------*/
 
 static void
 set_firsts (void)
 {
-  firsts = bitsetv_create (nvars, nvars, BITSET_FIXED);
+  firsts = bitsetv_create (nnterms, nnterms, BITSET_FIXED);
 
   for (symbol_number i = ntokens; i < nsyms; ++i)
     for (symbol_number j = 0; derives[i - ntokens][j]; ++j)
@@ -139,8 +139,8 @@ set_firsts (void)
 }
 
 /*-------------------------------------------------------------------.
-| Set FDERIVES to an NVARS by NRULES matrix of bits indicating which |
-| rules can help derive the beginning of the data for each           |
+| Set FDERIVES to an NNTERMS by NRULES matrix of bits indicating     |
+| which rules can help derive the beginning of the data for each     |
 | nonterminal.                                                       |
 |                                                                    |
 | For example, if symbol 5 can be derived as the sequence of symbols |
@@ -151,7 +151,7 @@ set_firsts (void)
 static void
 set_fderives (void)
 {
-  fderives = bitsetv_create (nvars, nrules, BITSET_FIXED);
+  fderives = bitsetv_create (nnterms, nrules, BITSET_FIXED);
 
   set_firsts ();
 

diff --git a/src/counterexample.c b/src/counterexample.c
@@ -177,9 +177,9 @@ si_bfs_free (si_bfs_node *n)
 
 /**
  * start is a state_item such that conflict_sym is an element of FIRSTS of the
- * non-terminal after the dot in start. Because of this, we should be able to
+ * nonterminal after the dot in start. Because of this, we should be able to
  * find a production item starting with conflict_sym by only searching productions
- * of the non-terminal and shifting over nullable non-terminals
+ * of the nonterminal and shifting over nullable nonterminals
  *
  * this returns the derivation of the productions that lead to conflict_sym
  */
@@ -292,7 +292,7 @@ complete_diverging_example (symbol_number conflict_sym,
   // We go backwards through the path to create the derivation tree bottom-up.
   // Effectively this loops through each production once, and generates a
   // derivation of the left hand side by appending all of the rhs symbols.
-  // this becomes the derivation of the non-terminal after the dot in the
+  // this becomes the derivation of the nonterminal after the dot in the
   // next production, and all of the other symbols of the rule are added as normal.
   for (gl_list_node_t state_node = list_get_end (path);
        state_node != NULL;
@@ -334,8 +334,8 @@ complete_diverging_example (symbol_number conflict_sym,
           // Since reductions have the dot at the end of the item,
           // this loop will be first executed on the last item in the path
           // that's not a reduction. When that happens,
-          // the symbol after the dot should be a non-terminal,
-          // and we can look through successive nullable non-terminals
+          // the symbol after the dot should be a nonterminal,
+          // and we can look through successive nullable nonterminals
           // for one with the conflict symbol in its first set.
           if (bitset_test (FIRSTS (sym), conflict_sym))
             {

diff --git a/src/derivation.h b/src/derivation.h
@@ -25,10 +25,10 @@
 
 # include "gram.h"
 
-/* Derivations are trees of symbols such that each non terminal's
+/* Derivations are trees of symbols such that each nonterminal's
    children are symbols that produce that nonterminal if they are
-   relevant to the counterexample. The leaves of a derivation form a
-   counterexample when printed. */
+   relevant to the counterexample.  The leaves of a derivation form a
+   counterexample when printed.  */
 
 typedef gl_list_t derivation_list;
 typedef struct derivation derivation;

diff --git a/src/derives.c b/src/derives.c
@@ -62,7 +62,7 @@ derives_compute (void)
 {
   /* DSET[NTERM - NTOKENS] -- A linked list of the numbers of the rules
      whose LHS is NTERM.  */
-  rule_list **dset = xcalloc (nvars, sizeof *dset);
+  rule_list **dset = xcalloc (nnterms, sizeof *dset);
 
   /* DELTS[RULE] -- There are NRULES rule number to attach to nterms.
      Instead of performing NRULES allocations for each, have an array
@@ -82,9 +82,9 @@ derives_compute (void)
   /* DSET contains what we need under the form of a linked list.  Make
      it a single array.  */
 
-  derives = xnmalloc (nvars, sizeof *derives);
+  derives = xnmalloc (nnterms, sizeof *derives);
   /* Q is the storage for DERIVES[...] (DERIVES[0] = q).  */
-  rule **q = xnmalloc (nvars + nrules, sizeof *q);
+  rule **q = xnmalloc (nnterms + nrules, sizeof *q);
 
   for (symbol_number i = ntokens; i < nsyms; ++i)
     {

diff --git a/src/gram.c b/src/gram.c
@@ -40,7 +40,7 @@ rule_number nrules = 0;
 symbol **symbols = NULL;
 int nsyms = 0;
 int ntokens = 1;
-int nvars = 0;
+int nnterms = 0;
 
 symbol_number *token_translations = NULL;
 
@@ -192,10 +192,10 @@ grammar_rules_partial_print (FILE *out, const char *title,
       if (first)
         fprintf (out, "%s\n\n", title);
       else if (previous_rule && previous_rule->lhs != rules[r].lhs)
-        fputc ('\n', out);
+        putc ('\n', out);
       first = false;
       rule_print (&rules[r], previous_rule, out);
-      fputc ('\n', out);
+      putc ('\n', out);
       previous_rule = &rules[r];
     }
   if (!first)
@@ -241,15 +241,25 @@ grammar_rules_print_xml (FILE *out, int level)
    xml_puts (out, level + 1, "<rules/>");
 }
 
+static void
+section (FILE *out, const char *s)
+{
+  fprintf (out, "%s\n", s);
+  for (int i = strlen (s); 0 < i; --i)
+    putc ('-', out);
+  putc ('\n', out);
+  putc ('\n', out);
+}
+
 void
 grammar_dump (FILE *out, const char *title)
 {
   fprintf (out, "%s\n\n", title);
   fprintf (out,
-           "ntokens = %d, nvars = %d, nsyms = %d, nrules = %d, nritems = %d\n\n",
-           ntokens, nvars, nsyms, nrules, nritems);
+           "ntokens = %d, nnterms = %d, nsyms = %d, nrules = %d, nritems = %d\n\n",
+           ntokens, nnterms, nsyms, nrules, nritems);
 
-  fprintf (out, "Tokens\n------\n\n");
+  section (out, "Tokens");
   {
     fprintf (out, "Value  Sprec  Sassoc  Tag\n");
 
@@ -261,7 +271,7 @@ grammar_dump (FILE *out, const char *title)
     fprintf (out, "\n\n");
   }
 
-  fprintf (out, "Non terminals\n-------------\n\n");
+  section (out, "Nonterminals");
   {
     fprintf (out, "Value  Tag\n");
 
@@ -271,7 +281,7 @@ grammar_dump (FILE *out, const char *title)
     fprintf (out, "\n\n");
   }
 
-  fprintf (out, "Rules\n-----\n\n");
+  section (out, "Rules");
   {
     fprintf (out,
              "Num (Prec, Assoc, Useful, UselessChain) Lhs"
@@ -293,17 +303,17 @@ grammar_dump (FILE *out, const char *title)
         /* Dumped the RHS. */
         for (item_number *rhsp = rule_i->rhs; 0 <= *rhsp; ++rhsp)
           fprintf (out, " %3d", *rhsp);
-        fputc ('\n', out);
+        putc ('\n', out);
       }
   }
   fprintf (out, "\n\n");
 
-  fprintf (out, "Rules interpreted\n-----------------\n\n");
+  section (out, "Rules interpreted");
   for (rule_number r = 0; r < nrules + nuseless_productions; ++r)
     {
       fprintf (out, "%-5d  %s:", r, rules[r].lhs->symbol->tag);
       rule_rhs_print (&rules[r], out);
-      fputc ('\n', out);
+      putc ('\n', out);
     }
   fprintf (out, "\n\n");
 }
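
Note on the new `section` helper in src/gram.c: it replaces the hand-counted dash literals (e.g. `"Tokens\n------\n\n"`) with an underline derived from the title length, so a renamed heading such as "Nonterminals" cannot end up with a stale underline. Below is a minimal standalone sketch of the same idea, not the committed code: the `size_t` loop counter and the `assert` precondition are assumptions added here for illustration (the patch uses `int i = strlen (s)`).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Print S as a heading on OUT, underlined with one dash per
   character of S, followed by an empty line.  Sketch only: the
   assert and the size_t counter are not part of the original patch.  */
static void
section (FILE *out, const char *s)
{
  assert (out && s);
  fprintf (out, "%s\n", s);
  for (size_t i = strlen (s); 0 < i; --i)
    putc ('-', out);
  putc ('\n', out);
  putc ('\n', out);
}
```

Calling `section (out, "Tokens")` writes `Tokens`, a line of six dashes, and an empty line, which is byte-for-byte what the removed `fprintf (out, "Tokens\n------\n\n")` produced.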

diff --git a/src/gram.h b/src/gram.h
@@ -23,9 +23,9 @@
 
 /* Representation of the grammar rules:
 
-   NTOKENS is the number of tokens, and NVARS is the number of
+   NTOKENS is the number of tokens, and NNTERMS is the number of
    variables (nonterminals).  NSYMS is the total number, ntokens +
-   nvars.
+   nnterms.
 
    Each symbol (either token or variable) receives a symbol number.
    Numbers 0 to NTOKENS - 1 are for tokens, and NTOKENS to NSYMS - 1
@@ -113,7 +113,7 @@
 
 extern int nsyms;
 extern int ntokens;
-extern int nvars;
+extern int nnterms;
 
 /* Elements of ritem. */
 typedef int item_number;
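
Note on the numbering described in the src/gram.h comment: tokens occupy 0 to NTOKENS - 1 and nonterminals occupy NTOKENS to NSYMS - 1, which is why the per-nonterminal arrays elsewhere in this commit are sized by `nnterms` and indexed with `i - ntokens`. The sketch below illustrates that invariant with toy sizes. The helper names `is_token`, `is_nterm` and `nterm_index` are hypothetical and introduced only for this example, and the counts are `enum` constants here, whereas in Bison they are global `int` variables.

```c
#include <assert.h>
#include <stdbool.h>

typedef int symbol_number;

/* Toy sizes, assumed for illustration only.  */
enum { ntokens = 4, nnterms = 3, nsyms = ntokens + nnterms };

/* True if S is a token number, i.e., in [0, ntokens).  */
static bool
is_token (symbol_number s)
{
  return 0 <= s && s < ntokens;
}

/* True if S is a nonterminal number, i.e., in [ntokens, nsyms).  */
static bool
is_nterm (symbol_number s)
{
  return ntokens <= s && s < nsyms;
}

/* Index of nonterminal S in an array of NNTERMS entries.  */
static int
nterm_index (symbol_number s)
{
  assert (is_nterm (s));
  return s - ntokens;
}
```

With these toy sizes, symbol 4 is the first nonterminal and maps to index 0, and symbol 6 is the last and maps to index `nnterms - 1`.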

diff --git a/src/lalr.c b/src/lalr.c
@@ -99,7 +99,7 @@ void
 set_goto_map (void)
 {
   /* Count the number of gotos (ngotos) per nterm (goto_map). */
-  goto_map = xcalloc (nvars + 1, sizeof *goto_map);
+  goto_map = xcalloc (nnterms + 1, sizeof *goto_map);
   ngotos = 0;
   for (state_number s = 0; s < nstates; ++s)
     {
@@ -113,7 +113,7 @@ set_goto_map (void)
         }
     }
 
-  goto_number *temp_map = xnmalloc (nvars + 1, sizeof *temp_map);
+  goto_number *temp_map = xnmalloc (nnterms + 1, sizeof *temp_map);
   {
     goto_number k = 0;
     for (symbol_number i = ntokens; i < nsyms; ++i)
@@ -583,7 +583,7 @@ lalr_update_state_numbers (state_number old_to_new[], state_number nstates_old)
 {
   goto_number ngotos_reachable = 0;
   symbol_number nonterminal = 0;
-  aver (nsyms == nvars + ntokens);
+  aver (nsyms == nnterms + ntokens);
 
   for (goto_number i = 0; i < ngotos; ++i)
     {
@@ -601,7 +601,7 @@ lalr_update_state_numbers (state_number old_to_new[], state_number nstates_old)
           ++ngotos_reachable;
         }
     }
-  while (nonterminal <= nvars)
+  while (nonterminal <= nnterms)
     {
       aver (ngotos == goto_map[nonterminal]);
       goto_map[nonterminal++] = ngotos_reachable;
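
Note on the `nnterms + 1` sizes in src/lalr.c: the hunks above only show the allocations and the final `while (nonterminal <= nnterms)` loop, so the following is one plausible reading rather than a description of the actual code. An array with one extra slot is commonly used to store start offsets per nonterminal plus a final sentinel holding the total, which would match the `aver (ngotos == goto_map[nonterminal])` check at the end. A minimal sketch of that layout, where `counts_to_offsets` is a hypothetical name not found in Bison:

```c
#include <assert.h>

/* Turn per-nonterminal goto counts into start offsets, in place.
   On entry MAP[i] (0 <= i < N) is the number of gotos on
   nonterminal i.  On exit MAP[i] is the index of its first goto
   and MAP[N] is the total, so MAP must have N + 1 entries.
   Hypothetical helper, for illustration only.  */
static int
counts_to_offsets (int map[], int n)
{
  assert (map && 0 <= n);
  int k = 0;
  for (int i = 0; i < n; ++i)
    {
      int count = map[i];
      map[i] = k;
      k += count;
    }
  map[n] = k;
  return k;
}
```

Under this layout the gotos of nonterminal `i` are the half-open range from `map[i]` to `map[i + 1]`, and that range is well defined for the last nonterminal only because of the extra slot.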

diff --git a/src/lr0.c b/src/lr0.c
@@ -86,7 +86,7 @@ state_list_append (symbol_number sym, size_t core_size, item_index *core)
   return res;
 }
 
-/* Symbols that can be "shifted" (including non terminals) from the
+/* Symbols that can be "shifted" (including nonterminals) from the
    current state.  */
 bitset shift_symbol;
 

diff --git a/src/lssi.c b/src/lssi.c
@@ -186,7 +186,7 @@ shortest_path_from_start (state_item_number target, symbol_number next_sym)
             }
         }
       // For production steps, follow_L is based on the symbol after the
-      // non-terminal being produced.
+      // nonterminal being produced.
       // if no such symbol exists, follow_L is unchanged
       // if the symbol is a terminal, follow_L only contains that terminal
       // if the symbol is not nullable, follow_L is its FIRSTS set