Rewrite the parser #62

PhilippImhof · 2022-11-30T09:39:35Z

The current parser dates back to the very first version and is based entirely on regular expressions. Regular expressions alone, however, can only partially parse a language. This has several consequences:

- The current code is rather complicated and difficult to maintain, let alone to extend, especially for possible future contributors.
- Currently, we have the following limitations:
  - Strings must be delimited by double quotes only. They cannot contain line breaks, and they cannot contain a double quote, because escaping is not supported.
  - Lists can only contain one single data type, e.g. all numbers or all strings.
  - Lists cannot be nested, so we cannot have multi-dimensional arrays, e.g. for matrices.
- There is only one "namespace". The Formulas Question plugin only defines functions (even for constants like pi) and users can only define variables. However, variable names must be different from function names. As a consequence, every new function we define could potentially break existing questions. The validation makes sure one cannot define a variable sin. But let's say a Spanish-speaking user defines a variable sen (to store some sine value). This will work, because there is no such function at the moment. Now, if we add the function sen in a month, the question will stop working because of the conflicting name.
- The current code uses eval(), which is considered bad practice even when it is done very cautiously.

The aim of this PR is to completely rewrite the existing parser. In the end, we should be able to:

- have strings delimited by single or double quotes ✓
- have line breaks in strings, i.e. multi-line strings ✓
- include the delimiting quote in a string thanks to escaping ✓
- access chars from a string like we do with array elements ✓
- access array elements (and chars from a string) "from the end" with negative indices, as we can do for strings in PHP ✓
- have lists with mixed data types and nested lists, thus allowing us to develop matrices ✓
- make broader use of ranges, e.g. use variables in ranges ✓ or use ranges and elements side-by-side in an array (this is already possible in sets) ✓
- let the user "overwrite" identifiers, i.e. have a variable name that is already used for a function, without losing the ability to call that function ✓
- avoid the usage of eval() ✓
- use pi or π instead of pi(); the old syntax will remain valid for backwards compatibility ✓
- use strings in a ternary expression, e.g. (a == b ? "yes" : "no"), as an easier and more readable alternative to pick() ✓
- use + to concatenate (join) strings as a short alternative to join() ✓
- use == comparison for strings ✓

Also, the new parser should make it easier to locate possible syntax errors in a definition, e.g. by indicating the row and column number more precisely.
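To illustrate how a hand-written tokenizer can cover the string goals (single or double quotes, escaping, line breaks) while keeping track of row and column for error messages, here is a minimal sketch in Python. It is not the plugin's code: the names Token, LexerError and tokenize are made up for this example, and the escape handling is simplified so that a backslash just keeps the next character literally.

```python
from dataclasses import dataclass


@dataclass
class Token:
    kind: str    # "STRING", "NUMBER", "IDENTIFIER" or "SYMBOL"
    value: str
    row: int     # 1-based row of the token's first character
    column: int  # 1-based column of the token's first character


class LexerError(Exception):
    """Raised with a message that names the row and column of the problem."""


def tokenize(source):
    tokens = []
    pos, row, column = 0, 1, 1

    def advance():
        # Consume one character and keep the row/column counters in sync.
        nonlocal pos, row, column
        char = source[pos]
        pos += 1
        if char == "\n":
            row, column = row + 1, 1
        else:
            column += 1
        return char

    while pos < len(source):
        start_row, start_column = row, column
        char = advance()
        if char.isspace():
            continue
        if char in "\"'":
            # Read up to the matching unescaped quote; line breaks are allowed.
            text = []
            while True:
                if pos >= len(source):
                    raise LexerError(
                        f"line {start_row}, column {start_column}: unterminated string"
                    )
                c = advance()
                if c == char:
                    break
                if c == "\\" and pos < len(source):
                    c = advance()  # simplification: keep the escaped char as-is
                text.append(c)
            tokens.append(Token("STRING", "".join(text), start_row, start_column))
        elif char.isdigit():
            text = [char]
            while pos < len(source) and (source[pos].isdigit() or source[pos] == "."):
                text.append(advance())
            tokens.append(Token("NUMBER", "".join(text), start_row, start_column))
        elif char.isalpha() or char == "_":
            text = [char]
            while pos < len(source) and (source[pos].isalnum() or source[pos] == "_"):
                text.append(advance())
            tokens.append(Token("IDENTIFIER", "".join(text), start_row, start_column))
        else:
            tokens.append(Token("SYMBOL", char, start_row, start_column))
    return tokens
```

Because every token records where it starts, a later parsing stage can report an unexpected token with its exact position instead of a generic "syntax error".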

PhilippImhof · 2023-02-26T08:46:12Z

Finally found some time to take this a step further. The tokenizer is fully functional. The PR is still a work in progress: many things are missing and implementations are likely to change.

Next steps:

function calls (like sin() etc.) inside expressions
ternary operator
arrays
for loop

PhilippImhof · 2023-03-26T12:26:32Z

The parser is now more or less finished. There's some refactoring and some fine-tuning left and there will probably be a few bugs.

Next step: evaluation. Currently, there is virtually no evaluation in the code, because everything is sent to eval(). Once the rewrite is done, evaluation will be handled inside the plugin and we can finally get rid of eval().
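As a rough illustration of what evaluating inside the plugin instead of calling eval() can look like, here is a small Python sketch that converts an infix expression to postfix notation with the shunting-yard algorithm (mentioned in one of the commit messages of this PR) and then evaluates the result with a stack. The function names and the operator table are assumptions made for this example. It only handles numbers, the binary operators + - * / ^ and parentheses, and it expects tokens to be separated by spaces.

```python
import operator

# operator -> (precedence, right-associative?, implementation)
OPERATORS = {
    "+": (1, False, operator.add),
    "-": (1, False, operator.sub),
    "*": (2, False, operator.mul),
    "/": (2, False, operator.truediv),
    "^": (3, True, operator.pow),
}


def to_rpn(tokens):
    """Shunting yard: turn a list of infix tokens into postfix (RPN) order."""
    output, stack = [], []
    for token in tokens:
        if token in OPERATORS:
            precedence, right_assoc, _ = OPERATORS[token]
            # Pop operators that bind tighter, or equally tight if left-associative.
            while stack and stack[-1] in OPERATORS:
                top_precedence = OPERATORS[stack[-1]][0]
                if top_precedence > precedence or (
                    top_precedence == precedence and not right_assoc
                ):
                    output.append(stack.pop())
                else:
                    break
            stack.append(token)
        elif token == "(":
            stack.append(token)
        elif token == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("mismatched parentheses")
            stack.pop()  # discard the opening parenthesis
        else:
            output.append(float(token))  # raises ValueError for non-numbers
    while stack:
        top = stack.pop()
        if top == "(":
            raise ValueError("mismatched parentheses")
        output.append(top)
    return output


def evaluate_rpn(rpn):
    """Evaluate a postfix token list with a value stack; no eval() involved."""
    stack = []
    for item in rpn:
        if isinstance(item, float):
            stack.append(item)
            continue
        if len(stack) < 2:
            raise ValueError("operator is missing an operand")
        right = stack.pop()
        left = stack.pop()
        stack.append(OPERATORS[item][2](left, right))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


def evaluate(expression):
    # Simplification: tokens must be separated by whitespace.
    return evaluate_rpn(to_rpn(expression.split()))
```

With this sketch, evaluate("1 + 2 * 3") gives 7.0 and evaluate("2 ^ 3 ^ 2") gives 512.0, because ^ is treated as right-associative. Function calls, variables and unary minus are left out to keep the example short.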

PhilippImhof · 2023-03-29T11:44:16Z

Update: Evaluation of expressions is basically working, with a few things left to be done, namely all the variable stuff (replace variable by its content at the right time and assign new values to variables). Also, evaluation (not parsing!) of for loops is not done yet. These are the next steps.

PhilippImhof · 2023-04-15T09:44:54Z

Variables are now basically working. What remains to be done:

evaluation of for loops
assignment of a value to an individual array element
definition and instantiation of random variables

PhilippImhof · 2023-04-22T08:36:43Z

PhilippImhof · 2023-07-25T13:34:25Z

I have started the integration of the new parser. That's going to take a while. Please note:

From now on, some tests will be failing

Some tests depend on legacy code that might have been removed or changed. These tests will have to be updated once the integration is done. Until then, they will fail.

PhilippImhof · 2024-12-23T13:56:32Z

It is now possible for teachers to "deactivate" built-in functions for the student's response by overwriting them as variables. As the teacher can still access those overwritten functions by using the PREFIX operator, they can use them in their model answers, but students cannot.

Example: In a question asking the student to give an equivalent of (sin(x))^2, the student could simply type the same formula and would get full marks unless the teacher spots this by looking at the answers. If the teacher defines the global "variable" sin = 9, the student's answer (sin(x))^2 is read as (9*x)^2 and is thus wrong. The student has to write 1 - (cos(x))^2.

For the docs: Teachers SHOULD NOT use the prefix operator in the model answer for an algebraic formula, because the model answer is used for the "The correct answer is …" feedback and when the student clicks "Fill in correct responses". Using the prefix in the model answer will then show an answer that the student cannot successfully submit. This is not a problem for the other answer types, because the result will be numerically evaluated.

In the example above, the teacher should thus not write (\sin(x))^2 in the model answer, but 1 - (cos(x))^2, as they expect from the student.
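The lookup rule behind this can be sketched roughly as follows (in Python, for illustration only). The function name resolve and the FUNCTIONS table are hypothetical, and the plugin's actual rules may differ. The sketch assumes that a variable hides a function of the same name unless the identifier carries the prefix, written as a backslash as in \sin above.

```python
import math

# Hypothetical table of built-in functions; the real plugin defines many more.
FUNCTIONS = {"sin": math.sin, "cos": math.cos}

PREFIX = "\\"  # assumption: the prefix is a backslash, as in \sin(x)


def resolve(name, variables):
    """Return ("function", callable) or ("variable", value) for an identifier."""
    if name.startswith(PREFIX):
        # A prefixed identifier is only ever looked up among the functions.
        bare = name[len(PREFIX):]
        if bare not in FUNCTIONS:
            raise NameError(f"unknown function: {bare}")
        return ("function", FUNCTIONS[bare])
    if name in variables:
        # Variables hide functions of the same name.
        return ("variable", variables[name])
    if name in FUNCTIONS:
        return ("function", FUNCTIONS[name])
    raise NameError(f"unknown identifier: {name}")


teacher_variables = {"sin": 9}
print(resolve("sin", teacher_variables)[0])    # variable
print(resolve("\\sin", teacher_variables)[0])  # function
print(resolve("cos", teacher_variables)[0])    # function
```

Keeping the two lookups separate also means that a user variable named sen keeps working if a function sen is added later, because the variable simply takes precedence.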

PhilippImhof · 2024-12-23T14:23:51Z

Thinking about it again, I will make sure that using the PREFIX in an algebraic formula's model answer gives an appropriate error message. I think it is not good to have the docs say that one should not do something, but allow them to do it anyway…

PhilippImhof self-assigned this Nov 30, 2022

PhilippImhof added the enhancement and refactoring (refactoring of existing code) labels Nov 30, 2022

PhilippImhof force-pushed the parser branch from 0e6cda7 to 0f598af on February 26, 2023 08:43

PhilippImhof force-pushed the parser branch 4 times, most recently from de0df5f to 0902aa3 on March 6, 2023 10:53

PhilippImhof force-pushed the parser branch from 2f1aad1 to a6057fb on March 29, 2023 10:51

PhilippImhof force-pushed the parser branch from 896d9e9 to 4332c08 on April 16, 2023 19:15

PhilippImhof mentioned this pull request May 24, 2023

Images get lost when moving question to other context #98

Open

PhilippImhof added this to the 6.0.0 milestone Aug 15, 2023

PhilippImhof added 12 commits October 12, 2023 17:29

first steps to replace the old parser

400c795

lexing, parsing (partial), shunting yard

a94dffa

update include()

478ec2d

separate token types for different parentheses

107302d

progress on functions and ternary operators

40428e6

progress on arrays

52ff940

implement PREFIX token

82a29fc

add support for pi and π as constant, not function

359de77

improved error reporting

d09223b

error reporting for mismatched parens

751a073

refactoring for single-char tokens

62f430b

parsing of fixed-value ranges & improvements

6f7ad19

PhilippImhof added 2 commits December 9, 2024 15:31

some cleanup in tests

fc87c64

more cleanup in tests

7bf6cfe

PhilippImhof force-pushed the parser branch from c93189e to 7bf6cfe on December 11, 2024 14:14

PhilippImhof added 2 commits December 11, 2024 16:23

make sure tests can be run individually

bd1dcb9

more tests, more cleanup

027100a

PhilippImhof force-pushed the parser branch from 3160f4d to 027100a on December 12, 2024 13:16

PhilippImhof added 13 commits December 12, 2024 16:18

test for variable/exponential precedence

0c0d68a

make test less prone to random failure

35d8a77

rename syntax checker function

867ad1f

add test for general/combined part feedback

aa41118

add test for parens

f580752

port test for number conversion

ee115c6

bugfix, more tests, cleanup

c69aa89

add invocation test for php builtin funcs

b64a8db

move autoloaded classes to official path

dd7c098

add phpdoc, externalise strings

38e1dc8

clean up language strings

2fbd8f8

cleanup

6b54b00

type hints, further testing, some cleanup

7c69580

PhilippImhof force-pushed the parser branch from a360086 to 7c69580 on December 19, 2024 15:17

PhilippImhof added 2 commits December 22, 2024 10:17

finish porting old tests, cleanup

97c65db

correctly handle prefix in model and student answers

07e3f5a

PhilippImhof added 2 commits December 23, 2024 17:29

explicitly disallow prefix in algebraic model answer

15bdf00

improve language

86e4c95

PhilippImhof force-pushed the parser branch from 1be4b1a to 86e4c95 on December 23, 2024 20:40

PhilippImhof added 3 commits December 24, 2024 09:48

improve error reporting for nan/inf

68f5634

avoid precision problem in ncr function

422fe60

Merge branch 'master' into parser

15f457d
