JSON-Schema - plan for implementation and use #97

Open
gunnar-mb opened this issue Jan 16, 2024 · 0 comments

0. JSON-schema!? - don't we prefer YAML?

Yes, the IFEX project basically prefers YAML, but JSON and YAML can be trivially translated to each other.

JSON-schema tools are more widespread and better supported than the various YAML-schema approaches, and many JSON-schema tools directly support validating YAML because of the (very close to) 1-1 equivalence between JSON and YAML. Therefore, JSON-schema is chosen over YAML-schema. (For more, see section 2, Why JSON-schema is useful.)


1. Why JSON-schema is not the source of truth for IFEX syntax

From the first implementations onward, we have tried hard to follow the principle of a single source of definition for the IFEX "model"/"metamodel" and, consequently, for the IFEX Core IDL (which is like a YAML "printout" of the internal model).

Currently, the dataclass definitions in the ifex_ast.py implementation are the official source of definition. The YAML format is simply a consequence of this model: the YAML directly mirrors the tree structure that is described there, in Python source code.

Another approach some projects take is to define a "syntax" using JSON-schema. This was not chosen from the beginning because:

  1. It seemed an "unnecessary" intermediate step to read JSON-schema into a Python program and then proceed to verify input based on it, when the model could arguably be defined more easily directly in Python source code.

  2. A corollary to this is that JSON-schema is a relatively complex and verbose description format, because it has to work through the extra level of using JSON itself to describe another JSON document. Defining an internal meta-model using the classes or data structures of a programming language is more expressive and less verbose.

Source code (e.g. Python):

from dataclasses import dataclass
from typing import Optional

@dataclass
class MyThing:
    fred_and_wilma: Optional[str] = None  # optional string field
    # ... next field ...

JSON-schema is more verbose. It essentially needs to define each member item using "name" instead of just writing the name directly.

JSON-schema (in principle, not exact syntax):

item {
   name: MyThing
   fields: [
     {
       name: fred_and_wilma
       type: string
     },
     { ... next field ... }
   ]
}

and even "optional" is defined in a separate section specifying which fields are mandatory, and another field clarifying if the
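For comparison with the "in principle" sketch above, here is a minimal example of what the same structure could look like in actual JSON-schema syntax, written as a Python dict. The field `barney` is hypothetical and added only so that there is something to list as mandatory; `additionalProperties` is included as one example of a further separate keyword. None of this is taken from the IFEX code base.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class MyThing:
    barney: str                           # hypothetical mandatory field
    fred_and_wilma: Optional[str] = None  # optional field from the example


# Hand-written JSON-schema describing the same structure.
MYTHING_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "title": "MyThing",
    "type": "object",
    "properties": {
        "barney": {"type": "string"},
        "fred_and_wilma": {"type": "string"},
    },
    # Optionality is not stated on the field itself: any property that is
    # not listed under "required" is optional.
    "required": ["barney"],
    # Separate keyword that controls whether unknown fields are accepted.
    "additionalProperties": False,
}

if __name__ == "__main__":
    print(json.dumps(MYTHING_SCHEMA, indent=2))
```

Two fields take three lines in the dataclass, while the schema needs a `properties` section plus a separate `required` list to say the same thing.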


2. Why JSON-schema is useful

Considering the above, a JSON schema is for now still not considered to be the source-of-truth. However, it is useful because it is well supported by tools. We can use this to:

A) Potentially use JSON-schema validation libraries to check IFEX input and get better error messages (TBD)
B) IDEs such as VSCode can use the schema directly (with a plugin). Given a JSON schema, VSCode with the plugin will automatically do syntax checking and completion in real time when editing IFEX files.
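To illustrate point A, the sketch below shows the kind of error messages schema-driven validation can produce. It is a hand-rolled checker that understands only `type`, `properties`, `required` and `additionalProperties`; it stands in for a real validation library (the Python `jsonschema` package is one such library) and is not part of IFEX. The schema and the field names in it are hypothetical.

```python
# Hypothetical, hand-written schema fragment for illustration only.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "description": {"type": "string"},
    },
    "required": ["name"],
    "additionalProperties": False,
}

# Mapping from the JSON-schema type names used here to Python types.
JSON_TYPES = {"string": str, "object": dict, "array": list, "boolean": bool}


def check(instance, schema, path="$"):
    """Return a list of human-readable error messages (empty if valid)."""
    expected = schema.get("type")
    if expected and not isinstance(instance, JSON_TYPES[expected]):
        return [f"{path}: expected {expected}, got {type(instance).__name__}"]

    errors = []
    if expected == "object":
        props = schema.get("properties", {})
        # Report every mandatory field that is absent.
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required field '{key}'")
        # Recurse into known fields, flag unknown ones if they are forbidden.
        for key, value in instance.items():
            if key in props:
                errors.extend(check(value, props[key], f"{path}.{key}"))
            elif schema.get("additionalProperties") is False:
                errors.append(f"{path}: unknown field '{key}'")
    return errors


if __name__ == "__main__":
    # The input would normally be a dict loaded from an IFEX YAML file.
    for message in check({"descripton": "typo in key"}, SCHEMA):
        print(message)
```

A misspelled key is reported both as an unknown field and, if the intended key was mandatory, as a missing one, which is usually more helpful than a generic parse failure.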


3. The plan:

The Python source code (in ifex_ast.py) will for now remain the official definition of the (meta)model and the IFEX Core IDL YAML format. Therefore, the only way to ensure "single source" is to flip things around and generate the JSON schema from the internal model.

=> A program will be written to read the internal class definitions and output a corresponding JSON-schema file. This is similar to how parts of the specification are also generated from the source code, and it has the same effect: when we make a change to the definition, the documentation and the JSON-schema can be trivially regenerated so that they are always in sync.
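A minimal sketch of how such a generator could work, assuming the model is expressed as Python dataclasses with ordinary type annotations. The classes `Argument` and `Method` below are simplified, hypothetical stand-ins and not the real definitions in ifex_ast.py. The sketch also assumes that a field is optional exactly when it is annotated `Optional[...]`, and it handles only primitives, lists and nested dataclasses.

```python
import json
from dataclasses import dataclass, fields, is_dataclass
from typing import List, Optional, Union, get_args, get_origin


# Hypothetical stand-ins for the real classes in ifex_ast.py.
@dataclass
class Argument:
    name: str
    datatype: str
    description: Optional[str] = None


@dataclass
class Method:
    name: str
    input: Optional[List[Argument]] = None


PRIMITIVES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def is_optional(tp):
    """True if tp is Optional[X], i.e. a Union that includes NoneType."""
    return get_origin(tp) is Union and type(None) in get_args(tp)


def type_to_schema(tp):
    """Translate one type annotation into a JSON-schema fragment."""
    if is_optional(tp):
        # Assumption: Optional[X] wraps exactly one non-None type.
        inner = [a for a in get_args(tp) if a is not type(None)]
        return type_to_schema(inner[0])
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        return {"type": "array", "items": type_to_schema(item_type)}
    if is_dataclass(tp):
        return class_to_schema(tp)
    if tp in PRIMITIVES:
        return {"type": PRIMITIVES[tp]}
    raise TypeError(f"unsupported type in this sketch: {tp!r}")


def class_to_schema(cls):
    """Build an object schema from the fields of a dataclass."""
    properties = {}
    required = []
    for f in fields(cls):
        # f.type is the annotation itself (assumes non-string annotations).
        properties[f.name] = type_to_schema(f.type)
        if not is_optional(f.type):
            required.append(f.name)
    return {
        "title": cls.__name__,
        "type": "object",
        "properties": properties,
        "required": required,
        "additionalProperties": False,
    }


if __name__ == "__main__":
    print(json.dumps(class_to_schema(Method), indent=2))
```

Because the schema is derived from the class definitions on every run, regenerating it after a model change is a single command, and the printed JSON could equally be written to a file with `json.dump`.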


4. Future

The JSON-schema potentially plays another role in the future. We might switch around what is the official source of definition, but right now that feels unlikely. We already noted the possibility of doing input validation, but another possibility concerns tools that are to be implemented in other programming languages, perhaps Rust or something else that is powerful and popular. Clearly the definition in Python source code is not directly applicable there; only something generated from it would be. If JSON-schema is supported in the programming environment that is chosen, then those implementations might choose to use the JSON-schema somehow as the language definition.
