Add support for JSON input files #114

Open

arsLibera opened this issue Jan 5, 2025 · 6 comments
Labels: enhancement (New feature or request)

Comments

@arsLibera

Use Case

Define simulation input parameters.

Problem

The existing input format is difficult to read and not easily extensible.

Solution

Provide a JSON input reader/writer to replace the existing custom format.

Alternatives considered

None

Additional context

@mrp089 @ktbolt

Untested/unrefined implementation:
master...arsLibera:svOneDSolver:jsonOptions

Example JSON:
out.json

arsLibera added the enhancement (New feature or request) label on Jan 5, 2025
@mrp089 (Member) commented Jan 6, 2025

Thank you so much for doing this, @arsLibera! I like your restructuring of the checks and of the input-file reading! It's also nice that the old format is still supported. Those are my thoughts so far:

JSON format:

  1. outputType is currently TEXT or VTK  (I don’t think we test VTK, but I remember it working).
  2. The DATATABLES are horrible. Currently, everything is a data table with time in the first column and the values in the second column. This makes sense for a time series (like inflow) but not for boundary conditions. I think we could change that to the format the 0D solver is using for boundary conditions. Unfortunately, none of the DATATABLE options besides LIST are tested, so I think it’s okay not to include all those options. The only outlet boundary conditions tested are:
    • Resistance (0D, 1D). The distal pressure Pd can be optional and default to 0.
    • RCR (0D, 1D). The distal pressure Pd can be optional and default to 0 (see the sketch after this list).
    • Coronary (0D, 1D). 0D and 1D have a different format that gets converted here.
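
For illustration only, a minimal sketch of how an optional distal pressure Pd could default to 0 when reading a boundary-condition object with a C++ JSON library (nlohmann/json here); the field names Rp, C, Rd, and Pd are assumptions, not an agreed format:

#include <nlohmann/json.hpp>

// Hypothetical RCR boundary-condition block; field names are illustrative only.
// json::value() returns the stored value or the supplied default, so a missing
// "Pd" falls back to 0.0 as proposed above.
void readRCR(const nlohmann::json& bc)
{
    const double Rp = bc.at("Rp").get<double>();
    const double C  = bc.at("C").get<double>();
    const double Rd = bc.at("Rd").get<double>();
    const double Pd = bc.value("Pd", 0.0);  // optional, defaults to 0
    // ... hand the values to the solver's boundary-condition setup ...
}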

Testing:

  1. The GitHub workflows probably need to be updated since they haven’t run in a year.
  2. The current tests need to be converted to the new format.
  3. We need something that executes your unit test. In svMultiPhysics, we have a compiler flag that builds the unit tests.

Eventually, we also need to update the 1D input file generator in SimVascular. Changing the code shouldn’t be too complicated. However, as you can see from the pipeline performance, we currently can’t build SimVascular on certain platforms. @ktbolt might have an update on that.

Let me know how you would like to proceed, @arsLibera, and where we can help!

@ktbolt (Contributor) commented Jan 7, 2025

@arsLibera A very good start!

We do need to support both the old and new formats; it will be some time until I can build SV again.

In selecting a new file format, we need to think about:

  • what applications will consume the file
  • do users need to manually edit the file
  • does C++ have native support for the file format
  • how easy is it to check for a valid file

Given that the 1D solver data is flat, why not just use a simple text format that is cleaner and clearer than the current one?

nodes
number of nodes 5
-1.9639334678649902 -1.6342906951904297 12.921173095703125 
...

joints
number of joints 3
joint j1 
  inlet IN1
  outlet OUT1
  segments 0
...

segments
number of segments 4
segment Aorta
  length 17.670671
  number of elements 5
  

@arsLibera (Author) commented Jan 8, 2025

@ktbolt

We do need to support both the old and new formats; it will be some time until I can build SV again.

Makes sense

what applications will consume the file

I wasn't aware of other applications besides the 1D solver that will be consuming the data. Can you point me to the other clients?

do users need to manually edit the file

I agree that this is necessary, but something we should strive to avoid. In particular, the formatting in the file should be taken care of by the software. Small edits could be made by the user, but major edits should be easily supportable either through the C++ application or through a helper script (e.g., in Python).

does C++ have native support for the file format

This is one of the reasons why JSON is a good option. If you mean native in the sense of C++ itself or STL, no, but there are tons of great open source libraries out there with permissive licenses (as I'm sure you're aware).
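
As one concrete example (a sketch, not taken from the linked branch), reading a whole input file with the header-only nlohmann/json library takes only a few lines; the file name and the output_type key are placeholders, and the library itself is just one of the permissive-license options:

#include <fstream>
#include <string>
#include <nlohmann/json.hpp>

int main()
{
    std::ifstream file("solver_input.json");              // placeholder file name
    nlohmann::json input = nlohmann::json::parse(file);   // throws if the JSON is malformed
    std::string outputType = input.value("output_type", "TEXT");  // placeholder key, with a default
    // ... populate the solver's options structure from 'input' ...
    return 0;
}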

how easy is it to check for a valid file

I think there are two subpoints here: how easy it is to verify the formatting of the file, and how easy it is to verify that the data is valid for the application.

It might also be worth considering:

  1. Is the format acceptable across languages? What if we want to create data in Python or write a front-end web application that interacts with the data (for example, for editing) -- will the format be acceptable?
  2. Does the format follow a well-understood standard that can be used to help make the format easily understood by multiple groups?

But, from a software perspective, I think the question for me is always:
3. How easy will it be to update the structure of the data (and, consequently, the parser) when requirements change?

Here's how I would evaluate your suggestion of a plain text format:

what applications will consume the file

Most applications would prefer to consume JSON.

do users need to manually edit the file

I do not personally find manually editing (pretty-printed) JSON any more error-prone than manually editing a plain text file. But for non-programmers, a custom plain-text format (like the one you suggest) could certainly be easier to read and edit.

does C++ have native support for the file format

This is probably the biggest reason not to use a custom plain-text format. There's no existing support.

how easy is it to check for a valid file

a. In terms of validating the format, JSON is the obvious winner. We can rely on the existing parsers to throw errors for us if there is bad formatting (as sketched below); validating a custom-formatted plain-text file would take a lot more work.
b. In terms of validating the data to be consumed, there's less benefit from JSON, but I still think we get a small advantage because the standard differentiates how arrays, text, and numerics are represented.
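
A small sketch of point (a), again assuming nlohmann/json: the library's parser rejects malformed files for us, and its type system covers the basic checks from point (b). The segments key is a placeholder:

#include <fstream>
#include <iostream>
#include <string>
#include <nlohmann/json.hpp>

bool loadInput(const std::string& path, nlohmann::json& input)
{
    try {
        std::ifstream file(path);
        input = nlohmann::json::parse(file);   // (a) the existing parser reports bad formatting
    } catch (const nlohmann::json::parse_error& e) {
        std::cerr << "Invalid JSON in " << path << ": " << e.what() << std::endl;
        return false;
    }
    // (b) the standard distinguishes arrays, strings, and numbers, so basic
    // data checks are cheap ("segments" is a placeholder key).
    return input.contains("segments") && input["segments"].is_array();
}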

If we look at the additional points I raised:

  1. JSON is widely used across languages. We'll get good native support for it from Python to JavaScript.
  2. I think using a JSON format is much better for wider intelligibility. It provides a framework for reading the data. When I read your proposed format, I can get the point, but I have to basically build a new parser in my mind. It's a little like re-inventing the JSON (or other standard format) wheel.
  3. Updating the structure of the data (and the format) is hands-down easier with JSON, thanks to existing parsers and file writers we can use across languages. There are other data file formats we could consider -- but I've personally not encountered one with half the support that JSON has across languages.

Brian

@arsLibera (Author) commented Jan 8, 2025

@mrp089

The DATATABLES are horrible.

Sure, I wondered about those data tables (and a couple of other things). I'd propose that we keep the format as-is initially, and then start migrating the data format (and clients of the data) once we can drop support for the previous format.

The GitHub workflows probably need to be updated since they haven’t run in a year.
The current tests need to be converted to the new format.
We need something that executes your unit test. In svMultiPhysics, we have a compiler flag that builds the unit tests.

Based on that perspective, I'd propose these initial steps:

  1. Update the unit tests to run whatever you're currently doing in svMultiPhysics (can you maybe elaborate on that a little or send me a link to a doc page if you have one?)
  2. Add unit tests for the legacy file format to lock down that functionality
  3. Once I get that sorted, we can submit that and verify that the tests are running

After that I'd propose two changes:

  1. A refactoring change to decouple the legacy file format from the options structure and the main file (a rough sketch follows this list)
  2. Addition of the JSON serializer/deserializer
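
A rough sketch of what the decoupling in step 1 (and the hook for step 2) could look like; the class and function names are hypothetical and not taken from the linked branch:

#include <string>

struct SimulationOptions;   // the existing options structure, now format-agnostic

// Hypothetical reader interface: the options structure no longer knows
// which file format it was populated from.
class InputReader
{
public:
    virtual ~InputReader() = default;
    virtual void read(const std::string& path, SimulationOptions& options) = 0;
};

class LegacyInputReader : public InputReader    // wraps the current custom-format parser
{
public:
    void read(const std::string& path, SimulationOptions& options) override;
};

class JsonInputReader : public InputReader      // step 2: the JSON deserializer
{
public:
    void read(const std::string& path, SimulationOptions& options) override;
};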

Brian

@mrp089 (Member) commented Jan 8, 2025

I think JSON makes the most sense. It's not much more complicated than the plain text that @ktbolt suggests, and it's a standard format. We mostly use these reduced order models (0D and 1D) in applications where we sample many model evaluations, e.g., for uncertainty quantification, sensitivity analysis, or optimization. Using JSON, parameters can be changed easily in Python, and the simulations can be easily connected to other frameworks.

For svMultiPhysics, we threw all unit tests in this folder. We added those lines to the CMakeLists.txt. These are activated during the build with a compiler flag and executed (manually or by GitHub Actions) with ctest.
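
For reference, a minimal GoogleTest-style sketch of a test that could lock down the legacy reader; the fixture path, the LegacyInputReader class from the sketch above, and the options fields are all hypothetical. In the svMultiPhysics setup described here, such a test would be built behind the compiler flag and run with ctest:

#include <gtest/gtest.h>
#include "LegacyInputReader.h"   // hypothetical header for the reader sketched earlier

// Hypothetical regression test for the legacy input format (step 2 of the plan above).
TEST(LegacyInputReaderTest, ReadsResistanceBoundaryCondition)
{
    SimulationOptions options;
    LegacyInputReader reader;
    reader.read("test_data/resistance.in", options);    // placeholder test fixture

    EXPECT_EQ(options.outletBCs.size(), 1u);            // hypothetical field
    EXPECT_DOUBLE_EQ(options.outletBCs[0].Pd, 0.0);     // distal pressure defaults to 0
}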

@ktbolt (Contributor) commented Jan 8, 2025

@arsLibera @mrp089 Good comments!

So let's go with JSON then.
