
Support JSON for parallel I/O #1472

Closed
eschnett opened this issue Jul 3, 2023 · 6 comments · Fixed by #1475

@eschnett (Contributor) commented Jul 3, 2023

It seems that parallel (MPI) I/O does not support the JSON format. I'd like to use that format for debugging.

The function createIOHandler, when called with an MPI communicator, does not accept the JSON format. This is with the current dev branch.
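
For reference, a minimal sketch of the kind of call that fails for me (the file name is just an example):

```cpp
// Minimal sketch: constructing a parallel Series with a .json file name
// throws, since createIOHandler does not accept the JSON format when an MPI
// communicator is passed.
#include <mpi.h>
#include <openPMD/openPMD.hpp>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    {
        openPMD::Series series(
            "debug_output.json", openPMD::Access::CREATE, MPI_COMM_WORLD);
    } // Series would be closed at the end of this scope
    MPI_Finalize();
    return 0;
}
```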

@franzpoeschel (Contributor)

Are you asking for JSON support when running with a single MPI rank? This could be added relatively easily. The backend would then simply fail when detecting that the MPI size is greater than 1.

There is currently, on purpose, no support for MPI-based JSON, since it is unclear what that should do, and the "good" options would require an implementation effort orders of magnitude larger than the benefit. Relatively easy options:

  • All output from ranks other than 0 could simply be ignored. Ideally this would be opt-in only, something like Series("data.json", Access::CREATE, MPI_COMM_WORLD, R"({"json": {"i am aware": "that there will only be output from rank 0"}})") to avoid confusion about output from the other ranks going missing (see the sketch after this list).
  • Maybe just write one JSON file per rank? We could even provide some post-processing tooling to merge the files; the pieces are nearly there already in our internal JSON helpers.
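
Roughly, the opt-in variant from the first bullet might look like this from user code; the configuration key is only a placeholder, not an option that exists today:

```cpp
// Sketch of the opt-in idea: rank 0 writes data.json, output from all other
// ranks is silently discarded. The "i am aware" key is a placeholder.
#include <mpi.h>
#include <openPMD/openPMD.hpp>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    {
        openPMD::Series series(
            "data.json",
            openPMD::Access::CREATE,
            MPI_COMM_WORLD,
            R"({"json": {"i am aware": "that there will only be output from rank 0"}})");
        // ... define iterations, records and chunks as usual ...
        series.flush();
    }
    MPI_Finalize();
    return 0;
}
```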

@eschnett (Contributor, Author) commented Jul 4, 2023

I am looking for test output, i.e. I want to produce a small amount of output to test the I/O mechanism in the Einstein Toolkit. You could assume a small (but non-trivial) number of MPI processes and a small amount of data to be written.

If this only worked for a single MPI rank, it would be less useful; in that case I'm probably mostly looking for a better error message.

@franzpoeschel (Contributor)

Writing from multiple ranks to a single JSON file would mean implementing a parallel data aggregation mechanism from scratch just for JSON. So far, the openPMD-api does not do any metadata or data aggregation itself, but defers this task to HDF5 and ADIOS.
For testing purposes, however, having each rank write to its own JSON file might even be helpful, to see what each rank is doing. Something like:

test_data.json/data_0.json
               data_1.json
               data_2.json
               ...
               data_n.json
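
As a sketch (dataset name and sizes are made up for illustration), user code would keep the standard parallel pattern, and only the on-disk layout would change:

```cpp
// Sketch of the proposed per-rank JSON output (not implemented at the time of
// writing). Each rank stores its slice as usual; with the proposal above,
// rank i's piece would end up in test_data.json/data_i.json.
#include <mpi.h>
#include <openPMD/openPMD.hpp>

#include <cstdint>
#include <vector>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    {
        openPMD::Series series(
            "test_data.json", openPMD::Access::CREATE, MPI_COMM_WORLD);

        // Global dataset with 4 values per rank; each rank writes its chunk.
        auto E_x = series.iterations[0].meshes["E"]["x"];
        openPMD::Extent global = {std::uint64_t(size) * 4};
        E_x.resetDataset({openPMD::determineDatatype<double>(), global});

        std::vector<double> chunk(4, double(rank));
        E_x.storeChunk(chunk, {std::uint64_t(rank) * 4}, {4});
        series.flush();
    }
    MPI_Finalize();
    return 0;
}
```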

@eschnett (Contributor, Author) commented Jul 4, 2023

Yes, this would be useful.

@franzpoeschel (Contributor)

Ok, I'll try to find some time to add this.

Note: you might be interested in PR #1277, too. It combines several things that I will try to merge as separate PRs in the near future:

  • Add a TOML backend as an alternative to JSON. It is implemented by converting from/to TOML inside the JSON backend at read/write time, so all features of the JSON backend should be available there, too (see the sketch after this list).
  • Add a template mode to the JSON/TOML backend. Its purpose is to write only the metadata of n-dimensional arrays instead of the arrays themselves, which should make it possible to use the backend even in larger simulations to output just the structure, if not the data.
  • Add a shortcut representation for attributes that does not explicitly encode the datatype, making the files easier to read.
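
As a sketch of the first point, assuming the TOML variant would be selected via the .toml file extension like the other backends (file name and attribute are made up):

```cpp
// Sketch of using the TOML flavour of the JSON backend, assuming selection by
// the .toml extension. Serial here, since the JSON/TOML family is mainly
// intended for debugging and small outputs.
#include <openPMD/openPMD.hpp>

int main()
{
    openPMD::Series series("data.toml", openPMD::Access::CREATE);
    series.setAttribute("comment", "same data model as the JSON backend");
    series.flush();
    return 0;
}
```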

franzpoeschel self-assigned this Jul 6, 2023
franzpoeschel mentioned this issue Jul 10, 2023
@franzpoeschel (Contributor)

This PR implements the above suggestion. It's based on the TOML PR described above to avoid merge conflicts.
