ADIOS2 schema 2022_07_26, based on ADIOS2 modifiable attributes #1310

Merged
merged 14 commits into from
Aug 8, 2023

@franzpoeschel franzpoeschel commented Aug 16, 2022

This PR removes schema 2021 and replaces it with a new schema that is based on (and largely compatible with) the old ADIOS2 schema 0. The new schema uses the allowModification parameter of the DefineAttribute() call in ADIOS2 to provide better support for ADIOS2 steps, which was the main motivation for schema 2021.

Problem with schema 0: It was impossible to associate groups with single steps, which made the schema unusable for variable-based iteration encoding. In schema 0, the group hierarchy was restored at read time indirectly by inquiring attributes and variables. Since attributes cannot be deleted in ADIOS2, a group could not be deleted once it had been defined.

The new schema (2022) introduces a meta table for tracking active groups in the hierarchy. The following example dataset was created by the variableBasedSeries test:

Step 0:
  string    /basePath                                  attr   = "/data/%T/"
  double    /data/dt                                   attr   = 1
  double    /data/meshes/E/0/position                  attr   = 0
  uint64_t  /data/meshes/E/0/shape                     attr   = 1
  double    /data/meshes/E/0/unitSI                    attr   = 1
  uint64_t  /data/meshes/E/0/value                     attr   = 0
  uint64_t  /data/meshes/E/attr_0                      attr   = 0
  string    /data/meshes/E/axisLabels                  attr   = {"x"}
  string    /data/meshes/E/dataOrder                   attr   = "C"
  string    /data/meshes/E/geometry                    attr   = "cartesian"
  double    /data/meshes/E/gridGlobalOffset            attr   = 0
  double    /data/meshes/E/gridSpacing                 attr   = 1
  double    /data/meshes/E/gridUnitSI                  attr   = 1
  float     /data/meshes/E/timeOffset                  attr   = 0
  double    /data/meshes/E/unitDimension               attr   = {0, 0, 0, 0, 0, 0, 0}
  int32_t   /data/meshes/E/x                           {1000}
  double    /data/meshes/E/x/position                  attr   = 0
  double    /data/meshes/E/x/unitSI                    attr   = 1
  int32_t   /data/meshes/E/y                           {1}
  double    /data/meshes/E/y/position                  attr   = 0
  double    /data/meshes/E/y/unitSI                    attr   = 1
  uint64_t  /data/snapshot                             attr   = 0
  double    /data/time                                 attr   = 0
  double    /data/timeUnitSI                           attr   = 1
  string    /date                                      attr   = "2022-08-17 14:59:15 +0000"
  string    /iterationEncoding                         attr   = "variableBased"
  string    /iterationFormat                           attr   = "/data"
  string    /meshesPath                                attr   = "meshes/"
  string    /openPMD                                   attr   = "1.1.0"
  uint32_t  /openPMDextension                          attr   = 0
  string    /software                                  attr   = "openPMD-api"
  string    /softwareVersion                           attr   = "0.15.0-dev"
  uint64_t  __openPMD_groups/data                      attr   = 0
  uint64_t  __openPMD_groups/data/meshes               attr   = 0
  uint64_t  __openPMD_groups/data/meshes/E             attr   = 0
  uint64_t  __openPMD_groups/data/meshes/E/0           attr   = 0
  uint64_t  __openPMD_internal/openPMD2_adios2_schema  attr   = 20220726
  uint8_t   __openPMD_internal/useSteps                attr   = 1

Step 1:
  string    /basePath                                  attr   = "/data/%T/"
  double    /data/dt                                   attr   = 1
  double    /data/meshes/E/0/position                  attr   = 0
  uint64_t  /data/meshes/E/0/shape                     attr   = 1
  double    /data/meshes/E/0/unitSI                    attr   = 1
  uint64_t  /data/meshes/E/0/value                     attr   = 0
  double    /data/meshes/E/1/position                  attr   = 0
  uint64_t  /data/meshes/E/1/shape                     attr   = 1
  double    /data/meshes/E/1/unitSI                    attr   = 1
  uint64_t  /data/meshes/E/1/value                     attr   = 1
  uint64_t  /data/meshes/E/attr_0                      attr   = 0
  uint64_t  /data/meshes/E/attr_1                      attr   = 1
  string    /data/meshes/E/axisLabels                  attr   = {"x"}
  string    /data/meshes/E/dataOrder                   attr   = "C"
  string    /data/meshes/E/geometry                    attr   = "cartesian"
  double    /data/meshes/E/gridGlobalOffset            attr   = 0
  double    /data/meshes/E/gridSpacing                 attr   = 1
  double    /data/meshes/E/gridUnitSI                  attr   = 1
  float     /data/meshes/E/timeOffset                  attr   = 0
  double    /data/meshes/E/unitDimension               attr   = {0, 0, 0, 0, 0, 0, 0}
  int32_t   /data/meshes/E/x                           {1000}
  double    /data/meshes/E/x/position                  attr   = 0
  double    /data/meshes/E/x/unitSI                    attr   = 1
  int32_t   /data/meshes/E/y                           {2, 2}
  double    /data/meshes/E/y/position                  attr   = 0
  double    /data/meshes/E/y/unitSI                    attr   = 1
  uint64_t  /data/snapshot                             attr   = 1
  double    /data/time                                 attr   = 0
  double    /data/timeUnitSI                           attr   = 1
  string    /date                                      attr   = "2022-08-17 14:59:15 +0000"
  string    /iterationEncoding                         attr   = "variableBased"
  string    /iterationFormat                           attr   = "/data"
  string    /meshesPath                                attr   = "meshes/"
  string    /openPMD                                   attr   = "1.1.0"
  uint32_t  /openPMDextension                          attr   = 0
  string    /software                                  attr   = "openPMD-api"
  string    /softwareVersion                           attr   = "0.15.0-dev"
  uint64_t  __openPMD_groups/data                      attr   = 1
  uint64_t  __openPMD_groups/data/meshes               attr   = 1
  uint64_t  __openPMD_groups/data/meshes/E             attr   = 1
  uint64_t  __openPMD_groups/data/meshes/E/0           attr   = 0
  uint64_t  __openPMD_groups/data/meshes/E/1           attr   = 1
  uint64_t  __openPMD_internal/openPMD2_adios2_schema  attr   = 20220726
  uint8_t   __openPMD_internal/useSteps                attr   = 1

Step 2:
  string    /basePath                                  attr   = "/data/%T/"
  double    /data/dt                                   attr   = 1
  double    /data/meshes/E/0/position                  attr   = 0
  uint64_t  /data/meshes/E/0/shape                     attr   = 1
  double    /data/meshes/E/0/unitSI                    attr   = 1
  uint64_t  /data/meshes/E/0/value                     attr   = 0
  double    /data/meshes/E/1/position                  attr   = 0
  uint64_t  /data/meshes/E/1/shape                     attr   = 1
  double    /data/meshes/E/1/unitSI                    attr   = 1
  uint64_t  /data/meshes/E/1/value                     attr   = 1
  double    /data/meshes/E/2/position                  attr   = 0
  uint64_t  /data/meshes/E/2/shape                     attr   = 1
  double    /data/meshes/E/2/unitSI                    attr   = 1
  uint64_t  /data/meshes/E/2/value                     attr   = 2
  uint64_t  /data/meshes/E/attr_0                      attr   = 0
  uint64_t  /data/meshes/E/attr_1                      attr   = 1
  uint64_t  /data/meshes/E/attr_2                      attr   = 2
  string    /data/meshes/E/axisLabels                  attr   = {"x"}
  string    /data/meshes/E/dataOrder                   attr   = "C"
  string    /data/meshes/E/geometry                    attr   = "cartesian"
  double    /data/meshes/E/gridGlobalOffset            attr   = 0
  double    /data/meshes/E/gridSpacing                 attr   = 1
  double    /data/meshes/E/gridUnitSI                  attr   = 1
  float     /data/meshes/E/timeOffset                  attr   = 0
  double    /data/meshes/E/unitDimension               attr   = {0, 0, 0, 0, 0, 0, 0}
  int32_t   /data/meshes/E/x                           {1000}
  double    /data/meshes/E/x/position                  attr   = 0
  double    /data/meshes/E/x/unitSI                    attr   = 1
  int32_t   /data/meshes/E/y                           {3, 3, 3}
  double    /data/meshes/E/y/position                  attr   = 0
  double    /data/meshes/E/y/unitSI                    attr   = 1
  uint64_t  /data/snapshot                             attr   = 2
  double    /data/time                                 attr   = 0
  double    /data/timeUnitSI                           attr   = 1
  string    /date                                      attr   = "2022-08-17 14:59:15 +0000"
  string    /iterationEncoding                         attr   = "variableBased"
  string    /iterationFormat                           attr   = "/data"
  string    /meshesPath                                attr   = "meshes/"
  string    /openPMD                                   attr   = "1.1.0"
  uint32_t  /openPMDextension                          attr   = 0
  string    /software                                  attr   = "openPMD-api"
  string    /softwareVersion                           attr   = "0.15.0-dev"
  uint64_t  __openPMD_groups/data                      attr   = 2
  uint64_t  __openPMD_groups/data/meshes               attr   = 2
  uint64_t  __openPMD_groups/data/meshes/E             attr   = 2
  uint64_t  __openPMD_groups/data/meshes/E/0           attr   = 0
  uint64_t  __openPMD_groups/data/meshes/E/1           attr   = 1
  uint64_t  __openPMD_groups/data/meshes/E/2           attr   = 2
  uint64_t  __openPMD_internal/openPMD2_adios2_schema  attr   = 20220726
  uint8_t   __openPMD_internal/useSteps                attr   = 1

For each IO step i and active path <p>, the value of the modifiable attribute __openPMD_groups/<p> is set to i in that step.
(Note that the simpler alternative of using a boolean "active" flag openPMD_group_is_active/<p> = true or similar does not work in parallel contexts. Using the step index has the advantage that the flag need not be reset.)
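
A minimal sketch of this bookkeeping, using a plain std::map as a stand-in for the modifiable ADIOS2 attributes under __openPMD_groups/. The names GroupTable and markGroupActive are hypothetical and not taken from the PR. Updating the ancestors as well is an assumption inferred from the example output above, where /data, /data/meshes and /data/meshes/E all carry the current step index.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Stand-in for the attributes under __openPMD_groups/ (hypothetical type).
// Key: group path such as "/data/meshes/E".
// Value: index of the last step in which the group was marked active.
using GroupTable = std::map<std::string, std::uint64_t>;

// Record that `path` and all of its ancestors are active in `step`.
// Overwriting the stored value mimics redefining a modifiable attribute.
void markGroupActive(GroupTable &table, std::string path, std::uint64_t step)
{
    while (!path.empty())
    {
        table[path] = step;
        // Drop the last component: "/data/meshes" -> "/data" -> "".
        auto const pos = path.find_last_of('/');
        if (pos == std::string::npos)
        {
            break; // relative path without any slash: nothing left to strip
        }
        path.erase(pos);
    }
}
```

Calling markGroupActive(table, "/data/meshes/E/0", 0) and then markGroupActive(table, "/data/meshes/E/1", 1) leaves /data/meshes/E/0 at 0 and sets the remaining entries to 1, which corresponds to the __openPMD_groups entries in the Step 1 listing.
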
The LIST_PATHS IO task can then be implemented by using only the group table. A path exists if:

  1. Its entry in the meta table exists
  2. EITHER the file being read does not use ADIOS2 steps OR its attribute value is equal to the current step index
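
The two conditions can be sketched as a single predicate. This is only an illustration on top of a std::map stand-in for the group table; groupExists, GroupTable and the parameter names are hypothetical and do not appear in the PR.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical in-memory copy of the group table:
// group path -> value of the attribute __openPMD_groups/<path>.
using GroupTable = std::map<std::string, std::uint64_t>;

// Returns true if `path` counts as existing under the two rules above.
// `usesSteps` says whether the file being read uses ADIOS2 steps;
// `currentStep` is ignored when it does not.
bool groupExists(
    GroupTable const &table,
    std::string const &path,
    bool usesSteps,
    std::uint64_t currentStep)
{
    auto const it = table.find(path);
    if (it == table.end())
    {
        return false; // rule 1: no entry in the meta table
    }
    // rule 2: without steps every entry counts, otherwise the stored
    // step index must match the step currently being read
    return !usesSteps || it->second == currentStep;
}
```
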

Using a table in this form allows for an algorithmically quick lookup via prefix search in a sorted map, and also visually declutters the metadata by putting the table in one block as seen in the above output of bpls -alt.
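
As a rough sketch of the prefix search (not the actual implementation in this PR): in a sorted std::map, all keys sharing a prefix form one contiguous range that starts at lower_bound(prefix). The helper name listChildGroups is hypothetical, and for brevity the sketch assumes a file that uses steps.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory copy of the group table:
// group path -> value of the attribute __openPMD_groups/<path>.
using GroupTable = std::map<std::string, std::uint64_t>;

// List the names of the direct child groups of `parent` that are active in
// `step`. Only the contiguous key range starting with "<parent>/" is visited.
std::vector<std::string> listChildGroups(
    GroupTable const &table, std::string const &parent, std::uint64_t step)
{
    std::string const prefix = parent + "/";
    std::vector<std::string> children;
    for (auto it = table.lower_bound(prefix);
         it != table.end() &&
         it->first.compare(0, prefix.size(), prefix) == 0;
         ++it)
    {
        // Remainder after the prefix, e.g. "meshes" or "meshes/E".
        std::string const rest = it->first.substr(prefix.size());
        bool const isDirectChild =
            !rest.empty() && rest.find('/') == std::string::npos;
        if (isDirectChild && it->second == step)
        {
            children.push_back(rest);
        }
    }
    return children;
}
```

With the table from the Step 2 listing, asking for the children of /data/meshes/E in step 2 yields only the group 2, since the entries for 0 and 1 still hold their older step indices.
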

Such tricks are necessary only for paths, not for datasets, since ADIOS2 variables (i.e. datasets) are clearly associated with IO steps.

Drawback that we accept with this design: Unlike groups, an attribute, once written, can only be modified (not yet implemented), but not deleted. We still need to expose the allowModification parameter in some way to enable mutable user-defined attributes (see TODOs).

TODO

  • Merge "New Access type READ_LINEAR" (#1291) first
  • Make metadata (attributes) modifiable? Ideas: JSON parameter "metadata_changes", always make constant record components modifiable, attributes mutable by default in variable-based iteration encoding
  • code cleanup: remove old* names, introduce if constexpr
  • The only really breaking change compared to the old schema is the renaming of __is_boolean__ to __openPMD_internal/is_boolean. Apart from this, the 2022 schema is forward- and backward-compatible with the old one. So we should discuss whether we want to stick with the renaming or just keep the old name.
  • Parsing: Don't use group table, even if present, when specifying adios2.use_group_table = false.
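
For the last item, a possible way to spell the option as a JSON options string is sketched below. Writing the dotted key adios2.use_group_table as a nested JSON object is an assumption, as are the variable name noGroupTableConfig and the file name in the comment; the option was still a TODO at this point.

```cpp
#include <string>

// JSON options string that asks the reader to ignore the group table.
// Assumption: "adios2.use_group_table" maps to a nested JSON key.
std::string const noGroupTableConfig =
    R"({"adios2": {"use_group_table": false}})";

// Assumed usage with openPMD-api (not compiled as part of this sketch):
//   openPMD::Series series(
//       "data.bp", openPMD::Access::READ_ONLY, noGroupTableConfig);
```
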

@franzpoeschel franzpoeschel marked this pull request as draft August 16, 2022 10:10
@franzpoeschel franzpoeschel force-pushed the topic-third-adios2-schema branch 4 times, most recently from 7044d3d to d089c1d Compare August 18, 2022 14:08
@franzpoeschel franzpoeschel marked this pull request as ready for review August 22, 2022 08:57
@franzpoeschel franzpoeschel marked this pull request as draft August 22, 2022 11:36
@franzpoeschel franzpoeschel marked this pull request as ready for review August 22, 2022 14:14
@franzpoeschel franzpoeschel force-pushed the topic-third-adios2-schema branch 6 times, most recently from 70b0356 to 4ae06b9 Compare August 26, 2022 10:11
@franzpoeschel franzpoeschel force-pushed the topic-third-adios2-schema branch 2 times, most recently from 07af1b6 to 22fb4dc Compare August 31, 2022 12:59
@franzpoeschel franzpoeschel force-pushed the topic-third-adios2-schema branch 2 times, most recently from bd748ad to 220c203 Compare November 9, 2022 13:26
@franzpoeschel franzpoeschel force-pushed the topic-third-adios2-schema branch 5 times, most recently from 9d7bb3a to ab89d27 Compare November 15, 2022 12:40
src/ParticleSpecies.cpp (code scanning alert, fixed)
test/ParallelIOTest.cpp (code scanning alert, fixed)
test/SerialIOTest.cpp (code scanning alert, fixed)
@@ -1155,6 +996,10 @@
*/
void invalidateVariablesMap();

void markActive(Writable *);

// bool isActive(std::string const & path);

Check notice (Code scanning / CodeQL): Commented-out code

This comment appears to contain commented-out code.
@franzpoeschel franzpoeschel requested a review from ax3l May 5, 2023 10:57
@franzpoeschel franzpoeschel force-pushed the topic-third-adios2-schema branch 2 times, most recently from 9d06458 to 07dbb48 Compare May 11, 2023 09:19
@ax3l ax3l self-assigned this Jun 17, 2023
@ax3l ax3l added this to the 0.16.0 milestone Jun 17, 2023
@ax3l ax3l left a comment

This is great, thank you!

I added small inline comments for suggestions.

There is a small conflict in docs that needs to be resolved now, from merge of other PRs.

Once this leaves experimental state in future releases, we should remember to document this in FORMAT_ADIOS.md of openPMD-standard: https://github.com/openPMD/openPMD-standard/blob/upcoming-2.0.0/FORMAT_ADIOS.md

docs/source/backends/adios2.rst (outdated, resolved)
Comment on lines 39 to 48
/*
* ADIOS2 v2.8 brings mode::ReadRandomAccess
*/
#define HAS_ADIOS_2_8 (ADIOS2_VERSION_MAJOR * 100 + ADIOS2_VERSION_MINOR >= 208)
/*
* ADIOS2 v2.9 brings modifiable attributes (technically already in v2.8, but
* there are too many bugs, so we only support it beginning with v2.9).
* Group table feature requires ADIOS2 v2.9.
*/
#define HAS_ADIOS_2_9 (ADIOS2_VERSION_MAJOR * 100 + ADIOS2_VERSION_MINOR >= 209)
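
The version encoding used by these macros (major * 100 + minor) can be illustrated with a small constexpr sketch. The helper names encodeVersion and versionAtLeast are hypothetical and not part of this PR; like the macros, the encoding assumes a minor version below 100.

```cpp
// Encode a major/minor pair the same way as the macros above.
constexpr int encodeVersion(int major, int minor)
{
    return major * 100 + minor;
}

// True if version major.minor is at least requiredMajor.requiredMinor.
constexpr bool versionAtLeast(
    int major, int minor, int requiredMajor, int requiredMinor)
{
    return encodeVersion(major, minor) >=
        encodeVersion(requiredMajor, requiredMinor);
}

// ADIOS2 v2.9 satisfies the v2.8 requirement, but not the other way round.
static_assert(versionAtLeast(2, 9, 2, 8), "2.9 is at least 2.8");
static_assert(!versionAtLeast(2, 8, 2, 9), "2.8 is older than 2.9");
```
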

@ax3l (Member):

I wonder if these should be runtime functions instead?

Anyway, the way you use them here, macros make sense.
One thing to consider: this is a header file. You might want to undefine the two macros at the end of it, especially with the generic name.

Since you use the macros around a few locations (e.g., also in tests), maybe provide them in a helper header file?

@franzpoeschel (Contributor, Author):

> I wonder if these should be runtime functions instead?

We could maybe add something like auto Series::backendInfo() const -> std::map<std::string, std::string> analogous to auto Series::backendName() const -> std::string for exposing this information to users? Would be outside of the scope of this PR though.

> One thing to consider: this is a header file. You might want to undefine the two macros at the end of it, especially with the generic name.

The macros are needed in the including *.cpps. Since these headers are internal and not exposed to user code, this is probably not so critical, but I will rename them to OPENPMD_HAS_ADIOS....

> Since you use the macros around a few locations (e.g., also in tests), maybe provide them in a helper header file?

That's why I moved them to ADIOS2Auxiliary.hpp in the first place :D But given that I will be renaming the macros, creating their own file is fine, too.

@franzpoeschel franzpoeschel force-pushed the topic-third-adios2-schema branch 3 times, most recently from 07a32cd to d84898f Compare June 22, 2023 11:53
@ax3l ax3l left a comment

🚀 ✨

@ax3l ax3l merged commit 58028ea into openPMD:dev Aug 8, 2023
28 checks passed
2 participants