Const Records: Relax shape for Particles #289

Open
wants to merge 1 commit into
base: upcoming-2.0.0

ax3l
Member

@ax3l ax3l commented Aug 5, 2024

Description

For MPI-parallel I/O output, we developed a new method in ADIOS2 that does not need an initial metadata gather ("JoinedArrays"). To be able to use this mode, we need to relax the requirements to write a shape for constant records in a species (particle group), because otherwise we still have to do a collective gather.

This requires a small additional fallback in reader implementations.

Affected Components

  • base

Logic Changes

The required attribute shape in constant record components is now optional for records in particle groups (species), provided that at least one other record in the same particle species carries a shape from which it can be recovered.

Writer Changes

The required attribute shape in constant record components is now optional for records in particle groups (species).

Reader Changes

The required attribute shape in constant record components is now optional for records in particle groups (species).
If the attribute is missing, go through the other records and components of the same species, pick the first one that has a shape (e.g., the full extent of a non-constant record component, or a constant record component with a shape attribute), and recover the shape from it.
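
The fallback described above could be sketched roughly as follows. This is a minimal illustration over a hypothetical in-memory model: `Component`, `Record`, `Species` and `recoverShape` are invented names and not openPMD-api types. "First" here means first in the map's iteration order, which is an assumption, since the text does not prescribe an order.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified model of a particle species (not openPMD-api types).
using Shape = std::vector<std::uint64_t>;

struct Component
{
    bool isConstant = false;
    // Dataset extent for non-constant components, or the `shape` attribute
    // of a constant component; empty if the attribute was omitted.
    std::optional<Shape> shape;
};

using Record = std::map<std::string, Component>; // component name -> component
using Species = std::map<std::string, Record>;   // record name -> record

// Return the first shape found in any component of the species,
// or std::nullopt if no component carries one.
std::optional<Shape> recoverShape(Species const &species)
{
    for (auto const &recordEntry : species)
    {
        for (auto const &componentEntry : recordEntry.second)
        {
            if (componentEntry.second.shape.has_value())
            {
                return componentEntry.second.shape;
            }
        }
    }
    return std::nullopt;
}
```

A reader would call such a helper only for constant components whose own `shape` is missing. An empty result corresponds to a species in which no record carries a shape, which the relaxed rule does not permit.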


Data Converter

No changes needed. Files from 1.X will be forward compatible with regard to this change.

@ax3l ax3l added the "major change" (non-backwards compatible change) label Aug 5, 2024
@ax3l ax3l added this to the openPMD 2.X milestone Aug 5, 2024
@ax3l ax3l changed the title from "Const particle shape" to "Const Records: Relax shape for Particles" Aug 5, 2024
@ax3l ax3l changed the base branch from latest to upcoming-2.0.0 August 5, 2024 17:32
For advanced, highly parallel I/O output, we developed a new
method in ADIOS2 that does not need an initial metadata gather
("JoinedArrays"). To be able to use this mode, we need to relax
the requirements to write a shape for constant records in a species
(particle group), because otherwise we still have to do a collective
gather.

This adds the need for a slight additional read fallback implementation
on the reader side.
@franzpoeschel

This has practical consequences for parsing a corner case of openPMD data, namely constant scalar particle records.
Take for example this constant scalar particle record:

  int32_t   /data/500/particles/e_all/mass/macroWeighted                        attr   = 0                                                          
  uint64_t  /data/500/particles/e_all/mass/shape                                attr   = {1829775}                                                  
  double    /data/500/particles/e_all/mass/timeOffset                           attr   = 0                                                          
  double    /data/500/particles/e_all/mass/unitDimension                        attr   = {0, 1, 0, 0, 0, 0, 0}                                      
  double    /data/500/particles/e_all/mass/unitSI                               attr   = 8.26366e-28                                                
  double    /data/500/particles/e_all/mass/value                                attr   = 0.00110234                                                 
  double    /data/500/particles/e_all/mass/weightingPower                       attr   = 1

How does a parser distinguish /data/500/particles/e_all/mass from, e.g., /data/500/particles/e_all/momentum?

  float     /data/500/particles/e_all/momentum/x                          {1829775} = 0 / 0
  float     /data/500/particles/e_all/momentum/y                          {1829775} = 0 / 0
  float     /data/500/particles/e_all/momentum/z                          {1829775} = 0 / 0

Currently, the openPMD-api implements this corner case in ParticleSpecies.cpp by:

            auto value = std::find(att_begin, att_end, "value");
            auto shape = std::find(att_begin, att_end, "shape");
            if (value != att_end && shape != att_end)
            {
                RecordComponent &rc = r;
                IOHandler()->enqueue(IOTask(&rc, pOpen));
                IOHandler()->flush(internal::defaultFlushParams);
                rc.get().m_isConstant = true;
            }
            try
            {
                r.read();
            }

I.e., when parsing a group /data/500/particles/e_all/MYSTERY, it checks if the group contains the attributes value and shape. If yes, it's a constant scalar component, otherwise it's a normal record.
With this standard change, I see two options:

  1. We only check for the attribute value. Might lead to problems with datasets that for some reason use an attribute named value.
    Follow-up question: For legacy (1.0) datasets, do we keep checking for both attributes? If yes, the relaxed standard definition cannot be applied to old data. If no, some old datasets might be broken.
  2. We check if the record contains subgroups or datasets and treat it as a constant component if not. Might lead to problems with partially written data, e.g. after a crash.

I suggest that we decide on one clearly defined scheme to detect constant scalar components and standardize that. 1. is easier to implement.
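
For comparison, the current check and the two options can be written as predicates over the contents of a group. This is only a sketch under assumptions: `GroupListing` and the function names are hypothetical and do not exist in openPMD-api, and a real implementation would query the backend rather than a pre-built list.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical listing of what a parser sees inside one group.
struct GroupListing
{
    std::vector<std::string> attributes; // attribute names
    std::vector<std::string> children;   // names of subgroups and datasets
};

bool hasAttribute(GroupListing const &g, std::string const &name)
{
    return std::find(g.attributes.begin(), g.attributes.end(), name) !=
        g.attributes.end();
}

// Current behavior: both `value` and `shape` must be present.
bool isConstantCurrent(GroupListing const &g)
{
    return hasAttribute(g, "value") && hasAttribute(g, "shape");
}

// Option 1: only `value` is checked.
bool isConstantByValue(GroupListing const &g)
{
    return hasAttribute(g, "value");
}

// Option 2: a group without subgroups or datasets is treated as constant.
bool isConstantByEmptiness(GroupListing const &g)
{
    return g.children.empty();
}
```

With the shape attribute dropped from the mass example above, `isConstantCurrent` no longer classifies the group as constant, while both `isConstantByValue` and `isConstantByEmptiness` do.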

Also, do I understand it correctly that the relaxed constant component markup is not applicable to Mesh records?

Labels
major change non-backwards compatible change
Projects
Status: Review
2 participants