Introduce `FieldInfo` by mabruzzo · Pull Request #467 · cholla-hydro/cholla

mabruzzo · 2026-01-28T22:19:39Z

This is a WIP PR. I need to revisit this to make sure it's coherent and tweak a few things.

Overview

This PR introduces the FieldInfo type. The Grid3D type stores a FieldInfo instance called field_info.

This type does a few things:

- it maps strings holding field names to the field's id, if present (the field id is the value held by grid_enum)
- it does the reverse mapping
- it also does a few other things

To help illustrate the benefits of this type, I partially refactored the Output_Data function.

Motivation

There are 2 main motivations for this type:

More immediately, it simplifies reading/writing field data files. This logic needs to map data to field names.
- By using the logic implemented in FieldInfo,
  - we can reduce preprocessor statements in serialization logic
  - we can replace the common pattern where code is repeated N times, once per field-name/ptr pair, with a loop over field names
  - it's worth emphasizing that this is very useful when it comes to (de)serialization of passively advected scalars.
    - For context, I don't think any IO code, outside of the HDF5 (de)serialization logic, works with individual scalar fields.
    - Prior to this PR, if we wanted, say, the Slice logic to write out individual scalar fields, we would have to enumerate all of the slice fields (and update this logic each time we added a new slice field). Now, it's relatively straightforward to loop over slice fields.
- to concretely illustrate the benefits of FieldInfo,
  - this PR partially refactored the HDF5-logic in the Output_Data function.
  - As it turns out, the refactored version of Output_Data isn't super compelling on its own. Consequently, I have also refactored Grid3D::Read_Grid_HDF5 (Refactor Read_Grid_HDF5 to make use of FieldInfo #468), and I am in the process of refactoring Output_Float32 (WIP: Refactor Output_float32 #469)
- This logic can also be used to help simplify the slice, projection, and rotated-projection logic.
Longer-term, if we use field_info.field_id(<name>) to infer the field ids of passive scalars, this will help us move toward an architecture where physics modules that work with passive scalars are always compiled and fully controlled at runtime.
- this becomes useful once we know that the GasEnergy field id is NEVER inferred from grid_enum::num_fields. At that point, we can make the passive scalars always have the highest field ids and the number of passive scalars could be configured at runtime.
- Why is this useful? For concreteness, let's consider the Dust module (this also applies to chemistry).
  - If the number of passive scalars is settable at runtime and we eliminate grid_enum::dust_density from the code base, then the Dust module will be configurable at runtime.
  - How do we eliminate grid_enum::dust_density? As an example, consider Dust_Kernel. We could modify that function to accept an extra argument called dust_density_index, which would be determined on the host using field_info.field_id("dust_density"). Then, each occurrence of grid_enum::dust_density would be replaced by dust_density_index.
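
To make the last point concrete, here is a minimal host-only sketch of swapping a compile-time enumerator for an index resolved at runtime. Everything in it is an assumption made for illustration: `MockFieldInfo` is a stand-in (not the type added by this PR), the `std::optional<int>` return type of `field_id` is a guess, `scale_dust_density` is a placeholder for a real kernel such as Dust_Kernel, and the field-major buffer layout is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for FieldInfo: only the name -> id lookup is modeled.
class MockFieldInfo {
 public:
  explicit MockFieldInfo(const std::vector<std::string>& names) {
    for (std::size_t i = 0; i < names.size(); i++) {
      ids_[names[i]] = static_cast<int>(i);
    }
  }

  // Assumed signature: an empty optional means the field is absent.
  std::optional<int> field_id(const std::string& name) const {
    auto it = ids_.find(name);
    if (it == ids_.end()) return std::nullopt;
    return it->second;
  }

 private:
  std::unordered_map<std::string, int> ids_;
};

// Placeholder for a kernel. The field index is an ordinary argument rather
// than the compile-time grid_enum::dust_density. Assumes a field-major
// buffer: field f occupies [f * n_cells, (f + 1) * n_cells).
void scale_dust_density(std::vector<double>& conserved, int n_cells,
                        int dust_density_index, double factor) {
  assert(dust_density_index >= 0);
  double* dust =
      conserved.data() + static_cast<std::size_t>(dust_density_index) * n_cells;
  for (int i = 0; i < n_cells; i++) dust[i] *= factor;
}

// Resolve the index on the host at runtime; do nothing if the field is absent.
bool update_dust(const MockFieldInfo& field_info,
                 std::vector<double>& conserved, int n_cells, double factor) {
  std::optional<int> idx = field_info.field_id("dust_density");
  if (!idx.has_value()) return false;
  scale_dust_density(conserved, n_cells, *idx, factor);
  return true;
}
```

With this shape, whether the dust update runs depends only on whether "dust_density" was registered at startup, not on a preprocessor flag.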

More about `FieldInfo`

As noted above, the FieldInfo type primarily maps strings holding field names to the field's id, if present, and vice versa.

The type also tracks whether a field should be read from the host field-data buffer or the device field-data buffer during serialization.¹

Additionally, the FieldInfo type makes use of the idea of field-kinds.
The kinds are defined by enumerators of the field::Kind enumeration.
The enumerators include (these are always present, regardless of how Cholla is compiled):

- field::Kind::HYDRO (it includes GasEnergy when Cholla is configured to use the DualEnergy formalism)
- field::Kind::PASSIVE_SCALAR
- field::Kind::MAGNETIC

There are 2 related pieces of functionality:

FieldInfo can be used to loop over the field ids of all fields of a given field kind. The following snippet loops over the field ids of all field::Kind::HYDRO fields:
```
  for (int field_id : field_info.get_id_range(field::Kind::HYDRO)) {
    // do-work ...
  }
```
FieldInfo lets you query the number of fields with a given field-kind. For example, the following snippet queries the number of hydro fields:
```
int n_hydro = field_info.n_fields(field::Kind::HYDRO);
```
(This feature is present mostly because it was trivial to implement).
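
For readers who want to see how such a kind-based loop can replace per-field boilerplate, here is a self-contained sketch. `KindedFieldInfo`, `IdRange`, and `field_name` are hypothetical names introduced for illustration; only `get_id_range`, `n_fields`, and the `field::Kind` enumerators come from the description above. The mock also assumes that fields of one kind occupy contiguous ids, which need not be how the real type works.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace field {
enum class Kind { HYDRO, PASSIVE_SCALAR, MAGNETIC };
}

// Minimal iterable range over the consecutive integers [first, last).
struct IdRange {
  int first, last;
  struct Iter {
    int v;
    int operator*() const { return v; }
    Iter& operator++() { ++v; return *this; }
    bool operator!=(const Iter& o) const { return v != o.v; }
  };
  Iter begin() const { return {first}; }
  Iter end() const { return {last}; }
};

// Hypothetical stand-in for FieldInfo. Assumes ids are contiguous per kind,
// ordered HYDRO, PASSIVE_SCALAR, MAGNETIC.
class KindedFieldInfo {
 public:
  KindedFieldInfo(std::vector<std::string> hydro,
                  std::vector<std::string> scalars,
                  std::vector<std::string> magnetic) {
    const std::vector<std::string>* groups[3] = {&hydro, &scalars, &magnetic};
    for (int k = 0; k < 3; k++) {
      start_[k] = static_cast<int>(names_.size());
      names_.insert(names_.end(), groups[k]->begin(), groups[k]->end());
    }
    start_[3] = static_cast<int>(names_.size());
  }

  IdRange get_id_range(field::Kind kind) const {
    int k = static_cast<int>(kind);
    return {start_[k], start_[k + 1]};
  }

  int n_fields(field::Kind kind) const {
    int k = static_cast<int>(kind);
    return start_[k + 1] - start_[k];
  }

  // field_name is a hypothetical name for the id -> name lookup.
  const std::string& field_name(int id) const {
    assert(id >= 0 && static_cast<std::size_t>(id) < names_.size());
    return names_[id];
  }

 private:
  std::vector<std::string> names_;
  int start_[4];
};

// One loop stands in for a hand-written block per passive scalar.
std::vector<std::string> scalar_dataset_names(const KindedFieldInfo& info) {
  std::vector<std::string> out;
  for (int field_id : info.get_id_range(field::Kind::PASSIVE_SCALAR)) {
    out.push_back(info.field_name(field_id));
  }
  return out;
}
```

Adding a new passive scalar then only changes the list passed to the constructor; `scalar_dataset_names` needs no edits.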

The `FieldWriter` Type

A portion of this PR was dedicated to refactoring Write_Grid_HDF5 and creating the FieldWriter type in order to make use of FieldInfo.

At a high level, FieldWriter tracks all customization for the general field data-output and is a callable that can be invoked to produce that output.

It's worth mentioning:

- all customization remains controlled by compile-time ifdef statements (i.e. OUTPUT_MOMENTUM, OUTPUT_ENERGY, OUTPUT_METALS, OUTPUT_ELECTRONS).
- I didn't touch the code related to recording Temperature, Gravitational Potential, or magnetic fields (these are all special cases and somewhat beyond the scope of this PR)
- the output customization ONLY impacts hdf5 files (i.e. NOT binary outputs and NOT ASCII outputs)
- this is all consistent with the historical behavior.

With that said:

- the code is written so that it will be trivial to support runtime configuration. I chose not to convert the existing options because it's not obvious that they are the best choices (and it's not obvious that they are actually used). We might want a set of options that are more consistent with the float32 options.
- it would be straightforward to make the ASCII format support configuration. I just had some questions about that first (and I think the plan is to eventually eliminate the Binary Format)

Implementation of `FieldInfo`: `FrozenKeyIdxBiMap`

The FieldInfo type is implemented in terms of FrozenKeyIdxBiMap. That type is a custom map type that allows fast mapping from string keys to indices and vice versa.

It should be faster than std::map or std::unordered_map for our purposes. For some context:

- std::map is typically implemented as a red-black tree. Its lookups are O(log n), the worst average complexity of the three.
- std::unordered_map is typically implemented as a hash table that uses linked-list chaining to handle key collisions.
- FrozenKeyIdxBiMap is implemented with a hash table that uses open addressing with linear probing to handle key collisions.

For clarity, I'll briefly recap the distinction between hash maps that use chaining and those that use linear probing. Let's consider the process of looking up a key. In both cases, keys are organized in a storage array, and a key's hash value always corresponds to an index of this array.

- for chaining, each element of the storage array holds the head of a linked list specifying a "chain" of all known key-value pairs whose keys share the same hash value. During lookup, we need to search for the key through the linked list.
- for linear probing, the storage array directly stores key-value pairs. During lookup, the hash value tells us the starting location for performing a linear search through the storage array.
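
The linear-probing lookup can be sketched in a few lines. To be clear, this is NOT the FrozenKeyIdxBiMap from this PR: `TinyKeyIdxBiMap`, `find`, `key_at`, and `kNotFound` are names made up for illustration, and details such as the table size and the use of `std::hash` are assumptions. Unlike the description above, the slots here hold indices into a single key array instead of key-value pairs, which is one way to get the reverse mapping without a second copy of the keys.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

constexpr int kNotFound = -1;  // marks an empty slot and a failed lookup

// Illustrative frozen (build-once) bidirectional map: key <-> index.
class TinyKeyIdxBiMap {
 public:
  // Keys are assumed unique; the key at position i is assigned index i.
  explicit TinyKeyIdxBiMap(std::vector<std::string> keys)
      : keys_(std::move(keys)), slots_(2 * keys_.size() + 1, kNotFound) {
    for (std::size_t i = 0; i < keys_.size(); i++) {
      std::size_t pos = std::hash<std::string>{}(keys_[i]) % slots_.size();
      // Collision: step forward (wrapping around) until a slot is free.
      while (slots_[pos] != kNotFound) pos = (pos + 1) % slots_.size();
      slots_[pos] = static_cast<int>(i);
    }
  }

  // Forward mapping: key -> index, or kNotFound.
  int find(const std::string& key) const {
    std::size_t pos = std::hash<std::string>{}(key) % slots_.size();
    // The hash gives the starting slot; scan linearly until the key or an
    // empty slot is found. The table always has empty slots, so this ends.
    while (slots_[pos] != kNotFound) {
      if (keys_[slots_[pos]] == key) return slots_[pos];
      pos = (pos + 1) % slots_.size();
    }
    return kNotFound;
  }

  // Reverse mapping: index -> key is a plain array access.
  const std::string& key_at(int idx) const {
    assert(idx >= 0 && static_cast<std::size_t>(idx) < keys_.size());
    return keys_[idx];
  }

 private:
  std::vector<std::string> keys_;  // the only copy of the keys
  std::vector<int> slots_;         // open-addressing table of key indices
};
```

Because nothing is inserted or deleted after construction, the sketch needs neither tombstones nor resizing, which is where most of the complexity of a general-purpose open-addressing table comes from.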

For our purposes (i.e. a relatively small map where the contents easily fit in cache and there is no need for resizing/deletion), the FrozenKeyIdxBiMap should have significantly better cache-locality than std::unordered_map. With that said, I haven't actually benchmarked this.

It's also worth mentioning that:

- std::unordered_map would probably require us to store a second copy of all keys if we wanted to make the reverse mapping efficient.
- if we get rid of the internal usage of std::shared_ptr², it would be easy to make FrozenKeyIdxBiMap work on GPUs (we probably don't want to do that, but it's worth mentioning the option).

Aside: Originally, I intended to work directly with a FrozenKeyIdxBiMap, but I decided to make FieldInfo as this PR progressed.

Note

A case could be made for re-implementing FrozenKeyIdxBiMap in terms of std::map or std::unordered_map, in the name of simplicity. I'm open to doing that if that's your preference.

Footnotes

¹ It's not entirely clear that this should be tracked as part of FieldInfo, but it is indeed a useful quantity to track.

² This is left over from when I implemented a similar data structure in Enzo-E. We don't really get much benefit from it.

evaneschneider · 2026-02-05T16:16:27Z

src/io/io.cpp

-  #endif  // SCALAR
+  #endif

  // 3D case


I'm a little confused about why this conditional now only exists for the 3D case - is it because MHD writes only exist for that case?

mabruzzo · 2026-02-06

I'm not entirely sure I understand what you are talking about. Can you be a little more specific?

I just checked back, and I don't think anything has changed.

Both before the change and after this PR:

- hydro fields (density, momentum_[xyz], Energy, & GasEnergy) and passive scalars are written for 1D, 2D, & 3D sims
- the temperature field can be written for 1D, 2D, and 3D sims
- the gravitational potential can only be written for 3D sims
- magnetic fields can only be written for 3D sims (previously, the logic conditionally enabled by #ifdef MHD was definitely nested within the `if (H.nx > 1 && H.ny > 1 && H.nz > 1)` block)

As for why magnetic fields are only written in 3D cases: your guess is probably correct (but only Bob would know for sure)

evaneschneider · 2026-02-05T16:23:12Z

src/io/FieldWriter.h

+
+  /*! A callable method that writes a rotated projection of the grid data to file.
+   */
+  void operator()(Grid3D &G, Parameters P, int nfile, const FnameTemplate &fname_template) const;


I'm confused by the choice of 'operator' for the name of this function, if it is just for writing rotated projections.

mabruzzo · 2026-02-06

A class member function named operator() overloads the "function call operator."

Suppose we have an instance of the type (FieldWriter) called my_field_writer. We would invoke this method by calling my_field_writer(G, P, nfile, fname_template)

So, in a sense, instances of FieldWriter act just like configurable functions.

For added context, you would implement the special __call__ method to get analogous behavior for a Python class.

We could name it something else. But the most scalable way to do that involves defining a base class with a virtual method, and each kind of output would implement a subclass that overrides the virtual method.
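
As a small standalone illustration of the function call operator (unrelated to the real FieldWriter signature), consider the hypothetical `DatasetLister` below. The class, its constructor flag, and the `"<nfile>/<field>"` naming are all invented for this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical callable type: configuration is fixed at construction, and
// the instance is then invoked like a function.
class DatasetLister {
 public:
  explicit DatasetLister(bool include_energy)
      : include_energy_(include_energy) {}

  // Defining operator() is what makes `lister(nfile)` valid syntax.
  // Returns the dataset names that would be written for output number nfile.
  std::vector<std::string> operator()(int nfile) const {
    assert(nfile >= 0);
    const std::string prefix = std::to_string(nfile) + "/";
    std::vector<std::string> out = {prefix + "density"};
    if (include_energy_) out.push_back(prefix + "Energy");
    return out;
  }

 private:
  bool include_energy_;
};
```

Given `DatasetLister lister(true);`, the expression `lister(3)` is shorthand for `lister.operator()(3)`, in the same way that `my_field_writer(G, P, nfile, fname_template)` invokes the method shown in the diff.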

mabruzzo added 8 commits January 20, 2026 10:51

A bunch of the initial changes

fd6a529

some more incremental progress

95ac213

introduce FieldInfo

d0bb96b

eliminate unused output_ionization variable

4c0c701

a bunch more progress

d1952cb

remove a few unused commented out lines

0907214

minor typo fix

98e89c5

Add a clarifying comment

2f526b9

mabruzzo mentioned this pull request Jan 28, 2026

Refactor Read_Grid_HDF5 to make use of FieldInfo #468

Open

mabruzzo force-pushed the FieldNameMap branch from eccbc22 to 2f526b9 Compare January 28, 2026 22:58

mabruzzo added 4 commits February 3, 2026 09:08

rename field::Kind::{SCALAR->PASSIVE_SCALAR}

abe8892

convert Output_Data into the callable method of FieldWriter

d7aa63e

light refactoring

f4fedb5

slightly adjust handling of Magnetic fields

ebad964

mabruzzo mentioned this pull request Feb 3, 2026

WIP: Refactor Output_float32 #469

Open

evaneschneider reviewed Feb 5, 2026

mabruzzo wants to merge 12 commits into cholla-hydro:dev from mabruzzo:FieldNameMap


evaneschneider Feb 5, 2026

mabruzzo Feb 6, 2026
