Shared Secondary Particle Bank #3863

Open
jtramm wants to merge 72 commits into openmc-dev:develop from jtramm:post_fix_shared_secondary

@jtramm jtramm commented Mar 11, 2026

Description

This PR introduces a shared secondary bank mode, so as to provide better load balancing across MPI ranks and OpenMP threads when weight windows are in use and thus mitigate the "long history" problem with weight windows. The long history problem naturally arises as weight windows can cause certain rare yet high importance particles to split heavily. In practice, this can often mean many threads are sitting idle waiting for a single thread to finish a very long history (potentially having millions of secondary particles stemming from a single source particle).

When enabled, the shared secondary bank allows for MPI ranks and OpenMP threads to load balance secondary particles. To accomplish this, the transport loop is run following alternative logic.

Current Behavior (Non-Shared Secondary Bank)

Under normal operation with the traditional secondary bank, transport runs in a "depth first" mode, where each particle appends its secondaries to a bank that the particle itself owns. When that particle finishes, it then loads a secondary from its local bank and processes it, continuing this loop until all secondaries for that particle are finished before potentially sourcing another. This means that no coordination needs to happen between threads.
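The depth-first loop described above can be sketched in a few lines. This is only a toy model, not the OpenMC implementation: the `children` mapping and the function name are hypothetical stand-ins for the physics that would produce secondaries.

```python
def transport_depth_first(source, children):
    """Run one full history: the source particle, then every secondary it spawns.

    `children` maps a particle name to the secondaries it produces (toy data).
    """
    order = []
    local_bank = [source]  # bank owned by this history only
    while local_bank:
        particle = local_bank.pop()  # load the next particle from the local bank
        order.append(particle)       # "transport" it
        # Bank its secondaries; reversed so they are popped in creation order.
        local_bank.extend(reversed(children.get(particle, [])))
    return order


children = {"a": ["b", "c"], "b": ["d"]}
history_order = transport_depth_first("a", children)
print(history_order)
```

Nothing outside the function touches `local_bank`, which is why no coordination between threads is needed in this mode.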

One could in theory create a greedy algorithm where threads that have run out of work can steal from the banks of other threads. This isn't too hard to implement, but has the downside of breaking reproducibility as the stolen particles no longer are processed in the original PRNG stream (which is particle-owned).

New Mode: Shared Secondary Bank

The shared secondary bank is enabled automatically whenever weight windows are in use or when settings.shared_secondary_bank is set. When enabled, an alternative "breadth first" transport loop is used. In spirit, this mode operates in a very similar manner to eigenvalue mode, where fission particles are banked and simulated in the following batch. In this mode, the transport loop is broken into a series of "secondary generations" that execute sequentially within a normal OpenMC batch. Any secondary particles produced in that secondary generation are banked and then simulated in the following secondary generation. This strategy is already in use by the Celeritas MC code.
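As a rough illustration of the generation-by-generation idea, the sketch below runs every particle of the current generation, banks whatever they produce, and only then moves on. It is a toy model under stated assumptions: `children` is made-up data and `transport_breadth_first` is a hypothetical name, not a function from this PR.

```python
def transport_breadth_first(sources, children):
    """Return the number of tracks run in each generation (primaries first)."""
    tracks_per_generation = []
    current = list(sources)
    while current:
        shared_bank = []
        # Every track here is independent, so this loop is the part that
        # could be spread over threads and ranks.
        for particle in current:
            shared_bank.extend(children.get(particle, []))
        tracks_per_generation.append(len(current))
        current = shared_bank  # banked secondaries become the next generation
    return tracks_per_generation


children = {"a": ["b", "c"], "b": ["d"]}
generation_counts = transport_breadth_first(["a", "e"], children)
print(generation_counts)
```

Unlike the stdout shown later in this description, the sketch stops as soon as the bank is empty rather than reporting a final generation with zero tracks.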

Reproducibility is provided by essentially using the same strategy we use now for the fission bank, which is taken from:

F. B. Brown and T. M. Sutton. “Reproducibility and Monte Carlo Eigenvalue Calculations.” Transactions of the American Nuclear Society, volume 65, p. 235 (1990).

The Brown paper developed a fast sorting algorithm that is used to sort the fission bank, and we do the same here. We perform a sorting of the particles between each secondary generation by which particle produced them and what progeny number of their parent particle they were. In this manner, we get a consistent ordering of secondaries through the secondary generations, allowing for a consistent seeding scheme.
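To make the ordering concrete, here is a minimal sketch of sorting banked sites by parent and progeny number. It uses Python's built-in comparison sort rather than the fast algorithm from the Brown and Sutton paper, and the `Site` class is a hypothetical stand-in that carries only the two keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    parent_id: int   # which particle produced this secondary
    progeny_id: int  # which of that parent's secondaries it was


def sort_bank(bank):
    """Order sites by parent, then by progeny number within a parent."""
    return sorted(bank, key=lambda s: (s.parent_id, s.progeny_id))


# Two different arrival orders, e.g. from different thread schedules.
arrival_a = [Site(7, 1), Site(3, 0), Site(7, 0), Site(3, 1)]
arrival_b = [Site(3, 1), Site(7, 0), Site(3, 0), Site(7, 1)]
sorted_a = sort_bank(arrival_a)
sorted_b = sort_bank(arrival_b)
print(sorted_a == sorted_b)
```

Because the sort key does not depend on which thread banked a site or when, both arrival orders collapse to the same sequence, which is what makes a consistent seeding scheme possible.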

Once sorted, particles are load balanced between MPI ranks. Shared memory load balancing is not a problem in this mode, as there is no "long history" problem within a secondary generation. While particles may produce different numbers of secondaries still, the secondaries aren't simulated until the next secondary generation. As such, all tasks are (relatively speaking) much more uniform in work cost, and the long history problem is eliminated. Typically, long histories are not caused by secondary chains that are a million generations deep -- they are caused by single particles wanting to split a million times over the course of just a few generations as they enter a rare high-importance pathway (e.g., a beam port through a bioshield with a detector on the other side).
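One simple way to picture the rank-level balancing is an even, contiguous split of the sorted bank. The description does not spell out the exact partitioning rule, so the scheme below (remainder handed to the lowest ranks) and the name `partition_bank` are assumptions made for illustration only.

```python
def partition_bank(n_sites, n_ranks):
    """Split n_sites sorted sites into contiguous (start, stop) slices per rank."""
    base, remainder = divmod(n_sites, n_ranks)
    slices = []
    start = 0
    for rank in range(n_ranks):
        # Assumed rule: the first `remainder` ranks take one extra site.
        count = base + (1 if rank < remainder else 0)
        slices.append((start, start + count))
        start += count
    return slices


rank_slices = partition_bank(10, 4)
print(rank_slices)
```

Keeping the slices contiguous preserves the sorted order across ranks, so a site's global position in the bank stays the same no matter how many ranks are used.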

The seeding scheme that is used for the shared secondary mode is fully reproducible. That said, as far as I can see, there is not a good way to have it match the same results as in the traditional non-shared mode. They are statistically similar of course, but will not produce bitwise the same answers. They are different PRNG handling schemes.

The shared secondary mode handles consistent particle seeding by assigning seeds based on how many "tracks" have been simulated so far in the overall simulation. This is done by simply incrementing a global counter at the start of each batch and secondary generation by the number of tracks run. Each "track" is simply a particle being simulated from birth -> death, exclusive of any secondaries. This is different to what we have historically dubbed a "particle" or "particle history" in OpenMC, which is a simulated particle birth -> death inclusive of all secondaries. Thus, # tracks >= # of particle histories.
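The counter-based seeding can be sketched as follows. This is a minimal illustration, not OpenMC's PRNG handling: the helper names are hypothetical, and Python's `random.Random` seeded with a string stands in for whatever stream selection the real code performs.

```python
import random


def reserve_track_ids(tracks_completed, n_tracks):
    """Hand out the next n_tracks ids and return the advanced global counter."""
    first = tracks_completed
    return range(first, first + n_tracks), first + n_tracks


def first_random_number(master_seed, track_id):
    """Stand-in for per-track seeding: the stream depends only on the track id."""
    rng = random.Random(f"{master_seed}:{track_id}")
    return rng.random()


counter = 0
primary_ids, counter = reserve_track_ids(counter, 500)   # source particles
gen1_ids, counter = reserve_track_ids(counter, 1804)     # first secondary generation
print(primary_ids[0], primary_ids[-1], gen1_ids[0], gen1_ids[-1], counter)
```

Since a track's id is fixed by its position in the sorted bank plus the counter, the random numbers it sees do not depend on which thread or rank ends up running it.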

Changes to stdout

When the shared secondary bank is enabled we now print more info to stdout that allows for the user to see the secondary generations progressing and to see if aggressive weight window splitting is causing huge numbers of secondaries to be born. Previously, a user just observed potentially extremely long run times and low particle/sec tracking rates when heavy splitting was underway, but didn't have an idea if this was caused by code inefficiency or large numbers of secondary particles. The new output for a weight window simulation with 500 particles/batch looks like:

 ===============>     FIXED SOURCE TRANSPORT SIMULATION     <===============

 Simulating batch 1
  Primogenitor            particles: 500
  Secondary generation 1     tracks: 1804
  Secondary generation 2     tracks: 21219
  Secondary generation 3     tracks: 11209
  Secondary generation 4     tracks: 9999
  Secondary generation 5     tracks: 3367
  Secondary generation 6     tracks: 1015
  Secondary generation 7     tracks: 239
  Secondary generation 8     tracks: 34
  Secondary generation 9     tracks: 4
  Secondary generation 10    tracks: 0
 Simulating batch 2
  Primogenitor            particles: 500
  Secondary generation 1     tracks: 1744
  Secondary generation 2     tracks: 21364
  Secondary generation 3     tracks: 10399
  Secondary generation 4     tracks: 10408
  Secondary generation 5     tracks: 3183
  Secondary generation 6     tracks: 1503
  Secondary generation 7     tracks: 923
  Secondary generation 8     tracks: 411
  Secondary generation 9     tracks: 450
  Secondary generation 10    tracks: 122
  Secondary generation 11    tracks: 58
  Secondary generation 12    tracks: 1
  Secondary generation 13    tracks: 4
  Secondary generation 14    tracks: 0

Additionally, we also give another piece of timing data when the shared secondary bank is enabled: the number of "tracks/second". This allows one to better gauge performance of the code when there is significant splitting. This value should be much more stable across changes to weight window parameters compared to the particles/sec metric.

 =======================>     TIMING STATISTICS     <=======================
...
 Calculation Rate (active)         = 10153 particles/second
 Track Rate (active)               = 1.01997e+06 tracks/second
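As a quick sanity check on how the two metrics relate, the numbers quoted above can be combined directly. This assumes the primogenitor particles count as tracks and that the timing lines come from a run comparable to the batch listing; neither is stated explicitly.

```python
# Track counts for batch 1, copied from the listing above (primaries first).
batch_1_tracks = [500, 1804, 21219, 11209, 9999, 3367, 1015, 239, 34, 4, 0]

total_tracks = sum(batch_1_tracks)
tracks_per_history = total_tracks / batch_1_tracks[0]

# Ratio of the two reported rates gives an average tracks-per-history figure.
rate_ratio = 1.01997e6 / 10153

print(total_tracks, round(tracks_per_history, 2), round(rate_ratio, 2))
```

Both estimates land at roughly 100 tracks per source particle, which shows why the particles/second figure alone says little about code efficiency when splitting is heavy.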

Implementation and Preservation of the Non-Shared Secondary Mode

The new shared secondary mode does not replace the original particle-owned secondary mode. As such, when not using weight windows users will not observe any changes in their existing results. The downside is that we have to maintain two different secondary handling pathways. The upsides though are:

  • The existing particle-owned bank is actually still used to store particles temporarily in the shared mode, meaning that logic didn't really need to change in the physics areas of the code. Secondaries are generated and stored the same way in both cases. The main difference is that these local banks are combined into a larger shared bank at the end of each particle's lifetime. This allows for a lot of logical processes (like particle production filter) to work as normal without modification.

  • In some cases where splitting is very uniform between particle histories it may be more optimal to just run without the shared bank mode. The shared bank does ensure optimal load balancing though at the cost of more buffering and some sorting operations. I have observed a few use cases where the original mode is faster, though often this is by a fairly small margin. Comparatively, on cases where long histories are present, the shared bank can offer a factor of speedup that asymptotically approaches the total number of processors in the simulation (e.g., 100x speedup on a single node is possible, or millions of times faster for large distributed jobs).
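The claim that the speedup can approach the processor count follows from a simple toy model, sketched below. It assumes unit-cost tracks, ignores generation barriers, sorting, and communication, and uses made-up history costs; the function names are hypothetical.

```python
import heapq
import math


def makespan_whole_histories(costs, n_workers):
    """Each history runs on one worker; the earliest-free worker takes the next."""
    finish_times = [0] * n_workers
    heapq.heapify(finish_times)
    for cost in costs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)


def makespan_shared_tracks(costs, n_workers):
    """Idealised shared bank: all tracks are spread evenly over the workers."""
    return math.ceil(sum(costs) / n_workers)


# One history that splits into a million tracks, plus 99 trivial histories.
costs = [1_000_000] + [1] * 99
whole = makespan_whole_histories(costs, 100)
shared = makespan_shared_tracks(costs, 100)
print(whole, shared, round(whole / shared, 2))
```

With 100 workers the idealised speedup comes out just under 100, because in the per-history mode 99 workers sit idle while one finishes the long history.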

Performance Evaluation

I tested the JET model using random ray generated weight windows. On a single node CPU test case (8 MPI ranks x 24 cores (48 threads) per rank, for 192 cores (384 threads) total), I observed 304 particles/sec with the traditional mode and 2,376 particles/sec with the shared secondary mode. This is a 7.8x speedup, and the weight window parameters performed very well for the problem and resulted in good uncertainties. On problems like JETSON 2D with some parameters I have observed up to a 53x speedup, though for other sets of weight windows on that problem I have also observed a ~2x slowdown.

Generally it seems like the shared secondary bank is a life saver in a lot of cases and should allow for performance to be maintained across a much wider variety of weight window parameters, as the long history problem is greatly mitigated. This should reduce the need for weight window tuning significantly and provide better out-of-the-box weight window effectiveness. In some cases it may be worthwhile to disable the shared mode to see if better performance is possible, but I would recommend you do most of your weight window tuning with the default shared secondary mode on, as it is much more flexible and performance doesn't drop off a cliff in the same manner that it does without the shared bank.

Testing

I added a number of additional tests. In some cases I simply added a @parametrize to test out the new mode when it made sense. I originally only tested it for weight window use cases, but I discovered there were some pretty significant potential error modes with the new logic and as such developed a few different tests to cover those cases (e.g., what happens when fission neutrons are added to the shared bank).

The pulse height tests do have two different cases for shared/regular secondaries, even though the shared mode is disabled. I thought I'd leave these tests in so they will be ready to go when that PR comes along.

Limitations & Future Work

One cop out is that I disable the shared secondary mode (and give a warning) when pulse height tallies are in use. These add some complexity that doesn't naturally map to the shared secondary mode context. However, there is a path forward here, but I'll leave that capability for another PR. I've added a TODO in the code at the sentinel point. For now, people using the pulse height tallies will simply be getting the performance they currently get in the code right now, so no harm done at least.

Checklist

  • I have performed a self-review of my own code
  • I have run clang-format (version 18) on any C++ source files (if applicable)
  • I have followed the style guidelines for Python source files (if applicable)
  • I have made corresponding changes to the documentation (if applicable)
  • I have added tests that prove my fix is effective or that my feature works (if applicable)

jtramm and others added 30 commits December 4, 2025 09:39
…hysics particles as fission progeny, causing failure of scan-sort as there were missing fission progeny)
Resolved conflicts in 5 files:
- include/openmc/particle_data.h: kept local_secondary_bank rename,
  restored n_secondaries/secondary_bank_index accessors needed by
  new ParticleProductionFilter from develop
- src/eigenvalue.cpp: trivial formatting conflicts
- src/particle.cpp: kept refactored event_revive_from_secondary(SourceSite&)
- src/physics.cpp, src/physics_mg.cpp: kept local_secondary_bank usage,
  adapted new secondary_bank_index tracking from develop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove debug validation functions (debug_validate_local_bank_ordering,
  debug_validate_global_bank_ordering) and their call sites
- Remove TODO-marked debug sanity check in sort_bank()
- Remove commented-out sort and parent_id debugging code
- Remove unused global_secondary_bank variable
- Remove commented-out current_work assignment in from_source()
- Fix simulation_particles_completed not being reset between batches
- Add event-based mode guard for shared secondary bank
- Update weightwindows and particle_restart_fixed test baselines
- Remove accidentally tracked helper scripts and test artifacts from merge

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
John Tramm and others added 13 commits March 4, 2026 04:43
Bug 11: The inner loop variable 'site' in Phase 2 of
transport_history_based_shared_secondary shadowed the outer 'site'
reference, which would trigger -Wshadow warnings. Rename inner
loop variables to 'secondary_site'.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bug 12: n_split was set from site.n_split after from_source(&site)
already copied it (from_source line 187: n_split() = src->n_split).
Remove the redundant assignment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Covers the interaction between weight window splitting, photon transport,
pulse-height tallies, and shared secondary bank mode. This combination
exercises the Bug 8 fix (pulse-height energy subtraction in
create_secondary) alongside weight window splits (which correctly skip
the subtraction since split particles are clones, not new secondaries).

Parametrized with local/shared subdirectories.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- S1: Add {0} default to current_work_ to prevent uninitialized reads
- S3: Add {0} defaults to SourceSite::parent_id and progeny_id
- S4: Guard n_tracks()++ in event_revive_from_secondary for shared mode
  since the counter is never consumed in shared secondary transport

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ondary

The survival_biasing test was silently running in shared secondary mode
due to auto-enable when weight windows are active. Parametrize with
explicit shared_secondary_bank setting to test both paths, restoring
the original pre-shared-secondary baseline for the local variant.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

This PR introduces a “shared secondary bank” transport mode aimed at making fixed-source weight-window runs more reproducible and scalable (sorting and MPI load-balancing secondary tracks), along with new/updated regression coverage for shared-vs-local behavior.

Changes:

  • Add shared-secondary-bank transport paths for both history-based and event-based kernels, including new sorting/load-balancing logic for secondary generations.
  • Extend SourceSite with additional metadata (e.g., born weights / split count) and plumb through MPI + Python bindings.
  • Update/add regression tests to run in both “local” and “shared” modes and add new shared-secondary-focused test cases.

Reviewed changes

Copilot reviewed 56 out of 62 changed files in this pull request and generated 2 comments.

File Description
tests/regression_tests/weightwindows/test.py Parametrize WW regression to run in local/shared subdirs; make WW input file loading path-stable.
tests/regression_tests/weightwindows/survival_biasing/test.py Same local/shared parametrization + path-stable WW file loading; update source API usage.
tests/regression_tests/weightwindows/survival_biasing/shared/results_true.dat Add golden results for shared mode.
tests/regression_tests/weightwindows/survival_biasing/shared/inputs_true.dat Add golden inputs for shared mode (includes shared_secondary_bank).
tests/regression_tests/weightwindows/survival_biasing/local/results_true.dat Add golden results for local mode.
tests/regression_tests/weightwindows/survival_biasing/local/inputs_true.dat Add golden inputs for local mode.
tests/regression_tests/weightwindows/shared/results_true.dat Add golden results for shared mode.
tests/regression_tests/weightwindows/shared/inputs_true.dat Add golden inputs for shared mode (includes shared_secondary_bank).
tests/regression_tests/weightwindows/results_true.dat Remove single golden result in favor of per-subdir results.
tests/regression_tests/weightwindows/local/results_true.dat Add golden results for local mode.
tests/regression_tests/weightwindows/local/inputs_true.dat Add golden inputs for local mode.
tests/regression_tests/weightwindows_pulse_height/test.py New regression test combining WW + pulse-height tally under local/shared subdirs.
tests/regression_tests/weightwindows_pulse_height/shared/results_true.dat Golden results for shared subdir.
tests/regression_tests/weightwindows_pulse_height/shared/inputs_true.dat Golden inputs for shared subdir.
tests/regression_tests/weightwindows_pulse_height/local/results_true.dat Golden results for local subdir.
tests/regression_tests/weightwindows_pulse_height/local/inputs_true.dat Golden inputs for local subdir.
tests/regression_tests/weightwindows_pulse_height/__init__.py Package marker for new regression test directory.
tests/regression_tests/pulse_height/test.py Parametrize existing pulse-height regression for local/shared subdirs.
tests/regression_tests/pulse_height/shared/results_true.dat Golden results for shared subdir.
tests/regression_tests/pulse_height/shared/inputs_true.dat Golden inputs for shared subdir.
tests/regression_tests/pulse_height/local/results_true.dat Golden results for local subdir.
tests/regression_tests/pulse_height/local/inputs_true.dat Golden inputs for local subdir.
tests/regression_tests/particle_restart_fixed_shared_secondary/test.py New particle-restart regression targeting fixed-source + shared secondary bank.
tests/regression_tests/particle_restart_fixed_shared_secondary/settings.xml Settings enabling fixed-source + shared secondary bank.
tests/regression_tests/particle_restart_fixed_shared_secondary/results_true.dat Golden restart-output text for the new restart regression.
tests/regression_tests/particle_restart_fixed_shared_secondary/materials.xml Materials for new restart regression.
tests/regression_tests/particle_restart_fixed_shared_secondary/geometry.xml Geometry for new restart regression.
tests/regression_tests/particle_restart_fixed_shared_secondary/__init__.py Package marker for new restart regression directory.
tests/regression_tests/particle_production_fission/test.py New regression verifying ParticleProductionFilter behavior under local/shared modes.
tests/regression_tests/particle_production_fission/shared/results_true.dat Golden results for shared mode.
tests/regression_tests/particle_production_fission/shared/inputs_true.dat Golden inputs for shared mode.
tests/regression_tests/particle_production_fission/local/results_true.dat Golden results for local mode.
tests/regression_tests/particle_production_fission/local/inputs_true.dat Golden inputs for local mode.
tests/regression_tests/particle_production_fission/__init__.py Package marker for new particle-production regression directory.
tests/regression_tests/mg_fixed_source_ww_fission_shared_secondary/test.py New MG fixed-source WW regression with fission neutrons + shared secondary bank.
tests/regression_tests/mg_fixed_source_ww_fission_shared_secondary/results_true.dat Golden results for new MG shared-secondary regression.
tests/regression_tests/mg_fixed_source_ww_fission_shared_secondary/inputs_true.dat Golden inputs for new MG shared-secondary regression.
tests/regression_tests/mg_fixed_source_ww_fission_shared_secondary/__init__.py Package marker for new MG regression directory.
src/tallies/tally_scoring.cpp Adjust IFP source-bank indexing to align with new 0-based current_work().
src/tallies/filter_particle_production.cpp ParticleProductionFilter now reads from the local-secondary bank accessor.
src/simulation.cpp Add shared-secondary transport algorithms; refactor seeding/work partitioning; 0-based current_work() semantics.
src/settings.cpp Parse <shared_secondary_bank> and auto-enable it for fixed-source WW runs when not explicitly set.
src/physics.cpp Route secondaries into local-secondary bank and carry extra metadata needed for shared-secondary transport.
src/physics_mg.cpp Same as CE physics: local-secondary bank + metadata for shared-secondary transport.
src/particle.cpp Refactor secondary creation/revival for shared-secondary mode; track-counting changes.
src/particle_restart.cpp Update RNG seeding logic and shared-secondary bookkeeping for restart runs.
src/output.cpp Print “Track Rate” when weight windows are enabled.
src/initialize.cpp Add compatibility guard: disable shared-secondary bank when pulse-height tallies are present.
src/ifp.cpp Adjust IFP source-bank indexing to align with new 0-based current_work().
src/finalize.cpp Reset shared-secondary-related settings/state on finalize/reset.
src/event.cpp Factor event-kernel loop into helper and add init path for shared-secondary event transport.
src/bank.cpp Add shared secondary banks + sorting + MPI redistribution helper for secondary generations.
openmc/settings.py Add Python-side Settings.shared_secondary_bank with XML read/write support.
openmc/lib/core.py Extend C-API _SourceSite struct mapping to include added SourceSite fields.
include/openmc/simulation.h Update simulation API for new work partitioning, seeding helpers, and shared-secondary transport entry points.
include/openmc/shared_array.h Change SharedArray constructor semantics + add thread_unsafe_append.
include/openmc/settings.h Add use_shared_secondary_bank setting.
include/openmc/particle.h Update particle API for revised secondary revival flow.
include/openmc/particle_data.h Extend SourceSite; rename secondary bank storage to local-secondary bank; add track counter.
include/openmc/event.h Declare shared-secondary event init + common transport loop helper.
include/openmc/bank.h Declare shared-secondary banks, generalized sort function, and MPI redistribution helper.
docs/source/io_formats/settings.rst Document <shared_secondary_bank> settings XML element.

- Fix missing final track state write in event_death() when the
  secondary bank is empty (lost during event_revive refactor)
- Parametrize test_weightwindows with shared_secondary [False, True]
- Pin test_photon_heating to local mode to work around Compton
  relaxation negative heating bug (fix in separate PR)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@jtramm jtramm requested a review from pshriwise as a code owner March 12, 2026 15:50
@jtramm jtramm removed the request for review from pshriwise March 12, 2026 16:38