[WIP] Script: Filter & Copy Particles #1390
base: dev
Conversation
Force-pushed from 81e7482 to 8af4bbe.
Notebook that filters & copies particles and their attributes in a chunk-wise (fixed slice) manner.
Force-pushed from 8af4bbe to 930dc01.
... and do only one time step.
Force-pushed from 36d52ee to c68fd99.
for more information, see https://pre-commit.ci
The notebook does reimplement a lot of the functionality that openPMD-pipe already has, yes.
I realized that the attribute copying changes a few attributes, some of which then have unexpected types and need to be fixed/preserved in type.

Added a workaround to preserve particle attribute dtypes more carefully.
dataset.options = json.dumps(dataset_config_resizable)
out_p_rc.reset_dataset(dataset)
out_p_rc[out_slice] = data[accepted]
output_series.flush()
We could avoid OOM for the output written in the while loop by:
- using BP5, and
- adding flush_target="disk"
Looking into the documentation on this, I noticed a documentation bug:
https://openpmd-api.readthedocs.io/en/0.15.1/backends/adios2.html mentions adios2.preferred_flush_target = "disk",
while https://openpmd-api.readthedocs.io/en/0.15.1/details/backendconfig.html talks of adios2.engine.preferred_flush_target = "disk".
The correct one is the variant with engine, so: series.flush("""adios2.engine.preferred_flush_target = "disk" """)
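For illustration, a minimal write-side sketch under these settings; the output file name, species, and extents are placeholders, and the flush parameter is the string quoted above:

```python
import numpy as np
import openpmd_api as io

# Request the BP5 engine via the series-level JSON configuration.
series = io.Series(
    "filtered_%T.bp",                              # placeholder output name
    io.Access.create,
    '{"adios2": {"engine": {"type": "bp5"}}}',
)

it = series.iterations[0]
electrons = it.particles["electrons"]              # placeholder species

n = 1_000_000
ds = io.Dataset(np.dtype("float64"), [n])
for record, component in [("position", "x"), ("positionOffset", "x")]:
    rc = electrons[record][component]
    rc.reset_dataset(ds)
    rc[0:n] = np.zeros(n)                          # placeholder data
    # Push the buffered chunk to disk right away instead of keeping it in RAM
    # until the step/iteration closes (BP5 only):
    series.flush('adios2.engine.preferred_flush_target = "disk"')

series.close()
```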
}
}
}
dataset_config['adios2']['dataset'] = {
This should work now with ADIOS 2.9.0+
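For context, a hedged sketch of how such a per-dataset configuration could be assembled and applied; the exact contents of dataset_config_resizable in the notebook may differ, and the compression operator shown is purely illustrative:

```python
# Sketch of passing a per-dataset JSON configuration, as in the diff above.
# The "resizable" flag and the blosc operator are illustrative assumptions.
import json
import numpy as np
import openpmd_api as io

dataset_config_resizable = {
    "resizable": True,  # assumed: allow the output dataset to grow per slice
    "adios2": {
        "dataset": {
            "operators": [{"type": "blosc"}]  # optional ADIOS2 compression operator
        }
    },
}

n_out = 1_000_000  # placeholder extent of the filtered output
dataset = io.Dataset(np.dtype("float64"), [n_out])
dataset.options = json.dumps(dataset_config_resizable)
# out_p_rc would be the output particle record component from the loop above:
# out_p_rc.reset_dataset(dataset)
```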
Notebook that filters & copies particles and their attributes in a chunk-wise (fixed slice) manner. This is a prototype of the functionality.
The filter can be a criterion on any point-wise combination of records of the current slice/chunk.
This is a common map-reduce data operation, and we need to provide an API/tool for this functionality to filter very large data sets for interesting particles. Example: a particle beam from a laser-plasma source, filtered by momentum and position. A rough sketch of the intended loop follows below.
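A rough sketch of that loop (input file name, iteration, species and record names, slice size, and the cut values are all assumptions, not the notebook's actual choices):

```python
# Rough sketch of the chunk-wise (fixed slice) filter; reading side only.
import numpy as np
import openpmd_api as io

series = io.Series("simData_%T.bp", io.Access.read_only)  # assumed input name
it = series.iterations[100]                               # assumed iteration
electrons = it.particles["electrons"]                     # assumed species name

n_total = electrons["momentum"]["x"].shape[0]
slice_size = 1_000_000  # particles per chunk (assumed)

n_accepted = 0
for start in range(0, n_total, slice_size):
    stop = min(start + slice_size, n_total)

    # Load only the records needed for the filter, restricted to this slice.
    uz = electrons["momentum"]["z"][start:stop]
    x = electrons["position"]["x"][start:stop]
    series.flush()

    # Point-wise criterion on any combination of records (assumed cut values).
    accepted = (uz > 1.0e-19) & (np.abs(x) < 5.0e-6)
    n_accepted += int(accepted.sum())
    # ... copy data[accepted] for all records of the species to the output ...

print("accepted particles:", n_accepted)
```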
We could maybe implement this in openpmd-pipe. Alternatively, we could parallelize this as a tool and simplify it. Our pandas converter could simplify filters and chunking for the actual data; we could implement a lazily evaluated from_df for writing. Or we could extend our Dask reader and even think about parallelism there.
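To make the pandas idea concrete, here is a hypothetical sketch: to_df already exists in the Python bindings, while from_df is the proposed, not-yet-existing counterpart; file names, species name, and column names are assumptions.

```python
# Hypothetical sketch of the pandas-based variant; from_df does not exist yet.
import openpmd_api as io

series = io.Series("simData_%T.bp", io.Access.read_only)  # assumed input name
it = series.iterations[100]                               # assumed iteration

# Existing API: convert a particle species into a pandas DataFrame.
df = it.particles["electrons"].to_df()

# Point-wise filter expressed directly on DataFrame columns (assumed names/cuts).
beam = df[(df["momentum_z"] > 1.0e-19) & (df["position_x"].abs() < 5.0e-6)]

# Proposed API (does not exist yet): lazily write the filtered DataFrame back
# as a new species into an output series.
# out = io.Series("filtered_%T.bp", io.Access.create)
# out.iterations[100].particles["electrons"].from_df(beam)
```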