* Docs: Analysis

  Add a data analysis & visualization section.
  This is meant to show entry points and workflows to work with openPMD data in larger frameworks and compatible ecosystems.
* [Draft] DASK, Pandas, ...
* Doc: DASK
* Pandas
* RAPIDS
* Typos
Showing 7 changed files with 422 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
.. _analysis-contrib: | ||
|
||
Contributed | ||
=========== | ||
|
||
This page contains contributed projects and third party integrations to analyze openPMD data. | ||
See the `openPMD-projects <https://github.com/openPMD/openPMD-projects#data-processing-and-visualization>`__ catalog for more community integrations. | ||
|
||
|
||
.. _analysis-contrib-visualpic: | ||
|
||
3D Visualization: VisualPIC | ||
--------------------------- | ||
|
||
openPMD data can be visualized with the domain-specific VisualPIC renderer. | ||
Please see `the WarpX page for details <https://warpx.readthedocs.io/en/latest/dataanalysis/visualpic.html>`__. | ||
|
||
|
||
.. _analysis-contrib-visit: | ||
|
||
3D Visualization: VisIt | ||
----------------------- | ||
|
||
openPMD **HDF5** data can be visualized with VisIt 3.1.0+. | ||
VisIt supports openPMD HDF5 files; rename them from ``.h5`` to ``.opmd`` so that VisIt detects them automatically. | ||
|
||
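The renaming can be scripted.
The following is a minimal sketch that uses only the Python standard library; the helper name ``rename_h5_to_opmd`` is hypothetical and is not part of openPMD-api or VisIt.

.. code-block:: python

   from pathlib import Path


   def rename_h5_to_opmd(directory):
       """Rename every ``*.h5`` file in ``directory`` to ``*.opmd``.

       Returns the new paths, sorted by name.
       """
       renamed = []
       # materialize the listing first, so that renaming does not
       # interfere with the directory iteration
       for h5_file in sorted(Path(directory).glob("*.h5")):
           opmd_file = h5_file.with_suffix(".opmd")
           h5_file.rename(opmd_file)
           renamed.append(opmd_file)
       return renamed

Calling ``rename_h5_to_opmd("samples/git-sample")`` would rename the files of the sample series in place.
Keep in mind that tools which look for the ``.h5`` suffix will then no longer find these files.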
|
||
.. _analysis-contrib-yt: | ||
|
||
yt-project | ||
---------- | ||
|
||
openPMD **HDF5** data can be visualized with `yt-project <https://yt-project.org>`__. | ||
Please see the `yt documentation <https://yt-project.org/doc/examining/loading_data.html?highlight=openpmd#openpmd-data>`__ for details. |
@@ -0,0 +1,50 @@ | ||
.. _analysis-dask: | ||
|
||
DASK | ||
==== | ||
|
||
The Python bindings of openPMD-api provide direct methods to load data into the parallel `DASK data analysis ecosystem <https://www.dask.org>`__. | ||
|
||
|
||
How to Install | ||
-------------- | ||
|
||
Among many package managers, `PyPI <https://pypi.org/project/dask/>`__ ships the latest packages of DASK: | ||
|
||
.. code-block:: bash | ||

   python3 -m pip install -U dask
   python3 -m pip install -U pyarrow

How to Use | ||
---------- | ||
|
||
The central Python API calls to convert to DASK datatypes are the ``ParticleSpecies.to_dask`` and ``Record_Component.to_dask_array`` methods. | ||
|
||
.. code-block:: python | ||

   import dask
   import openpmd_api as io

   s = io.Series("samples/git-sample/data%T.h5", io.Access.read_only)
   electrons = s.iterations[400].particles["electrons"]

   # The default schedulers are local/threaded. We can also use local
   # "processes" or, for multi-node runs, "distributed", among others.
   dask.config.set(scheduler='processes')

   df = electrons.to_dask()
   type(df)  # ...

   E = s.iterations[400].meshes["E"]
   E_x = E["x"]
   darr_x = E_x.to_dask_array()
   type(darr_x)  # ...

   # note: no series.flush() needed

Example | ||
------- | ||
|
||
A detailed example script for particle and field analysis is documented as ``11_particle_dataframe.py`` in our :ref:`examples <usage-examples>`. | ||
|
||
See a video of openPMD on DASK in action in `pull request #963 <https://github.com/openPMD/openPMD-api/pull/963#issuecomment-873350174>`__ (part of openPMD-api v0.14.0 and later). |
@@ -0,0 +1,101 @@ | ||
.. _analysis-pandas: | ||
|
||
Pandas | ||
====== | ||
|
||
The Python bindings of openPMD-api provide direct methods to load data into the `Pandas data analysis ecosystem <https://pandas.pydata.org>`__. | ||
|
||
Pandas computes on the CPU; for GPU-accelerated data analysis, see :ref:`RAPIDS <analysis-rapids>`. | ||
|
||
|
||
.. _analysis-pandas-install: | ||
|
||
How to Install | ||
-------------- | ||
|
||
Among many package managers, `PyPI <https://pypi.org/project/pandas/>`__ ships the latest packages of pandas: | ||
|
||
.. code-block:: bash | ||

   python3 -m pip install -U pandas

.. _analysis-pandas-df: | ||
|
||
Dataframes | ||
---------- | ||
|
||
The central Python API call to convert openPMD particles to a Pandas dataframe is the ``ParticleSpecies.to_df`` method. | ||
|
||
.. code-block:: python | ||

   import openpmd_api as io

   s = io.Series("samples/git-sample/data%T.h5", io.Access.read_only)
   electrons = s.iterations[400].particles["electrons"]

   df = electrons.to_df()
   type(df)  # pd.DataFrame
   print(df)

   # note: no series.flush() needed

One can also combine all iterations in a single dataframe like this: | ||
|
||
.. code-block:: python | ||

   import pandas as pd

   df = pd.concat(
       (
           s.iterations[i].particles["electrons"].to_df().assign(iteration=i)
           for i in s.iterations
       ),
       axis=0,
       ignore_index=True,
   )

   # like before but with a new column "iteration" and all particles
   print(df)

.. _analysis-pandas-ascii: | ||
|
||
openPMD to ASCII | ||
---------------- | ||
|
||
Once converted to a Pandas dataframe, export of openPMD data to text is very simple. | ||
We generally do not recommend this because ASCII processing is slower, uses significantly more space on disk and has less precision than the binary data usually stored in openPMD data series. | ||
Nonetheless, in some cases and especially for small, human-readable data sets this can be helpful. | ||
|
||
The central Pandas call for this is `DataFrame.to_csv <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html>`__. | ||
|
||
.. code-block:: python | ||

   # creates an electrons.csv file
   df.to_csv("electrons.csv", sep=",", header=True)

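
To illustrate the precision caveat, the sketch below writes a synthetic dataframe to CSV text and reads it back.
The values are made up and the column name ``momentum_z`` is only an assumption for illustration.
``float_format`` is a regular ``DataFrame.to_csv`` parameter: a short format drops digits, while ``"%.17g"`` writes enough digits to restore 64-bit floats when the file is parsed with ``float_precision="round_trip"``.

.. code-block:: python

   from io import StringIO

   import pandas as pd

   # synthetic stand-in for a particle dataframe (made-up values)
   df = pd.DataFrame({"momentum_z": [1.0 / 3.0, 2.0e11 / 7.0]})


   def roundtrip(frame, float_format):
       """Write ``frame`` to CSV text and parse that text back."""
       text = frame.to_csv(index=False, float_format=float_format)
       return pd.read_csv(StringIO(text), float_precision="round_trip")


   lossy = roundtrip(df, "%.3e")   # keeps four significant digits
   exact = roundtrip(df, "%.17g")  # keeps 17 significant digits

   # compare the values that came back with the original ones
   print((lossy["momentum_z"] == df["momentum_z"]).all())
   print((exact["momentum_z"] == df["momentum_z"]).all())

The longer format restores the values at the cost of even larger text files, which is one more reason to prefer the binary openPMD data where possible.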
.. _analysis-pandas-sql: | ||
|
||
openPMD as SQL Database | ||
----------------------- | ||
|
||
Once converted to a Pandas dataframe, one can also query and process openPMD data with `SQL syntax <https://en.wikipedia.org/wiki/SQL>`__ as provided by many databases. | ||
|
||
A project that provides such syntax is for instance `pandasql <https://github.com/yhat/pandasql/>`__. | ||
|
||
.. code-block:: bash | ||

   python3 -m pip install -U pandasql

Alternatively, one can `export into an SQL database <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html>`__. | ||
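
As a sketch of the export route, the snippet below loads a dataframe into an in-memory SQLite database with ``DataFrame.to_sql`` and queries it with ``pandas.read_sql_query``.
The dataframe is synthetic and its column names (``momentum_z``, ``weighting``) are assumptions for illustration; in practice, ``df`` would come from ``ParticleSpecies.to_df``.

.. code-block:: python

   import sqlite3

   import pandas as pd

   # synthetic stand-in for a particle dataframe (made-up values)
   df = pd.DataFrame(
       {
           "momentum_z": [1.0e11, 4.0e11, 5.0e11],
           "weighting": [1.0, 2.0, 3.0],
       }
   )

   # SQLite ships with Python, so no database server is needed
   con = sqlite3.connect(":memory:")
   df.to_sql("electrons", con, index=False)

   selected = pd.read_sql_query(
       "SELECT momentum_z, weighting FROM electrons WHERE momentum_z > 3e11",
       con,
   )
   con.close()

   print(selected)

An in-memory database keeps the sketch self-contained; passing a file name to ``sqlite3.connect`` would persist the table on disk instead.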
|
||
|
||
.. _analysis-pandas-example: | ||
|
||
Example | ||
------- | ||
|
||
A detailed example script for particle and field analysis is documented as ``11_particle_dataframe.py`` in our :ref:`examples <usage-examples>`. |
@@ -0,0 +1,55 @@ | ||
.. _analysis-paraview: | ||
|
||
3D Visualization: ParaView | ||
========================== | ||
|
||
openPMD data can be visualized by ParaView, an open source visualization and analysis software. | ||
ParaView can be downloaded and installed from https://www.paraview.org. | ||
Use the latest version for best results. | ||
|
||
Tutorials | ||
--------- | ||
|
||
ParaView is a powerful, general parallel rendering program. | ||
If this is your first time using ParaView, consider starting with a tutorial. | ||
|
||
* https://www.paraview.org/Wiki/The_ParaView_Tutorial | ||
* https://www.youtube.com/results?search_query=paraview+introduction | ||
* https://www.youtube.com/results?search_query=paraview+tutorial | ||
|
||
|
||
openPMD | ||
------- | ||
|
||
openPMD files can be visualized with ParaView 5.9+; version 5.11+ is recommended. | ||
ParaView supports ADIOS1, ADIOS2 and HDF5 files because its openPMD reader is implemented against the Python bindings of openPMD-api. | ||
|
||
For openPMD output to be recognized, create one small text file with the ``.pmd`` extension per data series; this file can then be opened with ParaView: | ||
|
||
.. code-block:: console | ||

   $ cat paraview.pmd
   openpmd_%06T.bp

The file contains the same string as one would put in an openPMD ``Series("....")`` object. | ||
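
Such a helper file can also be generated from a script.
The following is a minimal sketch with the Python standard library only; the function name ``write_pmd_file`` is hypothetical and not part of ParaView or openPMD-api.

.. code-block:: python

   from pathlib import Path


   def write_pmd_file(pattern, pmd_path="paraview.pmd"):
       """Write a ``.pmd`` helper file that holds one Series pattern."""
       path = Path(pmd_path)
       # a single line: the same string one would pass to Series("...")
       path.write_text(pattern + "\n")
       return path

For the series shown above, one would call ``write_pmd_file("openpmd_%06T.bp")`` in the directory that holds the data.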
|
||
.. tip:: | ||
|
||
When you first open ParaView, adjust its global ``Settings`` (Linux: under menu item ``Edit``). | ||
``General`` -> ``Advanced`` -> Search for ``data`` -> ``Data Processing Options``. | ||
Check the box ``Auto Convert Properties``. | ||
|
||
This will simplify application of filters, e.g., contouring of components of vector fields, without first adding a calculator that extracts a single component or magnitude. | ||
|
||
.. warning:: | ||
|
||
In ParaView 5.11 and older, the ``axisLabels`` attribute is not yet read for fields. | ||
See, e.g., `WarpX issue 1803 <https://github.com/ECP-WarpX/WarpX/issues/1803>`__. | ||
Please apply rotation of, e.g., ``0 -90 0`` to mesh data where needed. | ||
|
||
.. warning:: | ||
|
||
`ParaView issue 21837 <https://gitlab.kitware.com/paraview/paraview/-/issues/21837>`__: | ||
To visualize particle traces with the ``Temporal Particles To Pathlines`` filter, first apply the ``Merge Blocks`` filter. | ||
|
||
If you have multiple species, you may have to extract the species you want with ``Extract Block`` before applying ``Merge Blocks``. |
@@ -0,0 +1,99 @@ | ||
.. _analysis-rapids: | ||
|
||
RAPIDS | ||
====== | ||
|
||
The Python bindings of openPMD-api enable easy loading into the GPU-accelerated `RAPIDS.ai datascience & AI/ML ecosystem <https://rapids.ai/>`__. | ||
|
||
|
||
.. _analysis-rapids-install: | ||
|
||
How to Install | ||
-------------- | ||
|
||
Follow the `official documentation <https://docs.rapids.ai/install>`__ to install RAPIDS. | ||
|
||
.. code-block:: bash | ||

   # preparation
   conda update -n base conda
   conda install -n base conda-libmamba-solver
   conda config --set solver libmamba

   # install
   conda create -n rapids -c rapidsai -c conda-forge -c nvidia rapids python cudatoolkit openpmd-api pandas
   conda activate rapids

.. _analysis-rapids-cudf: | ||
|
||
Dataframes | ||
---------- | ||
|
||
The central Python API call is the ``ParticleSpecies.to_df`` method: it returns a Pandas dataframe, which ``cudf.from_pandas`` then converts to a cuDF dataframe. | ||
|
||
.. code-block:: python | ||

   import openpmd_api as io
   import cudf

   s = io.Series("samples/git-sample/data%T.h5", io.Access.read_only)
   electrons = s.iterations[400].particles["electrons"]

   cdf = cudf.from_pandas(electrons.to_df())
   type(cdf)  # cudf.DataFrame
   print(cdf)

   # note: no series.flush() needed

One can also combine all iterations in a single dataframe like this: | ||
|
||
.. code-block:: python | ||

   cdf = cudf.concat(
       (
           cudf.from_pandas(s.iterations[i].particles["electrons"].to_df().assign(iteration=i))
           for i in s.iterations
       ),
       axis=0,
       ignore_index=True,
   )

   # like before but with a new column "iteration" and all particles
   print(cdf)

.. _analysis-rapids-sql: | ||
|
||
openPMD as SQL Database | ||
----------------------- | ||
|
||
Once converted to a dataframe, one can also query and process openPMD data with `SQL syntax <https://en.wikipedia.org/wiki/SQL>`__ as provided by many databases. | ||
|
||
A project that provides such syntax is for instance `BlazingSQL <https://github.com/BlazingDB/blazingsql>`__ (see the `BlazingSQL install documentation <https://github.com/BlazingDB/blazingsql#prerequisites>`__). | ||
|
||
.. code-block:: python | ||

   import openpmd_api as io
   from blazingsql import BlazingContext

   s = io.Series("samples/git-sample/data%T.h5", io.Access.read_only)
   electrons = s.iterations[400].particles["electrons"]

   bc = BlazingContext(enable_progress_bar=True)
   bc.create_table('electrons', electrons.to_df())

   # all properties for electrons with momentum_z > 3e11 kg*m/s
   bc.sql('SELECT * FROM electrons WHERE momentum_z > 3e11')

   # selected properties
   bc.sql('SELECT momentum_x, momentum_y, momentum_z, weighting FROM electrons WHERE momentum_z > 3e11')

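
The same selection can also be written without an SQL layer by using boolean indexing on the dataframe.
The sketch below uses Pandas with synthetic values and assumed column names; cuDF mirrors much of the Pandas API, so the same indexing is expected to work on a ``cudf.DataFrame``, although that is an assumption and not verified here.

.. code-block:: python

   import pandas as pd

   # synthetic stand-in for a particle dataframe (made-up values)
   df = pd.DataFrame(
       {
           "id": [1, 2, 3],
           "momentum_x": [0.0, 1.0e9, -2.0e9],
           "momentum_y": [5.0e8, 0.0, 3.0e9],
           "momentum_z": [1.0e11, 4.0e11, 5.0e11],
           "weighting": [1.0, 2.0, 3.0],
       }
   )

   # rows: equivalent of WHERE momentum_z > 3e11
   fast = df[df["momentum_z"] > 3e11]

   # columns: equivalent of SELECT momentum_x, momentum_y, ...
   selected = fast[["momentum_x", "momentum_y", "momentum_z", "weighting"]]

   print(selected)

The boolean mask plays the role of the ``WHERE`` clause, and the column list plays the role of the ``SELECT`` list.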
.. _analysis-rapids-example: | ||
|
||
Example | ||
------- | ||
|
||
A detailed example script for particle and field analysis is documented as ``11_particle_dataframe.py`` in our :ref:`examples <usage-examples>`. |