Restarting a simulation
PIConGPU supports restarting the simulation run from checkpoints stored on disk.
Checkpoints are special dumps of the simulation data that contain all information required for a restart. This may include internal data that is not required/intended for post-processing or analysis. To enable checkpoints, add the
--checkpoints <period>
flag to the PIConGPU command line, where <period> is the number of time steps between two checkpoints.
Plugins will receive a special notification for a checkpoint every --checkpoints
steps in addition to the standard notification, triggered every <plugin>.period
steps. Note that some plugins might specify additional parameters for checkpoints which must be set to enable checkpointing for this plugin.
Since restarts require most field and particle data, the HDF5Writer plugin must be enabled. Whenever a standard output notification and a checkpoint notification are triggered for HDF5Writer for the same timestep, both the checkpoint and the standard output are written. For information on other plugins, see their documentation.
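To illustrate how the two notification periods interleave, here is a minimal shell sketch. The helper `notifications` is hypothetical and not part of PIConGPU, and the values (1024 steps, plugin period 128, checkpoint period 512) are only illustrative. It assumes a notification fires on every positive multiple of its period; whether step 0 is also notified is not modeled.

```shell
#!/usr/bin/env bash
# Sketch: list which notifications fall on which time step.
# ASSUMPTION: a notification fires on every positive multiple of its period;
# the handling of step 0 is not modeled here.

# Hypothetical helper: print "<step> <events>" for each step that triggers something.
notifications() {
    local steps=$1 plugin_period=$2 checkpoint_period=$3
    local step events
    for ((step = 1; step <= steps; step++)); do
        events=""
        if ((step % plugin_period == 0)); then
            events="output"
        fi
        if ((step % checkpoint_period == 0)); then
            # Append with a "+" only if a standard output is already due.
            events="${events:+$events+}checkpoint"
        fi
        if [[ -n $events ]]; then
            echo "$step $events"
        fi
    done
}

notifications 1024 128 512
```

For these values, steps 512 and 1024 are printed as `output+checkpoint`, which corresponds to the case where HDF5Writer writes both the standard output and the checkpoint.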
Restarting PIConGPU requires checkpoints that were created as described above and an enabled HDF5Writer plugin. To restart, set the following flags:
--restart --restart-step <checkpoint step>
Additional plugin-specific flags are necessary to enable restarts. For HDF5Writer, the flag --hdf5.restart-file <filename prefix>
is required to set the checkpoint fileset from which you want to restart. For information on other plugins, see their documentation.
This example shows how to set the flags for creating checkpoints and for restarting from them using the HDF5Writer plugin.
Checkpointing: Run a simulation with 8 GPUs for 1024 steps, dumping results every 128 steps and checkpointing every 512 steps. Dumps and checkpoints use the same filename prefix ("simData").
-d 2 2 2 -g 256 512 256 -s 1024 --hdf5.period 128 --hdf5.file simData --checkpoints 512
Restart: Restart with the same GPU and grid configuration from the last checkpoint (1024) using the "simData" fileset and simulate another 1024 steps, up until timestep 2048.
-d 2 2 2 -g 256 512 256 -s 2048 --restart --restart-step 1024 --hdf5.restart-file simData
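The restart flags above can also be derived from the values of the checkpointing run. The following shell sketch does this; the helper `last_checkpoint` and all variable names are hypothetical, and it assumes that a checkpoint exists at every positive multiple of the checkpoint period up to and including the last simulated step.

```shell
#!/usr/bin/env bash
# Sketch: derive the restart flags from the values of the checkpointing run.
# ASSUMPTION: checkpoints exist at every positive multiple of the period,
# up to and including the last simulated step.

# Hypothetical helper: last checkpoint step for a run of <steps> steps.
last_checkpoint() {
    local steps=$1 period=$2
    echo $(( (steps / period) * period ))
}

first_run_steps=1024     # value of -s in the checkpointing run
checkpoint_period=512    # value of --checkpoints
extra_steps=1024         # additional steps to simulate after the restart
prefix="simData"         # fileset prefix used by HDF5Writer

restart_step=$(last_checkpoint "$first_run_steps" "$checkpoint_period")
end_step=$(( restart_step + extra_steps ))

# -s is the final time step of the restarted run, not the number of extra steps.
restart_args="-d 2 2 2 -g 256 512 256 -s $end_step --restart --restart-step $restart_step --hdf5.restart-file $prefix"
echo "$restart_args"
```

With the values shown, the sketch prints the same argument list as the restart example. A result of 0 from `last_checkpoint` would mean that no checkpoint was written under the stated assumption.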
By default, a checkpoints directory is created below <run>/simOutput/. This can be changed using
--checkpoint-directory <absolute or relative directory>
which creates either the given absolute directory (e.g. /my/absolute/directory) or <run>/simOutput/<relative directory>. Note that creating deep (nested) directory structures is currently not supported.
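The path rule described above can be sketched as a small shell function. This is only an illustration of the rule, not PIConGPU's actual implementation; the helper `checkpoint_dir` and the run directory `/runs/lwfa` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of the path rule described above; not PIConGPU's actual implementation.

# Hypothetical helper: where checkpoints end up for a given run directory
# and an optional --checkpoint-directory value (default: "checkpoints").
checkpoint_dir() {
    local run_dir=$1 dir=${2:-checkpoints}
    case $dir in
        /*) echo "$dir" ;;                      # absolute: used as given
        *)  echo "$run_dir/simOutput/$dir" ;;   # relative: below <run>/simOutput/
    esac
}

checkpoint_dir /runs/lwfa                          # /runs/lwfa/simOutput/checkpoints
checkpoint_dir /runs/lwfa /my/absolute/directory   # /my/absolute/directory
checkpoint_dir /runs/lwfa restartData              # /runs/lwfa/simOutput/restartData
```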
Optionally, the flag --hdf5.checkpoint-file <absolute or relative filename prefix>
may be set to specify a dedicated filename prefix for checkpoint files. If --hdf5.checkpoint-file
is set with an absolute path, HDF5Writer ignores the application-wide --checkpoint-directory
setting.
For restarts, the defaults can be changed using
--restart-directory <absolute or relative directory>
and --hdf5.restart-file
which follow the same rules as the corresponding checkpoint flags.
All wiki entries describe the dev branch. Features may be different in the current master branch.