Restarting a simulation

f-schmitt-zih edited this page Apr 4, 2014 · 4 revisions


PIConGPU supports restarting the simulation run from checkpoints stored on disk.

Checkpointing

Checkpoints are special dumps of the simulation data that contain all information required for a restart. This may include internal data that is not required/intended for post-processing or analysis. To enable checkpoints, add the

--checkpoints <frequency>

flag to the PIConGPU command line, where <frequency> is the number of simulation steps between consecutive checkpoints.

Each plugin receives a dedicated checkpoint notification. This is in addition to its standard notification, which is triggered every <plugin>.period steps. Note that some plugins define additional checkpoint parameters, which must be set to enable checkpointing for that plugin.

Since restarts require most field and particle data, the HDF5Writer plugin must be enabled. Optionally, the flag --hdf5.checkpoint-file <filename prefix> may be set to specify a separate filename prefix for checkpoint files. Whenever a standard output notification and a checkpoint notification are triggered for HDF5Writer at the same timestep, only the checkpoint is written. For information on other plugins, see their documentation.
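The rule that a checkpoint takes precedence over a standard output at the same timestep can be illustrated with a small Python sketch. It is not part of PIConGPU. The function name classify_steps is hypothetical, and the sketch assumes that notifications fire at every positive multiple of the respective period, up to and including the final step.

```python
def classify_steps(total_steps, output_period, checkpoint_period):
    """Map each step at which HDF5Writer writes to the kind of file written.

    Assumption (not confirmed by the documentation): notifications fire at
    positive multiples of the period, including the final step.
    """
    written = {}
    for step in range(1, total_steps + 1):
        if step % checkpoint_period == 0:
            # Checkpoint wins when both notifications coincide.
            written[step] = "checkpoint"
        elif step % output_period == 0:
            written[step] = "output"
    return written


# Values taken from the example on this page:
# 1024 steps, --hdf5.period 128, --checkpoints 512
schedule = classify_steps(1024, 128, 512)
for step, kind in schedule.items():
    print(step, kind)
```

Under these assumptions, steps 512 and 1024 yield checkpoints only, while the remaining multiples of 128 yield standard output.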

Restarts

Restarting PIConGPU requires that checkpoints were created as described in the section above and that the HDF5Writer plugin is enabled. To restart, set the following flags:

--restart --restart-step <checkpoint step>

Additional plugin-specific flags are necessary to enable restarts. For HDF5Writer, the flag --hdf5.restart-file <filename prefix> is required to set the checkpoint fileset from which you want to restart. For information on other plugins, see their documentation.

Example

This example shows how to set flags to create checkpoints and restarts using the HDF5Writer plugin.

Checkpointing: Run a simulation with 8 GPUs for 1024 steps, dumping results every 128 steps and checkpointing every 512 steps. Dumps and checkpoints use the same filename prefix ("simData").

-d 2 2 2 -g 256 512 256 -s 1024 --hdf5.period 128 --hdf5.file simData --checkpoints 512

Restart: Restart with the same GPU and grid configuration from the last checkpoint (1024) using the "simData" fileset and simulate another 1024 steps, up until timestep 2048.

-d 2 2 2 -g 256 512 256 -s 2048 --restart --restart-step 1024 --hdf5.restart-file simData
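As a rough aid for scripting restarts, the following Python sketch derives the last checkpoint step and assembles the restart flags from the example above. The helper names last_checkpoint_step and restart_flags are hypothetical and not part of PIConGPU. The sketch assumes checkpoints are written at multiples of the checkpoint period and that the previous run completed all of its steps. It uses only the flags that appear on this page.

```python
def last_checkpoint_step(steps_run, checkpoint_period):
    """Largest multiple of checkpoint_period that is <= steps_run.

    Returns 0 if the run was shorter than one checkpoint period,
    i.e. no checkpoint exists under the stated assumption.
    """
    return (steps_run // checkpoint_period) * checkpoint_period


def restart_flags(devices, grid, end_step, restart_step, prefix):
    """Join the restart flags shown above into one command-line string."""
    parts = [
        "-d", *map(str, devices),
        "-g", *map(str, grid),
        "-s", str(end_step),
        "--restart",
        "--restart-step", str(restart_step),
        "--hdf5.restart-file", prefix,
    ]
    return " ".join(parts)


# Values from the example: first run of 1024 steps with --checkpoints 512
step = last_checkpoint_step(steps_run=1024, checkpoint_period=512)
flags = restart_flags((2, 2, 2), (256, 512, 256),
                      end_step=2048, restart_step=step, prefix="simData")
print(flags)
```

The resulting string is meant to be appended to the usual PIConGPU invocation. Before restarting, check that the checkpoint files for the computed step actually exist.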