Description
The test/testinput/3dvar_O30kmIE60km_ColdStart.yaml scenario hangs during the Variational1 step and is eventually killed by the PBS scheduler.
The hang is triggered by an additional variable, refl10cm, being added to the definition of the da_state stream in MPAS-Model. The variational application hangs when trying to update the analysis file, which is generated by the mpas_atmosphere app in the ColdForecast step.
Note that this does not happen when running the 3dvar_OIE120km_ColdStart.yaml test scenario. It appears to be related to using a hybrid 30km/60km resolution.
The underlying cause of this is unknown.
The hang goes away when MPAS-Model is built with the refl10cm variable removed from the da_state stream definition. It also goes away when the Time variable is added to the da_state stream definition, and when the variable rho_zz(Time, nCells, nVertLevels) is added to the da_state stream.
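Since all three observations concern which variables the da_state stream lists, a small helper for inspecting a stream definition can be handy when comparing builds. The following is a minimal sketch, not MPAS tooling: the XML fragment is a simplified, hypothetical stand-in modeled on a `<stream>`/`<var>` layout, and its attributes and the `theta` entry are assumptions used only for illustration. The check looks at the variable list only; it knows nothing about the mesh resolution, which also appears to matter.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stream definition used only for illustration.
# It is NOT copied from MPAS-Model; attributes and "theta" are assumptions.
SAMPLE_STREAMS = """
<streams>
  <stream name="da_state" type="input;output">
    <var name="theta"/>
    <var name="refl10cm"/>
  </stream>
</streams>
"""


def stream_vars(xml_text, stream_name):
    """Return the variable names listed in the named stream, in file order."""
    root = ET.fromstring(xml_text)
    for stream in root.iter("stream"):
        if stream.get("name") == stream_name:
            return [var.get("name") for var in stream.findall("var")]
    raise KeyError(stream_name)


def matches_hang_report(var_names):
    """True if refl10cm is present and neither Time nor rho_zz is.

    This mirrors the variable combinations described in this issue; it is
    a bookkeeping check, not an explanation of the underlying cause.
    """
    names = set(var_names)
    return "refl10cm" in names and not ({"Time", "rho_zz"} & names)


da_state = stream_vars(SAMPLE_STREAMS, "da_state")
print(da_state, matches_hang_report(da_state))
```

Running the same check against the stream definition from a build that includes Time or rho_zz should report no match.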
To Reproduce
Steps to reproduce the behavior:
- Run the test/testinput/3dvar_O30kmIE60km_ColdStart.yaml scenario using the develop branch of MPAS-Workflow (which uses the binaries created with the develop branch of mpas-bundle). The Variational1 step times out.
Expected behavior
The Variational1 step should complete without hanging, and the test should pass.
Additional context
The hang occurs when using either the SMIOL I/O layer or the PIO I/O layer in MPAS-Model.
Changing the order of the variables in the da_state stream had no effect.
The MPAS-Model SMIOL code was instrumented with additional logging. These logs showed that all of the MPI ranks hung when making calls to ncmpi_enddef.
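For anyone repeating this kind of instrumentation, a short script can summarize which call each rank last entered without returning. The sketch below assumes a hypothetical log format of `rank=<N> enter <call>` and `rank=<N> exit <call>`; this is not the format the instrumented SMIOL build actually produced, and the sample lines are made up for illustration.

```python
import re

# Hypothetical log line format: "rank=<N> enter <call>" / "rank=<N> exit <call>".
# The real instrumented SMIOL logs may look entirely different.
LINE_RE = re.compile(r"rank=(\d+) (enter|exit) (\w+)")


def stuck_calls(lines):
    """Map each rank to the call it entered most recently but never exited."""
    open_calls = {}
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # ignore lines that are not enter/exit records
        rank = int(match.group(1))
        event, call = match.group(2), match.group(3)
        if event == "enter":
            open_calls[rank] = call
        elif open_calls.get(rank) == call:
            del open_calls[rank]  # the matching exit was logged
    return open_calls


# Made-up sample: both ranks enter ncmpi_enddef and never return.
sample = [
    "rank=0 enter ncmpi_redef",
    "rank=0 exit ncmpi_redef",
    "rank=0 enter ncmpi_enddef",
    "rank=1 enter ncmpi_enddef",
]
print(stuck_calls(sample))
```

If every rank maps to the same call, as reported here for ncmpi_enddef, the hang is collective rather than confined to a single rank.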
See the feature/da_state_add_Time branch of https://github.com/jim-p-w/MPAS-Model for an example of a workaround.