Description
While testing the RRFS ensemble system with more nodes, we noticed that run time improves only slightly as resources are added and levels off at higher node counts. As background, we've been running our experiments with 52 nodes, but plan to run in production with about 92 nodes. Out of curiosity, we increased the node count from 92 to 110 nodes (a 19.6% increase), but the wall clock time only decreased by about 3.5%. Our deterministic runs, which do not use perturbations, scale quite a bit better. We weren't sure whether this is known behavior. We are running on a 3950 x 2700 grid with 3 km grid spacing and, due to stability needs, have to regenerate the stochastic patterns quite frequently (lndpint=180, i.e. every 5 time steps; sppint=36, i.e. every time step).
| WCOSS2 nodes | Wall clock (s) | Change from earlier run |
|---|---|---|
| 52 | 11525 | |
| 92 | 8463 | vs. 52 nodes: ~77% more nodes, about 26.6% faster |
| 102 | 8251 | |
| 110 | 8163 | vs. 92 nodes: ~20% more nodes, about 3.5% faster |
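
For anyone who wants to reproduce the percentages, here is a minimal Python sketch that recomputes them from the wall clock times in the table and adds speedup and parallel efficiency relative to the 52-node run. The helper names (`pct_more_nodes`, `pct_faster`, `scaling_summary`) are made up for illustration and are not part of any RRFS or UFS tooling. Efficiency here just means measured speedup divided by the ideal (linear) speedup.

```python
# Wall clock times copied from the table above: node count -> seconds.
timings = {52: 11525, 92: 8463, 102: 8251, 110: 8163}


def pct_more_nodes(old_nodes, new_nodes):
    """Percent increase in node count going from old_nodes to new_nodes."""
    return 100.0 * (new_nodes - old_nodes) / old_nodes


def pct_faster(old_time, new_time):
    """Percent reduction in wall clock time going from old_time to new_time."""
    return 100.0 * (old_time - new_time) / old_time


def scaling_summary(timings, baseline_nodes):
    """Return (nodes, speedup, efficiency) tuples relative to the baseline run.

    speedup    = baseline time / time at this node count
    efficiency = speedup / (nodes / baseline_nodes), i.e. 1.0 is linear scaling
    """
    base_time = timings[baseline_nodes]
    rows = []
    for nodes in sorted(timings):
        speedup = base_time / timings[nodes]
        ideal = nodes / baseline_nodes
        rows.append((nodes, speedup, speedup / ideal))
    return rows


if __name__ == "__main__":
    print(f"52 -> 92:  {pct_more_nodes(52, 92):.1f}% more nodes, "
          f"{pct_faster(timings[52], timings[92]):.1f}% faster")
    print(f"92 -> 110: {pct_more_nodes(92, 110):.1f}% more nodes, "
          f"{pct_faster(timings[92], timings[110]):.1f}% faster")
    for nodes, speedup, eff in scaling_summary(timings, baseline_nodes=52):
        print(f"{nodes:>4} nodes: speedup {speedup:.2f}x, efficiency {eff:.0%}")
```

Taking 52 nodes as the baseline is an arbitrary choice; passing `baseline_nodes=92` instead gives the efficiency relative to the planned production configuration.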
Tagging @JiliDong-NOAA as well.