Directory: 10-points (link)
This directory contains the inputs and outputs for the GAMESS ECP KPP-2 science challenge problem, which consists of 10 points along a representative reaction path on a mesoporous silica nanoparticle (MSN) that includes an amine-containing catalytic group and is surrounded by 4,600 water molecules (13,800 atoms; 46,000 electrons). All calculations use the effective fragment molecular orbital (EFMO) method. Each of the 10 points along the reaction path used 5,430 Frontier nodes.
Identifier | Filename | Total Wall time (s) |
---|---|---|
msn-r (link) | msn-r_4600water_efmo_rimp2.rcut10.log | 1118.7 |
msn-r1ts (link) | msn-r1ts_4600water_efmo_rimp2.rcut10.log | 1655.1 |
msn-r2ts (link) | msn-r2ts_4600water_efmo_rimp2.rcut10.log | 1655.7 |
msn-r3ts (link) | msn-r3ts_4600water_efmo_rimp2.rcut10.log | 1686.3 |
msn-ts (link) | msn-ts_4600water_efmo_rimp2.rcut10.log | 1672.2 |
msn-ts1p (link) | msn-ts1p_4600water_efmo_rimp2.rcut10.log | 1687.5 |
msn-ts2p (link) | msn-ts2p_4600water_efmo_rimp2.rcut10.log | 1667.7 |
msn-ts3p (link) | msn-ts3p_4600water_efmo_rimp2.rcut10.log | 1664.4 |
msn-ts4p (link) | msn-ts4p_4600water_efmo_rimp2.rcut10.log | 1692.3 |
msn-p (link) | msn-p_4600water_efmo_rimp2.rcut10.log | 1704.5 |
Directory: msn5-hydrated (link)
This directory contains the inputs and outputs for a small fragmented system that is tractable within the queuing policy limits on Frontier. The system consists of 5 MSN fragments from the msn-r structure with the full hydration shell. The parameter Rcut controls a distance-dependent cutoff that determines how dimers are treated (QM vs. EFP). Increasing Rcut from 1 to 2 causes more dimers to be treated quantum mechanically, which increases the amount of computational work available.
Identifier | Filename | Total Wall time (s) | Speed-up (x) |
---|---|---|---|
msn5-hyd-rcut1-cpu (link) | msn_05frag_4600water_efmo_rimp2.rcut1_0128N-0128-cpu.log | 8002.7 | |
msn5-hyd-rcut1-gpu (link) | msn_05frag_4600water_efmo_rimp2.rcut1_0128N-0128-gpu.log | 1760.7 | 4.6 |
Identifier | Filename | Total Wall time (s) | Speed-up (x) |
---|---|---|---|
msn5-hyd-rcut2-cpu (link) | msn_05frag_4600water_efmo_rimp2.rcut2_0128N-0128-cpu.log | 10442.7 | |
msn5-hyd-rcut2-gpu (link) | msn_05frag_4600water_efmo_rimp2.rcut2_0128N-0128-gpu.log | 2132.6 | 4.9 |
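The Speed-up column can be reproduced from the wall times, assuming it is simply the ratio of CPU-only to GPU-offloaded wall time (the README does not state the formula explicitly). The following is a minimal sketch under that assumption; the function name `speedup` is illustrative only.

```python
def speedup(cpu_wall_s: float, gpu_wall_s: float) -> float:
    """Ratio of CPU-only wall time to GPU-offloaded wall time."""
    if gpu_wall_s <= 0:
        raise ValueError("GPU wall time must be positive")
    return cpu_wall_s / gpu_wall_s


# Wall times in seconds copied from the two tables above, keyed by Rcut value.
runs = {
    1: (8002.7, 1760.7),
    2: (10442.7, 2132.6),
}

for rcut, (cpu_s, gpu_s) in runs.items():
    print(f"Rcut={rcut}: speed-up = {speedup(cpu_s, gpu_s):.2f}x")
```

Under this assumption the Rcut = 2 ratio is about 4.9, matching the table. The Rcut = 1 ratio comes to about 4.55, so the tabulated 4.6 presumably reflects rounding or a slightly different timing basis.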
Directory: traces (link)
This directory contains the traces generated by the AMD ROC profiler rocprof for a 128-node run of MSN5 hydrated at an Rcut value of 1 on Crusher using the GAMESS offloaded code. Trace data is provided for ranks 0-7, which correspond to the GAMESS compute processes on the first node.
The --stats option triggers rocprof to generate kernel execution statistics such as the kernel name, the number of times a kernel is called (Calls), the total time spent in the kernel in nanoseconds (TotalDurationNs), the average time spent in the kernel in nanoseconds (AverageNs), and the percentage of total GPU kernel time spent in the kernel (Percentage).
Identifier | Rank | Filename |
---|---|---|
rank-0-kernel-trace (link) | 0 | trace-rank-0.stats.csv |
rank-1-kernel-trace (link) | 1 | trace-rank-1.stats.csv |
rank-2-kernel-trace (link) | 2 | trace-rank-2.stats.csv |
rank-3-kernel-trace (link) | 3 | trace-rank-3.stats.csv |
rank-4-kernel-trace (link) | 4 | trace-rank-4.stats.csv |
rank-5-kernel-trace (link) | 5 | trace-rank-5.stats.csv |
rank-6-kernel-trace (link) | 6 | trace-rank-6.stats.csv |
rank-7-kernel-trace (link) | 7 | trace-rank-7.stats.csv |
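As an illustration of how these files might be inspected, the sketch below ranks kernels by total time. It assumes the CSV header uses the column names listed above, with a `Name` column holding the kernel name. The sample data, including the kernel names `kernel_a` through `kernel_c`, is made up so that the block runs on its own.

```python
import csv
import io


def top_kernels(stats_file, n=3):
    """Return the n kernels with the largest TotalDurationNs as (name, seconds)."""
    rows = list(csv.DictReader(stats_file))
    rows.sort(key=lambda r: int(r["TotalDurationNs"]), reverse=True)
    return [(r["Name"], int(r["TotalDurationNs"]) / 1e9) for r in rows[:n]]


# Hypothetical sample in the assumed column layout; names and numbers are invented.
sample = io.StringIO(
    '"Name","Calls","TotalDurationNs","AverageNs","Percentage"\n'
    '"kernel_b",4,2000000000,500000000,25.0\n'
    '"kernel_a",10,5000000000,500000000,62.5\n'
    '"kernel_c",2,1000000000,500000000,12.5\n'
)

for name, seconds in top_kernels(sample, n=2):
    print(f"{name}: {seconds:.1f} s")
```

To apply this to a real trace, pass an open file instead of the sample, for example `open("trace-rank-0.stats.csv", newline="")`.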
The --hip-trace option triggers rocprof to generate statistics for HIP API calls such as host-to-device (hipMemcpyHtoD) and device-to-host (hipMemcpyDtoH) memory copies.
Identifier | Rank | Filename |
---|---|---|
rank-0-hip-trace (link) | 0 | trace-rank-0.hip_stats.csv |
rank-1-hip-trace (link) | 1 | trace-rank-1.hip_stats.csv |
rank-2-hip-trace (link) | 2 | trace-rank-2.hip_stats.csv |
rank-3-hip-trace (link) | 3 | trace-rank-3.hip_stats.csv |
rank-4-hip-trace (link) | 4 | trace-rank-4.hip_stats.csv |
rank-5-hip-trace (link) | 5 | trace-rank-5.hip_stats.csv |
rank-6-hip-trace (link) | 6 | trace-rank-6.hip_stats.csv |
rank-7-hip-trace (link) | 7 | trace-rank-7.hip_stats.csv |
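A rough way to total the memory-copy time per direction from one of these files is sketched below. It assumes the HIP statistics CSV shares the `Name` and `TotalDurationNs` columns of the kernel statistics files, and that copy calls are named with the `hipMemcpyHtoD` and `hipMemcpyDtoH` prefixes mentioned above. The sample rows are invented for illustration.

```python
import csv
import io


def transfer_seconds(hip_stats_file):
    """Sum the time spent in HIP memcpy calls, split by copy direction."""
    totals = {"HtoD": 0.0, "DtoH": 0.0}
    for row in csv.DictReader(hip_stats_file):
        for direction in totals:
            # Prefix match so that variants such as async copies are included.
            if row["Name"].startswith("hipMemcpy" + direction):
                totals[direction] += int(row["TotalDurationNs"]) / 1e9
    return totals


# Hypothetical sample in the assumed column layout; numbers are invented.
sample = io.StringIO(
    '"Name","Calls","TotalDurationNs","AverageNs","Percentage"\n'
    '"hipMemcpyHtoD",100,3000000000,30000000,60.0\n'
    '"hipMemcpyDtoH",50,1500000000,30000000,30.0\n'
    '"hipLaunchKernel",200,500000000,2500000,10.0\n'
)

print(transfer_seconds(sample))
```

These are durations of API calls as recorded by the profiler, so they approximate the transfer cost rather than measuring it exactly.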
This table summarizes the timing statistics from the kernel and HIP trace data generated by rocprof. The total wall time reported for each rank includes 38.9 s of profiling overhead.
Rank | Total Wall (s) | Kernels + Data Transfers (s) | Kernels (s) | Data Transfers (s) | RHF Kernels (s) | CPHF Kernels (s) | TDHF Kernels (s) | RIMP2 Kernels (s) | HIPBLAS Kernels (s) | Host-to-Device (s) | Device-to-Host (s) |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1799.6 | 921.4 | 841.4 | 80.1 | 114.9 | 0.6 | 620.9 | 10.4 | 44.5 | 52.3 | 27.7 |
1 | 1799.6 | 977.3 | 890.6 | 86.7 | 111.5 | 3.7 | 662.2 | 13.1 | 50.1 | 58.8 | 27.9 |
2 | 1799.6 | 953.4 | 860.4 | 93.0 | 107.3 | 2.3 | 633.4 | 16.2 | 51.2 | 65.1 | 27.9 |
3 | 1799.6 | 819.7 | 721.6 | 98.2 | 93.2 | 4.2 | 506.0 | 18.8 | 59.4 | 70.1 | 28.1 |
4 | 1799.6 | 827.5 | 725.0 | 102.5 | 96.7 | 4.9 | 505.2 | 21.2 | 57.1 | 76.1 | 26.4 |
5 | 1799.6 | 892.5 | 782.5 | 110.0 | 100.8 | 6.5 | 550.5 | 22.7 | 61.9 | 82.6 | 27.4 |
6 | 1799.6 | 856.0 | 746.0 | 110.0 | 98.7 | 5.4 | 512.0 | 24.1 | 65.8 | 83.3 | 26.7 |
7 | 1799.6 | 921.7 | 810.4 | 111.3 | 101.5 | 0.5 | 563.1 | 25.8 | 69.5 | 83.9 | 27.5 |
Average | 1799.6 | 896.2 | 797.2 | 99.0 | 103.1 | 3.5 | 569.2 | 19.0 | 57.4 | 71.5 | 27.5 |
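One way to read this table is as the share of each rank's run time spent on the GPU. The sketch below divides the "Kernels + Data Transfers" column by the wall time net of profiling overhead. It rests on two simplifying assumptions that the README does not confirm: the 38.9 s overhead is purely additive to wall time, and kernels and transfers do not overlap. The name `gpu_fraction` is illustrative.

```python
PROFILING_OVERHEAD_S = 38.9  # profiling overhead stated in the text above


def gpu_fraction(total_wall_s, gpu_busy_s, overhead_s=PROFILING_OVERHEAD_S):
    """Fraction of overhead-corrected wall time spent in kernels plus transfers."""
    corrected = total_wall_s - overhead_s
    if corrected <= 0:
        raise ValueError("wall time must exceed the profiling overhead")
    return gpu_busy_s / corrected


# rank -> (Total Wall (s), Kernels + Data Transfers (s)), copied from the table.
rows = {
    0: (1799.6, 921.4),
    1: (1799.6, 977.3),
    2: (1799.6, 953.4),
    3: (1799.6, 819.7),
    4: (1799.6, 827.5),
    5: (1799.6, 892.5),
    6: (1799.6, 856.0),
    7: (1799.6, 921.7),
}

for rank, (wall_s, busy_s) in rows.items():
    print(f"rank {rank}: {gpu_fraction(wall_s, busy_s):.1%} of wall time on GPU")
```

Under these assumptions, rank 0 spends roughly 52% of its corrected wall time in GPU kernels and data transfers.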