GAMESS ECP KPP-2 Artifacts

10-point MSN Reaction Pathway

Directory: 10-points (link)

This directory contains the inputs and outputs for the GAMESS ECP KPP-2 science challenge problem, which consists of 10 points along a representative reaction path on a mesoporous silica nanoparticle (MSN). The MSN includes an amine-containing catalytic group and is surrounded by 4,600 water molecules (13,800 atoms; 46,000 electrons), and the whole system is treated with the effective fragment molecular orbital (EFMO) method. Each of the 10 points along the reaction path used 5,430 Frontier nodes.

| Identifier | Filename | Total wall time (s) |
| --- | --- | --- |
| msn-r (link) | msn-r_4600water_efmo_rimp2.rcut10.log | 1118.7 |
| msn-r1ts (link) | msn-r1ts_4600water_efmo_rimp2.rcut10.log | 1655.1 |
| msn-r2ts (link) | msn-r2ts_4600water_efmo_rimp2.rcut10.log | 1655.7 |
| msn-r3ts (link) | msn-r3ts_4600water_efmo_rimp2.rcut10.log | 1686.3 |
| msn-ts (link) | msn-ts_4600water_efmo_rimp2.rcut10.log | 1672.2 |
| msn-ts1p (link) | msn-ts1p_4600water_efmo_rimp2.rcut10.log | 1687.5 |
| msn-ts2p (link) | msn-ts2p_4600water_efmo_rimp2.rcut10.log | 1667.7 |
| msn-ts3p (link) | msn-ts3p_4600water_efmo_rimp2.rcut10.log | 1664.4 |
| msn-ts4p (link) | msn-ts4p_4600water_efmo_rimp2.rcut10.log | 1692.3 |
| msn-p (link) | msn-p_4600water_efmo_rimp2.rcut10.log | 1704.5 |
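
The wall times above can be combined with the stated node count to estimate the aggregate resource use of the 10-point pathway. The following is a minimal sketch, assuming that each point held all 5,430 nodes for its full reported wall time; the function name `node_hours` is introduced here for illustration only.

```python
# Wall times (s) copied from the 10-point table.
wall_times_s = {
    "msn-r": 1118.7,
    "msn-r1ts": 1655.1,
    "msn-r2ts": 1655.7,
    "msn-r3ts": 1686.3,
    "msn-ts": 1672.2,
    "msn-ts1p": 1687.5,
    "msn-ts2p": 1667.7,
    "msn-ts3p": 1664.4,
    "msn-ts4p": 1692.3,
    "msn-p": 1704.5,
}

NODES_PER_POINT = 5430  # Frontier nodes per point, as stated in the text


def node_hours(wall_s, nodes=NODES_PER_POINT):
    """Node-hours for one run (assumes all nodes are held for the whole run)."""
    return nodes * wall_s / 3600.0


total_wall_s = sum(wall_times_s.values())
total_node_hours = sum(node_hours(t) for t in wall_times_s.values())

print(f"Total wall time over 10 points: {total_wall_s:.1f} s")
print(f"Estimated node-hours: {total_node_hours:.0f}")
```

This is an estimate only: the table does not say whether the points were run concurrently or one after another, so the sum of wall times is not necessarily the elapsed time of the whole campaign.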

MSN5 Hydrated

Directory: msn5-hydrated (link)

This directory contains the inputs and outputs for a small fragmented system that is tractable within the queuing policy limits on Frontier. The system consists of 5 MSN fragments from the msn-r structure together with the full hydration shell. The parameter Rcut sets a distance-dependent cutoff that determines how each dimer is treated: with quantum mechanics (QM) or with the effective fragment potential (EFP). Increasing Rcut from 1 to 2 causes more dimers to be treated with QM, which increases the amount of computational work available.
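
The effect of Rcut described above can be illustrated with a toy model. This sketch is not the criterion GAMESS implements: it assumes that every dimer has a single separation value expressed in the same units as Rcut, and that dimers at or inside the cutoff are treated with QM while the rest fall back to EFP. The function `classify_dimers` and the sample separations are hypothetical.

```python
def classify_dimers(separations, rcut):
    """Toy split of dimers into QM and EFP sets by one cutoff (illustrative only)."""
    qm = [d for d in separations if d <= rcut]
    efp = [d for d in separations if d > rcut]
    return qm, efp


# Hypothetical dimer separations, in the same (unspecified) units as Rcut.
sample = [0.4, 0.9, 1.3, 1.8, 2.5, 3.1]

for rcut in (1, 2):
    qm, efp = classify_dimers(sample, rcut)
    print(f"Rcut={rcut}: {len(qm)} QM dimers, {len(efp)} EFP dimers")
```

In this toy data set, raising the cutoff moves dimers from the EFP list to the QM list, which matches the qualitative behavior described in the text.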

Rcut=1

| Identifier | Filename | Total wall time (s) | Speed-up (x) |
| --- | --- | --- | --- |
| msn5-hyd-rcut1-cpu (link) | msn_05frag_4600water_efmo_rimp2.rcut1_0128N-0128-cpu.log | 8002.7 | |
| msn5-hyd-rcut1-gpu (link) | msn_05frag_4600water_efmo_rimp2.rcut1_0128N-0128-gpu.log | 1760.7 | 4.5 |

Rcut=2

| Identifier | Filename | Total wall time (s) | Speed-up (x) |
| --- | --- | --- | --- |
| msn5-hyd-rcut2-cpu (link) | msn_05frag_4600water_efmo_rimp2.rcut2_0128N-0128-cpu.log | 10442.7 | |
| msn5-hyd-rcut2-gpu (link) | msn_05frag_4600water_efmo_rimp2.rcut2_0128N-0128-gpu.log | 2132.6 | 4.9 |
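
The speed-up column in the Rcut=1 and Rcut=2 tables appears to be the ratio of CPU to GPU wall time for the same input. The sketch below recomputes that ratio from the reported wall times under that assumption; the helper name `speedup` is illustrative.

```python
def speedup(cpu_wall_s, gpu_wall_s):
    """Ratio of CPU wall time to GPU wall time for the same input."""
    return cpu_wall_s / gpu_wall_s


# (CPU wall time, GPU wall time) in seconds, copied from the two tables.
runs = {
    "rcut1": (8002.7, 1760.7),
    "rcut2": (10442.7, 2132.6),
}

for name, (cpu_s, gpu_s) in runs.items():
    print(f"{name}: {speedup(cpu_s, gpu_s):.2f}x")
```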

MSN5 Hydrated Traces

Directory: traces (link)

This directory contains the traces generated by the AMD ROC profiler, rocprof, for a 128-node run of MSN5 hydrated at an Rcut value of 1 on Crusher using the GAMESS offloaded code. Trace data is provided for ranks 0-7, which correspond to the GAMESS compute processes on the first node.

Kernel Traces

The --stats option triggers rocprof to generate kernel execution statistics: the kernel name, the number of times the kernel is called (Calls), the total time spent in the kernel in nanoseconds (TotalDurationNs), the average time per call in nanoseconds (AverageNs), and the percentage of total GPU kernel time spent in the kernel (Percentage).

| Identifier | Rank | Filename |
| --- | --- | --- |
| rank-0-kernel-trace (link) | 0 | trace-rank-0.stats.csv |
| rank-1-kernel-trace (link) | 1 | trace-rank-1.stats.csv |
| rank-2-kernel-trace (link) | 2 | trace-rank-2.stats.csv |
| rank-3-kernel-trace (link) | 3 | trace-rank-3.stats.csv |
| rank-4-kernel-trace (link) | 4 | trace-rank-4.stats.csv |
| rank-5-kernel-trace (link) | 5 | trace-rank-5.stats.csv |
| rank-6-kernel-trace (link) | 6 | trace-rank-6.stats.csv |
| rank-7-kernel-trace (link) | 7 | trace-rank-7.stats.csv |
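
A kernel statistics file of this kind can be summarized with the Python standard library. The sketch below is a minimal example under stated assumptions: the CSV header uses the column names Calls, TotalDurationNs, AverageNs, and Percentage described above, and the kernel-name column is assumed to be called `Name`. The sample rows and kernel names are made up for illustration and do not come from the trace files; `summarize` is a hypothetical helper.

```python
import csv
import io

# Made-up excerpt in the assumed column layout (not real trace data).
SAMPLE = '''"Name","Calls","TotalDurationNs","AverageNs","Percentage"
"kernel_a",10,4000000000,400000000,80.0
"kernel_b",5,1000000000,200000000,20.0
'''


def summarize(stats_file):
    """Return (total kernel time in seconds, name of the most expensive kernel)."""
    rows = list(csv.DictReader(stats_file))
    total_ns = sum(int(row["TotalDurationNs"]) for row in rows)
    top = max(rows, key=lambda row: int(row["TotalDurationNs"]))
    return total_ns / 1e9, top["Name"]


total_s, top_kernel = summarize(io.StringIO(SAMPLE))
print(f"Total kernel time: {total_s:.1f} s; most expensive kernel: {top_kernel}")
```

To apply the same function to one of the files listed above, open it with `open("trace-rank-0.stats.csv", newline="")` and pass the file object to `summarize`, adjusting the column names if the actual header differs.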

HIP Traces

The --hip-trace option triggers rocprof to generate statistics for HIP API calls such as host-to-device (hipMemcpyHtoD) and device-to-host (hipMemcpyDtoH) memory copies.

| Identifier | Rank | Filename |
| --- | --- | --- |
| rank-0-hip-trace (link) | 0 | trace-rank-0.hip_stats.csv |
| rank-1-hip-trace (link) | 1 | trace-rank-1.hip_stats.csv |
| rank-2-hip-trace (link) | 2 | trace-rank-2.hip_stats.csv |
| rank-3-hip-trace (link) | 3 | trace-rank-3.hip_stats.csv |
| rank-4-hip-trace (link) | 4 | trace-rank-4.hip_stats.csv |
| rank-5-hip-trace (link) | 5 | trace-rank-5.hip_stats.csv |
| rank-6-hip-trace (link) | 6 | trace-rank-6.hip_stats.csv |
| rank-7-hip-trace (link) | 7 | trace-rank-7.hip_stats.csv |

Timing Summary

This table summarizes the timing statistics from the kernel and HIP trace data generated by rocprof. The total wall time reported for each rank includes 38.9 s of profiling overhead.

| Rank | Total Wall (s) | Kernels + Data Transfers (s) | Kernels (s) | Data Transfers (s) | RHF Kernels (s) | CPHF Kernels (s) | TDHF Kernels (s) | RIMP2 Kernels (s) | HIPBLAS Kernels (s) | Host-to-Device (s) | Device-to-Host (s) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1799.6 | 921.4 | 841.4 | 80.1 | 114.9 | 0.6 | 620.9 | 10.4 | 44.5 | 52.3 | 27.7 |
| 1 | 1799.6 | 977.3 | 890.6 | 86.7 | 111.5 | 3.7 | 662.2 | 13.1 | 50.1 | 58.8 | 27.9 |
| 2 | 1799.6 | 953.4 | 860.4 | 93.0 | 107.3 | 2.3 | 633.4 | 16.2 | 51.2 | 65.1 | 27.9 |
| 3 | 1799.6 | 819.7 | 721.6 | 98.2 | 93.2 | 4.2 | 506.0 | 18.8 | 59.4 | 70.1 | 28.1 |
| 4 | 1799.6 | 827.5 | 725.0 | 102.5 | 96.7 | 4.9 | 505.2 | 21.2 | 57.1 | 76.1 | 26.4 |
| 5 | 1799.6 | 892.5 | 782.5 | 110.0 | 100.8 | 6.5 | 550.5 | 22.7 | 61.9 | 82.6 | 27.4 |
| 6 | 1799.6 | 856.0 | 746.0 | 110.0 | 98.7 | 5.4 | 512.0 | 24.1 | 65.8 | 83.3 | 26.7 |
| 7 | 1799.6 | 921.7 | 810.4 | 111.3 | 101.5 | 0.5 | 563.1 | 25.8 | 69.5 | 83.9 | 27.5 |
| Average | 1799.6 | 896.2 | 797.2 | 99.0 | 103.1 | 3.5 | 569.2 | 19.0 | 57.4 | 71.5 | 27.5 |
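
Two quantities follow directly from the timing summary: the wall time with the stated profiling overhead removed, and the share of the profiled wall time that a rank spends, on average, in GPU kernels and data transfers. The sketch below computes both from the per-rank values for ranks 0-7. It assumes the overhead is simply additive and that kernel and transfer times do not overlap, so the resulting share is a rough indication only; the variable names are illustrative.

```python
from statistics import mean

PROFILED_WALL_S = 1799.6       # total wall time per rank, including profiling
PROFILING_OVERHEAD_S = 38.9    # profiling overhead stated in the text

# "Kernels + Data Transfers (s)" for ranks 0-7, copied from the table.
kernels_plus_transfers_s = [921.4, 977.3, 953.4, 819.7, 827.5, 892.5, 856.0, 921.7]

# Wall time with the profiling overhead subtracted (assumes it is additive).
wall_without_overhead_s = PROFILED_WALL_S - PROFILING_OVERHEAD_S

# Average GPU time per rank and its share of the profiled wall time.
avg_gpu_s = mean(kernels_plus_transfers_s)
gpu_fraction = avg_gpu_s / PROFILED_WALL_S

print(f"Wall time without profiling overhead: {wall_without_overhead_s:.1f} s")
print(f"Average kernels + transfers per rank: {avg_gpu_s:.1f} s")
print(f"Share of profiled wall time: {gpu_fraction:.1%}")
```

Under the additive assumption, the wall time without overhead equals the 1760.7 s reported for the msn5-hyd-rcut1-gpu run.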