Commit abfdf7b

authored
AMDuProfCLI translate
1 parent 80748dc commit abfdf7b

1 file changed

+10
-1
lines changed

Diff for: profile_related.md

@@ -30,9 +30,18 @@
3030
DefaultLimitMEMLOCK=65536:infinity
3131
```
3232
- If you use a VNC session, adjust ulimit -l before launching it. The session inherits the limit from the launching terminal, and you will not be able to adjust memlock afterwards
33-
- Sample uprof command:
33+
- Collection steps:
3434
- mpirun -np 40 .../AMDuProfCLI collect --config tbp --mpi --output-dir ./PROF40 ../bin/a.exe
3535
- MPI trace only: mpirun -np 40 .../AMDuProfCLI collect --trace mpi=full --output-dir ./TMP ../bin/a.exe
36+
- TBP + MPI: mpirun -np 40 .../AMDuProfCLI collect --config tbp --trace mpi=full --output-dir ./TMP ../bin/a.exe
37+
- TBP + MPI + call graph: mpirun -np 40 .../AMDuProfCLI collect --config tbp --call-graph --trace mpi=full --output-dir ./TMP ../bin/a.exe
38+
- Translation:
39+
- Done automatically when loaded in GUI
40+
- For very large results, translate in batch with the CLI instead:
41+
- AMDuProfCLI translate -i ./TMP/AMDuProf-a.exe-Custom-MPI --log-path ./log_dir --enable-log --category cpu,mpi
42+
- Translation is multi-threaded, so give it more than 4 CPUs on a single node
43+
- Analysis:
44+
- Load the results in the AMDuProf GUI
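
The collection and translation steps above can be sketched as a small dry-run script. It only prints the command lines so they can be checked before being submitted. The variable names (`UPROF_CLI`, `NP`, `OUT_DIR`, `APP`) and the two helper functions are hypothetical and not part of AMDuProf. The notes elide the real install path of `AMDuProfCLI`, so the sketch assumes it is on `PATH`. All flags are copied from the commands in these notes.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands instead of executing them.

# Hypothetical settings; adjust to your install and job.
UPROF_CLI="${UPROF_CLI:-AMDuProfCLI}"  # assumed to be on PATH (real path is elided in the notes)
NP="${NP:-40}"                         # number of MPI ranks
OUT_DIR="${OUT_DIR:-./TMP}"            # collection output directory
APP="${APP:-../bin/a.exe}"             # application to profile

# Collection: TBP + MPI trace; pass "callgraph" to add --call-graph.
collect_cmd() {
  local extra=""
  if [ "${1:-}" = "callgraph" ]; then
    extra=" --call-graph"
  fi
  echo "mpirun -np ${NP} ${UPROF_CLI} collect --config tbp${extra} --trace mpi=full --output-dir ${OUT_DIR} ${APP}"
}

# Batch translation of one session directory (first argument).
translate_cmd() {
  local session_dir="$1"
  echo "${UPROF_CLI} translate -i ${session_dir} --log-path ./log_dir --enable-log --category cpu,mpi"
}

# Show the current memlock soft limit; it must be raised before the session starts.
echo "memlock limit (ulimit -l): $(ulimit -l)"
collect_cmd callgraph
translate_cmd "${OUT_DIR}/AMDuProf-a.exe-Custom-MPI"
```

Once the printed lines look right, paste them into the job script, or wrap the function calls in `eval` at your own risk. The session directory name is the one used in these notes and will differ for other binaries or configurations.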
3645

3746
## nvidia ncu/nsys
3847
- Both tools basically need sudo privileges
