Add understand chapter for HIP Graphs
Showing 5 changed files with 77 additions and 1 deletion.
@@ -72,6 +72,7 @@ ltrace
makefile
Malloc
malloc
memset
multicore
multigrid
multithreading

@@ -0,0 +1,73 @@
.. meta::
  :description: This chapter provides an overview of the usage of HIP graphs.
  :keywords: ROCm, HIP, graph, stream

.. _understand_HIP_graph:

********************************************************************************
HIP graph
********************************************************************************

HIP graphs are an alternative way of executing work on a GPU. They can provide
performance benefits over repeatedly launching the same kernels in the standard
way via streams.

.. note::
  The HIP graph API is currently in Beta. Some features may change and may
  have outstanding issues. Not all features supported by CUDA graphs are
  supported yet. For a list of all currently supported functions, see the
  :doc:`HIP graph API documentation<../doxygen/html/group___graph>`.

Graph format
================================================================================

A HIP graph is, like any other graph, made up of nodes and edges. The nodes of a
HIP graph represent the operations performed, while the edges mark dependencies
between those operations.

A node can be one of the following:

- empty nodes
- nested graphs
- kernel launches
- host-side function calls
- HIP memory functions (copy, memset, ...)
- HIP events
- signalling or waiting on external semaphores

HIP graph advantages
================================================================================

The standard way of launching work on GPUs via streams incurs a small overhead
for every operation, each time it is launched. For kernels that take a
considerable amount of time to finish, this overhead is usually negligible.
However, many workloads, including scientific simulations and AI, launch many
relatively small kernels repeatedly over many iterations.

HIP graphs are designed to tackle this problem by requiring only one launch
from the host per iteration, and by minimizing that overhead through performing
most of the initialization beforehand. Graphs may provide additional
performance benefits by enabling optimizations that are only possible when the
dependencies between the operations are known.

HIP graph usage
================================================================================

Using HIP graphs to execute your work requires three steps, of which the first
two form the initial setup and only need to be executed once. The first step is
the definition of the operations (nodes) and the dependencies (edges) between
them. The second step is the instantiation of the graph, which validates and
initializes the graph to reduce the overhead of executing it.

The third step is the actual execution of the graph, which launches all the
kernels and executes the other operations while respecting their dependencies
and the necessary synchronization as specified.

Because HIP graphs incur setup and initialization overhead before their first
execution, they only provide a benefit for workloads that require many
iterations to complete.

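The three steps can be sketched as follows, assuming a ROCm installation with ``hipcc`` and a supported GPU. The kernel ``add_one``, the ``HIP_CHECK`` macro, the problem size, and the iteration count are hypothetical choices for illustration. Since the graph API is in Beta, treat this as a sketch rather than a reference implementation.

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: abort with a message if a HIP call fails.
#define HIP_CHECK(expr)                                              \
    do {                                                             \
        hipError_t err_ = (expr);                                    \
        if (err_ != hipSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #expr,           \
                         hipGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

// Hypothetical example kernel: adds 1 to every element.
__global__ void add_one(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

int main() {
    int n = 1 << 20;
    float* d_data = nullptr;
    HIP_CHECK(hipMalloc(&d_data, n * sizeof(float)));
    HIP_CHECK(hipMemset(d_data, 0, n * sizeof(float)));
    hipStream_t stream;
    HIP_CHECK(hipStreamCreate(&stream));

    // Step 1 (once): define the nodes and the edges between them.
    hipGraph_t graph;
    HIP_CHECK(hipGraphCreate(&graph, 0));
    void* args[] = {&d_data, &n};
    hipKernelNodeParams params = {};
    params.func = reinterpret_cast<void*>(add_one);
    params.gridDim = dim3((n + 255) / 256);
    params.blockDim = dim3(256);
    params.sharedMemBytes = 0;
    params.kernelParams = args;
    params.extra = nullptr;
    hipGraphNode_t first, second;
    HIP_CHECK(hipGraphAddKernelNode(&first, graph, nullptr, 0, &params));
    // Listing "first" as a dependency creates the edge first -> second.
    HIP_CHECK(hipGraphAddKernelNode(&second, graph, &first, 1, &params));

    // Step 2 (once): instantiate, which validates and initializes the graph.
    hipGraphExec_t exec;
    HIP_CHECK(hipGraphInstantiate(&exec, graph, nullptr, nullptr, 0));

    // Step 3 (every iteration): a single launch from the host.
    for (int it = 0; it < 100; ++it) HIP_CHECK(hipGraphLaunch(exec, stream));
    HIP_CHECK(hipStreamSynchronize(stream));

    float first_value = 0.0f;
    HIP_CHECK(hipMemcpy(&first_value, d_data, sizeof(float), hipMemcpyDeviceToHost));
    std::printf("data[0] = %.1f\n", first_value);

    HIP_CHECK(hipGraphExecDestroy(exec));
    HIP_CHECK(hipGraphDestroy(graph));
    HIP_CHECK(hipStreamDestroy(stream));
    HIP_CHECK(hipFree(d_data));
    return 0;
}
```

Steps 1 and 2 run once, outside the loop; only ``hipGraphLaunch`` is repeated. Each launch runs both kernel nodes, so every element is incremented twice per iteration.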
Setting up HIP graphs
================================================================================

HIP graphs can be created either by defining them explicitly or by using stream
capture. For the available functions, see the
:doc:`HIP graph API documentation<../doxygen/html/group___graph>`.
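
As an illustration of the stream capture route, the sketch below records ordinary stream operations into a graph instead of adding nodes by hand. It assumes a ROCm installation with ``hipcc`` and a supported GPU. The kernel ``scale``, the ``HIP_CHECK`` macro, and the sizes are hypothetical, and which operations can be captured may vary between ROCm releases while the API is in Beta.

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: abort with a message if a HIP call fails.
#define HIP_CHECK(expr)                                              \
    do {                                                             \
        hipError_t err_ = (expr);                                    \
        if (err_ != hipSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #expr,           \
                         hipGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                 \
        }                                                            \
    } while (0)

// Hypothetical example kernel: multiplies every element by a factor.
__global__ void scale(float* data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    const dim3 threads(256);
    const dim3 blocks((n + 255) / 256);

    float* d_data = nullptr;
    HIP_CHECK(hipMalloc(&d_data, bytes));
    hipStream_t stream;
    HIP_CHECK(hipStreamCreate(&stream));

    // Work submitted to the stream between begin and end is recorded as
    // graph nodes instead of being executed; stream order becomes the edges.
    hipGraph_t graph;
    HIP_CHECK(hipStreamBeginCapture(stream, hipStreamCaptureModeGlobal));
    HIP_CHECK(hipMemsetAsync(d_data, 0, bytes, stream));
    scale<<<blocks, threads, 0, stream>>>(d_data, n, 2.0f);
    scale<<<blocks, threads, 0, stream>>>(d_data, n, 0.5f);
    HIP_CHECK(hipStreamEndCapture(stream, &graph));

    // Instantiate once, then launch the captured work as a single unit.
    hipGraphExec_t exec;
    HIP_CHECK(hipGraphInstantiate(&exec, graph, nullptr, nullptr, 0));
    HIP_CHECK(hipGraphLaunch(exec, stream));
    HIP_CHECK(hipStreamSynchronize(stream));

    HIP_CHECK(hipGraphExecDestroy(exec));
    HIP_CHECK(hipGraphDestroy(graph));
    HIP_CHECK(hipStreamDestroy(stream));
    HIP_CHECK(hipFree(d_data));
    return 0;
}
```

Stream capture is convenient when the work already exists as stream-based code, because the original launch calls stay unchanged. Explicit definition gives direct control over every node and dependency.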