Add understand chapter for HIP Graphs
MKKnorr committed Aug 13, 2024
1 parent 91f60b6 commit 96b0172
Showing 5 changed files with 77 additions and 1 deletion.
1 change: 1 addition & 0 deletions .wordlist.txt
@@ -72,6 +72,7 @@ ltrace
makefile
Malloc
malloc
memset
multicore
multigrid
multithreading
2 changes: 1 addition & 1 deletion docs/how-to/programming_manual.md
@@ -146,7 +146,7 @@ For Linux developers, the link [here](https://github.com/ROCm/hip-tests/blob/dev
## HIP Graph
HIP graph is supported. For more details, refer to the HIP API Guide.
HIP graphs are supported. For more details, refer to the [HIP API Guide](../doxygen/html/group___graph) or the [understand section for HIP graphs](../understand/hipgraph).
## Device-Side Malloc
1 change: 1 addition & 0 deletions docs/index.md
@@ -31,6 +31,7 @@ On non-AMD platforms, like NVIDIA, HIP provides header files required to support

* {doc}`./understand/programming_model`
* {doc}`./understand/hardware_implementation`
* {doc}`./understand/hipgraph`
* {doc}`./understand/amd_clr`

:::
1 change: 1 addition & 0 deletions docs/sphinx/_toc.yml.in
@@ -17,6 +17,7 @@ subtrees:
entries:
- file: understand/programming_model
- file: understand/hardware_implementation
- file: understand/hipgraph
- file: understand/amd_clr

- caption: How to
73 changes: 73 additions & 0 deletions docs/understand/hipgraph.rst
@@ -0,0 +1,73 @@
.. meta::
  :description: This chapter provides an overview of the usage of HIP graph.
  :keywords: ROCm, HIP, graph, stream

.. _understand_HIP_graph:

********************************************************************************
HIP graph
********************************************************************************

HIP graphs are an alternative way of executing work on a GPU. They can provide
performance benefits over repeatedly launching the same kernels in the standard
way via streams.

.. note::
    The HIP graph API is currently in Beta. Some features can change and might
    have outstanding issues. Not all features supported by CUDA graphs are
    supported yet. For a list of all currently supported functions, see the
    :doc:`HIP graph API documentation<../doxygen/html/group___graph>`.

Graph format
================================================================================

A HIP graph is, like any other graph, made up of nodes and edges. The nodes of a
HIP graph represent the operations performed, while the edges mark dependencies
between those operations.

A node can be any of the following:

- empty nodes
- nested graphs
- kernel launches
- host-side function calls
- HIP memory functions (copy, memset, ...)
- HIP events
- signalling or waiting on external semaphores
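
To make the node and edge terminology concrete, the following is a minimal
host-only sketch in plain C++. It does not use the HIP API: the ``Node`` type
and the ``execution_order`` function are hypothetical names introduced only for
illustration. Each node lists the nodes it depends on, and the function returns
one order in which the operations could run without violating any dependency.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Hypothetical host-side model of a graph; these types are NOT part of HIP.
struct Node {
    std::string label;                 // e.g. "copy in", "kernel A"
    std::vector<std::size_t> depends;  // indices of nodes that must finish first
};

// Returns one valid execution order using Kahn's algorithm.
// If the edges contain a cycle, the result has fewer entries than nodes.
std::vector<std::size_t> execution_order(const std::vector<Node>& nodes) {
    std::vector<std::size_t> pending(nodes.size(), 0);            // unmet dependencies
    std::vector<std::vector<std::size_t>> dependents(nodes.size());
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        pending[i] = nodes[i].depends.size();
        for (std::size_t d : nodes[i].depends) dependents[d].push_back(i);
    }

    // Nodes without dependencies can run immediately.
    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < nodes.size(); ++i)
        if (pending[i] == 0) ready.push(i);

    std::vector<std::size_t> order;
    while (!ready.empty()) {
        std::size_t n = ready.front();
        ready.pop();
        order.push_back(n);
        // Finishing n may unblock the nodes that depend on it.
        for (std::size_t m : dependents[n])
            if (--pending[m] == 0) ready.push(m);
    }
    return order;
}
```

In this model, two kernels that both depend only on an initial copy have no
edge between them, so a runtime would be free to run them concurrently. That is
the kind of information a graph exposes and a single stream does not.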

HIP graph advantages
================================================================================

The standard way of launching work on GPUs via streams incurs a small overhead
for each operation, every time it is launched. For kernels that take a
considerable amount of time to finish, this overhead is usually negligible.
However, many workloads, including scientific simulations and AI, involve
launching many relatively small kernels repeatedly over many iterations.

HIP graphs have been specifically designed to tackle this problem by requiring
only one launch from the host per iteration, and by minimizing that overhead
through performing most of the initialization beforehand. Graphs may provide
additional performance benefits by enabling optimizations that are only
possible when the dependencies between the operations are known.

HIP graph usage
================================================================================

Using HIP graphs to execute your work requires three steps, of which the first
two are the initial setup and only need to be executed once. The first step is
the definition of the operations (nodes) and the dependencies (edges) between
them. The second step is the instantiation of the graph, which validates and
initializes the graph to reduce the overhead when executing it.

The third step is the actual execution of the graph, which then takes care of
launching all the kernels and executing the operations while respecting their
dependencies and necessary synchronizations as specified.

As HIP graphs require some setup and initialization overhead before their first
execution, they only provide a benefit for workloads that require many
iterations to complete.
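
As a rough way to reason about when the setup cost pays off, the sketch below
uses a deliberately simple cost model in plain C++. The ``LaunchCosts`` struct
and the ``break_even_iterations`` function are hypothetical, the model only
counts host-side overhead, and every parameter is an assumption that would have
to be measured on the target system.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical cost model; all values are in the same arbitrary time unit.
struct LaunchCosts {
    double stream_overhead_per_op;     // host overhead per operation via streams
    double graph_overhead_per_launch;  // host overhead per graph launch
    double graph_setup;                // one-time definition + instantiation cost
};

// Smallest iteration count N for which
//   graph_setup + N * graph_overhead_per_launch
//     < N * ops_per_iteration * stream_overhead_per_op,
// or -1 if the graph never has lower overhead under this model.
long break_even_iterations(const LaunchCosts& c, int ops_per_iteration) {
    double saved_per_iteration =
        ops_per_iteration * c.stream_overhead_per_op - c.graph_overhead_per_launch;
    if (saved_per_iteration <= 0.0) return -1;
    return static_cast<long>(std::floor(c.graph_setup / saved_per_iteration)) + 1;
}
```

For example, with placeholder values of 5 units per streamed operation, 10 units
per graph launch, 400 units of setup and 10 operations per iteration, the model
yields a break-even point of 11 iterations. These numbers are illustrative
only and are not measurements.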

Setting up HIP graphs
================================================================================

HIP graphs can be created by explicitly defining them, or by using stream capture.
For the available functions see the
:doc:`HIP graph API documentation<../doxygen/html/group___graph>`.
