Quick Start Guide

The release notes for the latest ROCm version.
This guide discusses how to install and check for correct operation of ROCm using the AMD ROCm repository.
This guide discusses how to install and check for correct operation of ROCm using the Debian repository on Ubuntu.
This guide describes how to install and check for correct operation of ROCm using yum on RHEL and CentOS 7.5.
This guide discusses how to modify the open source code base and rebuild the components of the latest ROCm version.
This guide discusses how to install the ROCm kernel on the system.
This section provides answers for various frequently asked questions related to installation steps and issues faced during installation.
This guide provides a detailed discussion of the ROCm programming model and programming interface. It describes the hardware implementation and provides guidance on how to achieve maximum performance.
The appendices include a list of all ROCm-enabled devices, a detailed description of all extensions to the C language, listings of supported mathematical functions, C++ features supported in host and device code, and technical specifications of various devices, and conclude by introducing the low-level driver API.
This guide provides information on the different ROCm languages. The ROCm stack offers multiple programming-language choices, which are described in this section.
This guide provides a detailed discussion of Heterogeneous Compute programming: installation requirements, methods to install on various platforms, and how to build it from source.
This section covers working with HCC in detail: building programs, built-in macros, HCC profiler mode, and API documentation.
This guide provides a detailed discussion of HIP programming: installation requirements, methods to install on various platforms, and how to build it from source.
This section provides details on various HIP topics: porting, debugging, bugs, FAQ, and other aspects of HIP.
This guide provides a detailed discussion of the OpenCL architecture, the AMD implementation, profiling, and other aspects of OpenCL.
This section provides information on performance and optimization for various device types such as GCN devices.
-- In-Progress
-- In-Progress
-- In-Progress
This section gives information on the ISA manual for Hawaii (Sea Islands Series Instruction Set Architecture).
This section gives information on the ISA manual for Fiji and Polaris (AMD Accelerated Parallel Processing technology).
This section covers the “Vega” Instruction Set Architecture: program organization, the mode register, and more.
This section covers various concepts of AMDGCN assembly: DS permute instructions, parameters to a kernel, and GPR counting.
API references are listed here.

ROCr System Runtime API details are listed here.

HCC Language Runtime API details are listed here.
HIP Language Runtime API details are listed here.
The HIP Math APIs are listed here with sample working classes.
Details on installing and using the Thrust library, along with the Thrust API list, can be found here.
HIP Math API with the hcRNG, clBLAS, and clSPARSE APIs.
The MIOpen and MIOpenGEMM APIs are listed here.
A complete description of the Heterogeneous Compute Compiler is documented here.
This section provides details on GCN.
In this section, information related to the AMDGPU ISA assembler is documented.
Complete documentation of the ROCm-GDB tool is provided here, covering installation, build steps, use of the debugger, and its related API.
This section gives details on Radeon Compute Profiler, a performance analysis tool, including how to clone and use it.
This section gives details on ROCm Tracer, which provides a generic profiler, independent of any specific runtime, for tracing API and asynchronous activity. It covers the library source tree and the steps to build and run the tests.
This section provides details on CodeXL, a comprehensive tool suite. It documents installation, builds, and other details related to CodeXL.
This section provides details on the GPU Performance API, including how to clone it, system requirements, and the source code directory layout.
-- In-Progress
This section provides details on AOMP, a scripted build of LLVM and supporting software. It has support for OpenMP target offload on AMD GPUs. Since AOMP is a clang/llvm compiler, it also supports GPU offloading with HIP, CUDA, and OpenCL.
This section provides details on ROCm Validation Suite (RVS), a system administrator’s and cluster manager’s tool for detecting and troubleshooting common problems affecting AMD GPU(s) running in a high-performance computing environment, enabled using the ROCm software stack on a compatible platform.
This section provides details on rocFFT, AMD's software library for computing fast Fourier transforms (FFTs). It can also be compiled with the CUDA compiler using HIP tools for running on Nvidia GPU devices.
This section provides details on rocBLAS, a library for BLAS on ROCm. rocBLAS is implemented in the HIP programming language and optimized for AMD’s latest discrete GPUs.
This section provides details on hipBLAS, a BLAS marshalling library with multiple supported backends. hipBLAS exports an interface that does not require the client to change. Currently, it supports :ref:`rocblas` and cuBLAS as backends.
This section provides details on hcRNG. It is a software library that implements uniform random number generators targeting AMD heterogeneous hardware via the HCC compiler runtime.
This section provides details on Eigen. It is a C++ template library which provides linear algebra for matrices, vectors, numerical solvers, and related algorithms.
This section provides details on clFFT. It is a software library containing FFT functions written in OpenCL; clFFT also supports running on CPU devices to facilitate debugging and heterogeneous programming.
This section provides details on clBLAS. It makes it easier for developers to utilize the inherent performance and power efficiency benefits of heterogeneous computing.
This section provides details on clSPARSE, an OpenCL library which implements sparse linear algebra routines.
This section provides details on clRNG, a library for uniform random number generation in OpenCL.
This section provides details on hcFFT, which hosts the HCC-based FFT library and targets GPU acceleration of FFT routines on AMD devices.
This section provides details on Tensile. It is a tool for creating a benchmark-driven backend library for GEMMs, N-dimensional tensor contractions, and other operations that multiply two multi-dimensional objects together on a GPU.
This section provides details on rocALUTION. It is a sparse linear algebra library with a focus on exploring fine-grained parallelism, targeting modern processors and accelerators including multi/many-core CPU and GPU platforms. It can be seen as middleware between different parallel backends and application-specific packages.
This section provides details on rocSPARSE. It is a library that contains basic linear algebra subroutines for sparse matrices and vectors, written in HIP for GPU devices. It is designed to be used from C and C++ code.
This section provides details on rocThrust. It is a parallel algorithm library.
This section provides details on hipCUB. It is a thin wrapper library on top of rocPRIM or CUB. It enables developers to port projects that use the CUB library to the HIP layer and run them on AMD hardware.
This section provides details on the ROCm SMI library. The ROCm System Management Interface Library, or ROCm SMI library, is part of the Radeon Open Compute (ROCm) software stack. It is a C library for Linux that provides a user-space interface for applications to monitor and control GPU applications.
This section provides details on the ROCm Communication Collectives Library (RCCL). It is a standalone library of standard collective communication routines for GPUs, implementing all-reduce, all-gather, reduce, broadcast, and reduce-scatter.
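
The five collectives named in the RCCL entry can be illustrated with a small pure-Python model. This is only a sketch of what each routine computes: it does not use the RCCL API, each inner list stands in for one GPU's buffer, and all function names are hypothetical.

```python
from functools import reduce
from operator import add


def _elementwise(buffers, op):
    # Combine the i-th element of every rank's buffer with `op`.
    return [reduce(op, column) for column in zip(*buffers)]


def all_reduce(buffers, op=add):
    # Every rank receives the element-wise reduction of all buffers.
    reduced = _elementwise(buffers, op)
    return [list(reduced) for _ in buffers]


def all_gather(buffers):
    # Every rank receives the concatenation of all buffers, in rank order.
    gathered = [x for buf in buffers for x in buf]
    return [list(gathered) for _ in buffers]


def reduce_to_root(buffers, root=0, op=add):
    # Only the root rank receives the reduction; other ranks are unchanged.
    reduced = _elementwise(buffers, op)
    return [reduced if rank == root else list(buf)
            for rank, buf in enumerate(buffers)]


def broadcast(buffers, root=0):
    # Every rank receives a copy of the root rank's buffer.
    return [list(buffers[root]) for _ in buffers]


def reduce_scatter(buffers, op=add):
    # The reduction is split into equal chunks, one chunk per rank.
    # Assumes the buffer length is divisible by the number of ranks.
    n = len(buffers)
    reduced = _elementwise(buffers, op)
    chunk = len(reduced) // n
    return [reduced[r * chunk:(r + 1) * chunk] for r in range(n)]
```

For example, with two ranks holding `[1, 2]` and `[3, 4]`, `all_reduce` leaves `[4, 6]` on both ranks, while `reduce_scatter` leaves `[4]` on rank 0 and `[6]` on rank 1.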

This section provides information on AMD’s graph optimization engine.

This section provides a complete description of LLVM, including an introduction, code objects, code conventions, source languages, and more.
This section describes the application binary interface (ABI) provided by the AMD implementation of the HSA runtime. It also provides details on kernels, AMD queues, and signals.
Documentation on the ROCm Device Library is provided here: an overview, plus building and testing information.
This section covers the user-mode API interfaces and libraries necessary for host applications to launch compute kernels on available HSA ROCm kernel agents. It includes installation and infrastructure details for ROCr.
ROCm System Management Interface: a complete guide to using the rocm-smi tool.
This section provides information on the sysfs file structure, covering the system details that are captured in sysfs.
KFD Kernel Topology is the system file structure that describes AMD GPU information such as nodes, memory, cache, and IO links.
PCIe passthrough on KVM is described here. The KVM-based instructions assume a headless host with an input/output memory management unit (IOMMU) to pass peripheral devices such as a GPU to guest virtual machines. More information can be found here.
A framework for building the software layers defined in the Radeon Open Compute Platform into portable Docker images. Detailed information on ROCm-Docker can be found here.
ROCmRDMA is the solution designed to allow third-party kernel drivers to utilize DMA access to the GPU memory. Complete information on ROCmRDMA is documented here.
This section gives information on UCX: how to install it, how to run it, and more.
This section gives information related to MPI.
This section gives information related to IPC.
This section provides details on ROCm Deep Learning concepts.
The porting guide highlights the key differences between the current cuDNN and MIOpen APIs.
This section provides a detailed chart of the frameworks supported by ROCm, along with repository details.
Tutorials on different deep learning frameworks are documented here.
This section provides details on system-related debugging and commands for issues faced while using ROCm.
This section provides details on a few HIP concepts and other topics.
The ROCm Glossary highlights key concepts and explains how they work.