diff --git a/Dockerfile b/Dockerfile
index a16e362..fa39536 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -12,7 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:20.09-py3
+ARG BASE_IMAGE=nvcr.io/nvidia/pytorch:20.12-py3
 
 ############################################################################
 ## Install PyProf
diff --git a/README.rst b/README.rst
index 20b41ff..408c72f 100644
--- a/README.rst
+++ b/README.rst
@@ -18,11 +18,42 @@
 PyProf - PyTorch Profiling tool
 ===============================
 
-    **NOTE: You are currently on teh r20.12 branch which tracks stabilization
-    towards the release. This branch is not usable during stabilization**
-
 .. overview-begin-marker-do-not-remove
 
+PyProf is a tool that profiles and analyzes the GPU performance of PyTorch
+models. PyProf aggregates kernel performance from `Nsight Systems
+`_ or `NvProf
+`_ and provides the
+following additional features:
+
+What's New in 3.7.0
+-------------------
+
+* Monkey patching support for APEX libraries.
+
+Features
+--------
+
+* Identifies the layer that launched a kernel: e.g. the association of
+  `ComputeOffsetsKernel` with a concrete PyTorch layer or API is not obvious.
+
+* Identifies the tensor dimensions and precision: without knowing the tensor
+  dimensions and precision, it's impossible to reason about whether the actual
+  (silicon) kernel time is close to maximum performance of such a kernel on
+  the GPU. Knowing the tensor dimensions and precision, we can figure out the
+  FLOPs and bandwidth required by a layer, and then determine how close to
+  maximum performance the kernel is for that operation.
+
+* Forward-backward correlation: PyProf determines what the forward pass step
+  is that resulted in the particular weight and data gradients (wgrad, dgrad),
+  which makes it possible to determine the tensor dimensions required by these
+  backprop steps to assess their performance.
+
+* Determines Tensor Core usage: PyProf can highlight the kernels that use
+  `Tensor Cores `_.
+
+* Correlate the line in the user's code that launched a particular kernel (program trace).
+
 .. overview-end-marker-do-not-remove
 
 Quick Installation Instructions
@@ -75,5 +106,57 @@ Quick Start Instructions
 
 .. quick-start-end-marker-do-not-remove
 
+Documentation
+-------------
+
+The User Guide can be found in the
+`documentation for current release
+`_, and
+provides instructions on how to install and profile with PyProf.
+
+A complete `Quick Start Guide `_
+provides step-by-step instructions to get you quickly started using PyProf.
+
+An `FAQ `_ provides
+answers for frequently asked questions.
+
+The `Release Notes
+`_
+indicate the required versions of the NVIDIA Driver and CUDA, and also describe
+which GPUs are supported by PyProf
+
+Presentation and Papers
+^^^^^^^^^^^^^^^^^^^^^^^
+
+* `Automating End-toEnd PyTorch Profiling `_.
+  * `Presentation slides `_.
+
+Contributing
+------------
+
+Contributions to PyProf are more than welcome. To
+contribute make a pull request and follow the guidelines outlined in
+the `Contributing `_ document.
+
+Reporting problems, asking questions
+------------------------------------
+
+We appreciate any feedback, questions or bug reporting regarding this
+project. When help with code is needed, follow the process outlined in
+the Stack Overflow (https://stackoverflow.com/help/mcve)
+document. Ensure posted examples are:
+
+* minimal – use as little code as possible that still produces the
+  same problem
+
+* complete – provide all parts needed to reproduce the problem. Check
+  if you can strip external dependency and still show the problem. The
+  less time we spend on reproducing problems the more time we have to
+  fix it
+
+* verifiable – test the code you're about to provide to make sure it
+  reproduces the problem. Remove all other problems that are not
+  related to your request/question.
+
 .. |License| image:: https://img.shields.io/badge/License-Apache2-green.svg
    :target: http://www.apache.org/licenses/LICENSE-2.0
diff --git a/docs/install.rst b/docs/install.rst
index 97dc516..97e0e5a 100644
--- a/docs/install.rst
+++ b/docs/install.rst
@@ -48,6 +48,6 @@ the most recent version of CUDA, Docker, and nvidia-docker.
 After performing the above setup, you can pull the PyProf container using
 the following command::
 
-  docker pull nvcr.io/nvidia/pytorch:20.10-py3
+  docker pull nvcr.io/nvidia/pytorch:20.12-py3
 
-Replace *20.10* with the version of PyTorch container that you want to pull.
+Replace *20.12* with the version of PyTorch container that you want to pull.
diff --git a/docs/quickstart.rst b/docs/quickstart.rst
index 50dee2f..b245b23 100644
--- a/docs/quickstart.rst
+++ b/docs/quickstart.rst
@@ -39,7 +39,7 @@ Prerequisites
   drop down button. After cloning the repo be sure to select the r
   release branch that corresponds to the version of PyProf want to use::
 
-    $ git checkout r20.10
+    $ git checkout r20.12
 
 * If you are starting with a pre-built NGC container, you will need to
   install Docker and nvidia-docker. For DGX users, see `Preparing to use NVIDIA Containers
@@ -75,7 +75,7 @@ the GitHub repo and checkout the release version of the branch that you
 want to build (or the master branch if you want to build the
 under-development version)::
 
-  $ git checkout r20.10
+  $ git checkout r20.12
 
 Then use docker to build::
 