docs: Adding words to the refit and engine caching tutorials #3141

narendasan · 2024-09-03T20:07:26Z

Description

Adds explanatory text to the refit and engine caching tutorials, covering the two new refit features for 2.5.

Fixes # (issue)

Type of change

Please delete options that are not relevant and/or add your own.

This change requires a documentation update

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR so that relevant reviewers are notified

github-actions

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/examples/dynamo/engine_caching_bert_example.py	2024-09-03 20:07:41.366823+00:00
+++ /home/runner/work/TensorRT/TensorRT/examples/dynamo/engine_caching_bert_example.py	2024-09-03 20:08:00.989606+00:00
@@ -5,10 +5,11 @@
Engine Caching (BERT)
=======================

Small caching example on BERT.
"""
+
import numpy as np
import torch
import torch_tensorrt
from engine_caching_example import remove_timing_cache
from transformers import BertModel
--- /home/runner/work/TensorRT/TensorRT/docsrc/conf.py	2024-09-03 20:07:41.362823+00:00
+++ /home/runner/work/TensorRT/TensorRT/docsrc/conf.py	2024-09-03 20:08:01.014045+00:00
@@ -91,11 +91,11 @@

# sphinx-gallery configuration
sphinx_gallery_conf = {
    "examples_dirs": "../examples",
    "gallery_dirs": "tutorials/_rendered_examples/",
-    "ignore_pattern": "utils.py"
+    "ignore_pattern": "utils.py",
}

# Setup the breathe extension
breathe_projects = {"Torch-TensorRT": "./_tmp/xml"}
breathe_default_project = "Torch-TensorRT"
--- /home/runner/work/TensorRT/TensorRT/examples/dynamo/engine_caching_example.py	2024-09-03 20:07:41.366823+00:00
+++ /home/runner/work/TensorRT/TensorRT/examples/dynamo/engine_caching_example.py	2024-09-03 20:08:01.131264+00:00
@@ -45,10 +45,11 @@


def remove_timing_cache(path=TIMING_CACHE_PATH):
    if os.path.exists(path):
        os.remove(path)
+

# %%
# Engine Caching for JIT Compilation
# ----------------------------------
#
@@ -61,10 +62,11 @@
# engines are saved to disk tied to a hash of their corresponding PyTorch subgraph. If
# in a subsequent compilation, either as part of this session or a new session, the cache will
# pull the built engine and **refit** the weights which can reduce compilation times by orders of magnitude.
# As such, in order to insert a new engine into the cache (i.e. ``cache_built_engines=True``),
# the engine must be refittable (``make_refittable=True``). See :ref:`refit_engine_example` for more details.
+

def torch_compile(iterations=3):
    times = []
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
@@ -108,18 +110,20 @@
    print("----------------torch_compile----------------")
    print("disable engine caching, used:", times[0], "ms")
    print("enable engine caching to cache engines, used:", times[1], "ms")
    print("enable engine caching to reuse engines, used:", times[2], "ms")

+
torch_compile()

# %%
# Engine Caching for AOT Compilation
# ----------------------------------
# Similarly to the JIT workflow, AOT workflows can benefit from engine caching.
# As the same architecture or common subgraphs get recompiled, the cache will pull
# previously built engines and refit the weights.
+

def dynamo_compile(iterations=3):
    times = []
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
@@ -166,10 +170,11 @@
    print("----------------dynamo_compile----------------")
    print("disable engine caching, used:", times[0], "ms")
    print("enable engine caching to cache engines, used:", times[1], "ms")
    print("enable engine caching to reuse engines, used:", times[2], "ms")

+
dynamo_compile()

# %%
# Custom Engine Cache
# ----------------------
@@ -185,10 +190,11 @@
#
# The hash provided by the cache system is a weight-agnostic hash of the originating PyTorch subgraph (post lowering).
# The blob contains a serialized engine, calling spec data, and weight map information in the pickle format
#
# Below is an example of a custom engine cache implementation that implements a ``RAMEngineCache``.
+

class RAMEngineCache(BaseEngineCache):
    def __init__(
        self,
    ) -> None:
@@ -276,6 +282,7 @@
    print("----------------torch_compile----------------")
    print("disable engine caching, used:", times[0], "ms")
    print("enable engine caching to cache engines, used:", times[1], "ms")
    print("enable engine caching to reuse engines, used:", times[2], "ms")

+
torch_compile_my_cache()
--- /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/_engine_cache.py	2024-09-03 20:07:41.378823+00:00
+++ /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/_engine_cache.py	2024-09-03 20:08:01.555967+00:00
@@ -142,11 +142,13 @@
            engine_cache_dir
        )
        if engine_cache_dir not in DiskEngineCache.dir2hash2size_map:
            DiskEngineCache.dir2hash2size_map[engine_cache_dir] = {}

-        _LOGGER.info(f"Disk engine cache initialized (cache directory:{self.engine_cache_dir}, max size: {self.total_engine_cache_size})")
+        _LOGGER.info(
+            f"Disk engine cache initialized (cache directory:{self.engine_cache_dir}, max size: {self.total_engine_cache_size})"
+        )

    def has_available_cache_size(self, needed_size: int) -> bool:
        """Check if the cache has available space for saving object

        Args:

Signed-off-by: Naren Dasan <[email protected]>
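
For readers skimming the diff above: the tutorial comments describe a custom engine cache as a store that maps a weight-agnostic subgraph hash to a pickled blob, and the `DiskEngineCache` hunk mentions a size check before saving. The following is a minimal, self-contained sketch of that idea only. It does not subclass the real `BaseEngineCache`, and the class name `InMemoryEngineCache`, the `key` parameter, the `max_bytes` limit, and the blob's dictionary fields are all assumptions made for illustration, not the library's API.

```python
import pickle
from typing import Dict, Optional


class InMemoryEngineCache:
    """Hypothetical stand-in for a custom engine cache (not the torch_tensorrt API)."""

    def __init__(self, max_bytes: int = 1 << 20) -> None:
        self.max_bytes = max_bytes          # assumed size budget for all blobs
        self._blobs: Dict[str, bytes] = {}  # subgraph hash -> pickled blob

    def used_bytes(self) -> int:
        return sum(len(b) for b in self._blobs.values())

    def has_available_cache_size(self, needed_size: int) -> bool:
        # True if a new blob of `needed_size` bytes fits in the budget.
        return self.used_bytes() + needed_size <= self.max_bytes

    def save(self, key: str, blob: bytes) -> bool:
        # Overwriting an entry frees its old bytes before the size check.
        existing = len(self._blobs.get(key, b""))
        if self.used_bytes() - existing + len(blob) > self.max_bytes:
            return False
        self._blobs[key] = blob
        return True

    def load(self, key: str) -> Optional[bytes]:
        # Returns None on a cache miss.
        return self._blobs.get(key)


cache = InMemoryEngineCache(max_bytes=1024)

# Placeholder blob: the field names here are illustrative, not the real layout.
blob = pickle.dumps(
    {"serialized_engine": b"\x00" * 16, "weight_name_map": {"fc.weight": "layer0"}}
)
stored = cache.save("subgraph-hash-abc", blob)
restored = pickle.loads(cache.load("subgraph-hash-abc"))
```

Because the key is independent of the weights, two models with the same architecture but different parameters would hit the same entry, which is why the tutorial text pairs cache reuse with refitting.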

github-actions

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/docsrc/conf.py	2024-09-03 20:34:09.598244+00:00
+++ /home/runner/work/TensorRT/TensorRT/docsrc/conf.py	2024-09-03 20:34:44.738932+00:00
@@ -91,11 +91,11 @@

# sphinx-gallery configuration
sphinx_gallery_conf = {
    "examples_dirs": "../examples",
    "gallery_dirs": "tutorials/_rendered_examples/",
-    "ignore_pattern": "utils.py"
+    "ignore_pattern": "utils.py",
}

# Setup the breathe extension
breathe_projects = {"Torch-TensorRT": "./_tmp/xml"}
breathe_default_project = "Torch-TensorRT"

Signed-off-by: Naren Dasan <[email protected]>

Re run

facebook-github-bot added the cla signed label Sep 3, 2024

github-actions bot requested a review from bowang007 September 3, 2024 20:07

github-actions bot requested changes Sep 3, 2024

docs: Adding words to the refit and engine caching tutorials

f9acc9a

Signed-off-by: Naren Dasan <[email protected]>

narendasan force-pushed the docs_update_refit branch from a4684e3 to d5e3a77 Compare September 3, 2024 20:33

github-actions bot previously requested changes Sep 3, 2024

docs: Adding words to the refit and engine caching tutorials

8492c62

Signed-off-by: Naren Dasan <[email protected]>

narendasan force-pushed the docs_update_refit branch from d5e3a77 to 8492c62 Compare September 3, 2024 20:55

narendasan merged commit 8759736 into main Sep 4, 2024
67 checks passed

narendasan deleted the docs_update_refit branch September 4, 2024 16:30
