feat: engine caching #2995

zewenli98 · 2024-07-10T21:50:29Z

Description

Engine caching feature. More details see: #2957

Type of change

New feature (non-breaking change which adds functionality)
This change requires a documentation update

Checklist:

My code follows the style guidelines of this project (You can use the linters)
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas and hacks
I have made corresponding changes to the documentation
I have added tests to verify my fix or my feature
New and existing unit tests pass locally with my changes
I have added the relevant labels to my PR in so that relevant reviewers are notified

github-actions

There are some changes that do not conform to Python style guidelines:

--- /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py	2024-07-16 20:09:24.911867+00:00
+++ /home/runner/work/TensorRT/TensorRT/py/torch_tensorrt/dynamo/backend/backends.py	2024-07-16 20:11:22.623270+00:00
@@ -100,11 +100,13 @@

            gm = post_lowering(gm, sample_inputs)

            logger.debug("Lowered Input graph:\n " + str(gm.graph))

-            torchtrt_inputs = prepare_inputs(torch_inputs, disable_memory_format_check=True)
+            torchtrt_inputs = prepare_inputs(
+                torch_inputs, disable_memory_format_check=True
+            )
            trt_compiled = compile_module(
                gm,
                torchtrt_inputs,
                settings=settings,
            )

narendasan

Do you have tests for this?

py/torch_tensorrt/dynamo/_settings.py

py/torch_tensorrt/dynamo/_compiler.py

py/torch_tensorrt/dynamo/_engine_caching.py

peri044

Added some comments. Functionality looks good. One question - Does the hash change if the module has the same graph but different weights ?

py/torch_tensorrt/dynamo/_compiler.py

py/torch_tensorrt/dynamo/_engine_caching.py

py/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py

py/torch_tensorrt/dynamo/_compiler.py

zewenli98 · 2024-08-21T17:19:10Z

Added some comments. Functionality looks good. One question - Does the hash change if the module has the same graph but different weights ?

Nope, all weights are set to 0 before hashing, so if the architectures are the same they will be considered isomorphic

py/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py

py/torch_tensorrt/dynamo/conversion/_conversion.py

narendasan · 2024-08-26T22:55:57Z

Added some comments. Functionality looks good. One question - Does the hash change if the module has the same graph but different weights ?

Nope, all weights are set to 0 before hashing, so if the architectures are the same they will be considered isomorphic

I dont really get where we refit for functionality in the code now if we store weight stripped (or zero'd) graphs. Also if we set all weights to 0 what happens in data-dependent cases? Can we detect these cases?

zewenli98 · 2024-08-27T00:52:42Z

I dont really get where we refit for functionality in the code now if we store weight stripped (or zero'd) graphs.

The current code calls refit after interpreter.run() but I plan to move it into interpreter.run().

Also if we set all weights to 0 what happens in data-dependent cases? Can we detect these cases?

The current engine caching happens after partition.
If an input is data dependent, does it generate different TRT engines given different actual inputs?

py/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py

py/torch_tensorrt/dynamo/_compiler.py

revert backend changes update dynamo path add save_engine_cache and load_engine_cache args support customizing engine cache class refactor and add LRU to clear cache fix bug

narendasan

LGTM!

facebook-github-bot added the cla signed label Jul 10, 2024

zewenli98 requested review from narendasan and peri044 July 10, 2024 21:50

github-actions bot added component: conversion Issues re: Conversion stage component: api [Python] Issues re: Python API component: dynamo Issues relating to the `torch.compile` or `torch._dynamo.export` paths component: torch_compile labels Jul 10, 2024

zewenli98 force-pushed the engine_cache branch from 31165b8 to d13a46b Compare July 16, 2024 20:10

github-actions bot removed the component: torch_compile label Jul 16, 2024

github-actions bot requested changes Jul 16, 2024

View reviewed changes

zewenli98 changed the title ~~[WIP] feat: engine caching~~ feat: engine caching Jul 23, 2024

zewenli98 force-pushed the engine_cache branch from beee5b4 to fbfc863 Compare August 7, 2024 18:39

github-actions bot added component: torch_compile and removed component: torch_compile labels Aug 7, 2024

zewenli98 force-pushed the engine_cache branch 2 times, most recently from cb8d30b to 59ba4a2 Compare August 14, 2024 09:46

narendasan reviewed Aug 15, 2024

View reviewed changes

narendasan reviewed Aug 20, 2024

View reviewed changes

py/torch_tensorrt/dynamo/_engine_caching.py Outdated Show resolved Hide resolved

narendasan reviewed Aug 20, 2024

View reviewed changes

py/torch_tensorrt/dynamo/_engine_caching.py Outdated Show resolved Hide resolved

peri044 reviewed Aug 21, 2024

View reviewed changes

zewenli98 force-pushed the engine_cache branch from 748b4c6 to a86260e Compare August 24, 2024 05:03

narendasan reviewed Aug 26, 2024

View reviewed changes

py/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py Show resolved Hide resolved

narendasan reviewed Aug 26, 2024

View reviewed changes

py/torch_tensorrt/dynamo/conversion/_conversion.py Outdated Show resolved Hide resolved

narendasan reviewed Aug 26, 2024

View reviewed changes

py/torch_tensorrt/dynamo/conversion/_conversion.py Outdated Show resolved Hide resolved

zewenli98 commented Aug 27, 2024

View reviewed changes

py/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py Outdated Show resolved Hide resolved

github-actions bot added the component: tests Issues re: Tests label Aug 28, 2024

zewenli98 force-pushed the engine_cache branch from 2562e2c to 6533d5c Compare August 28, 2024 02:52