"""Compile an ExportedProgram module for NVIDIA GPUs using TensorRT
@@ -141,6 +142,7 @@ def compile(
141
142
dryrun (bool): Toggle for "Dryrun" mode, running everything except conversion to TRT and logging outputs
142
143
hardware_compatible (bool): Build the TensorRT engines compatible with GPU architectures other than that of the GPU on which the engine was built (currently works for NVIDIA Ampere and newer)
143
144
timing_cache_path (str): Path to the timing cache if it exists (or) where it will be saved after compilation
145
+
lazy_engine_init (bool): Defer setting up engines until the compilation of all engines is complete. Can allow larger models with multiple graph breaks to compile but can lead to oversubscription of GPU memory at runtime.
144
146
**kwargs: Any,
145
147
Returns:
146
148
torch.fx.GraphModule: Compiled FX Module, when run it will execute via TensorRT
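As a usage sketch of the options documented above (the kwargs dict, the cache path, and the commented call shape are illustrative assumptions, not from the source): the actual `torch_tensorrt.dynamo.compile` call requires `torch_tensorrt` and a CUDA GPU, so it is shown commented out and only the option dict is executable here.

```python
# Hypothetical sketch: the compile options described in the docstring,
# gathered as keyword arguments. Values and the cache path are placeholders.
compile_kwargs = {
    "dryrun": False,              # True runs everything except TRT conversion
    "hardware_compatible": True,  # engines portable across Ampere-and-newer GPUs
    "timing_cache_path": "/tmp/timing.cache",  # loaded if present, saved after build
    "lazy_engine_init": True,     # defer engine setup until all engines are compiled
}

# The call itself (requires torch, torch_tensorrt, and a CUDA device):
# import torch
# import torch_tensorrt
# exp_program = torch.export.export(model, tuple(inputs))
# trt_gm = torch_tensorrt.dynamo.compile(exp_program, inputs=inputs, **compile_kwargs)
# out = trt_gm(*inputs)  # runs via TensorRT
```

`lazy_engine_init=True` trades peak GPU memory at runtime for the ability to finish compiling larger multi-break models, as the docstring notes.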
"""Takes a name, target device, serialized TensorRT engine, and binding names / order and constructs
43
+
a PyTorch ``torch.nn.Module`` around it. Uses TensorRT Python APIs to run the engine
44
+
45
+
Arguments:
46
+
serialized_engine (bytes): Serialized TensorRT engine in the form of a bytearray
47
+
input_binding_names (List[str]): List of input TensorRT engine binding names in the order they would be passed to the TRT modules
48
+
output_binding_names (List[str]): List of output TensorRT engine binding names in the order they should be returned
49
+
50
+
Keyword Arguments:
51
+
name (str): Name for module
52
+
settings (torch_tensorrt.dynamo.CompilationSettings): Settings used to compile engine, assumes engine was built with default compilation settings if object not passed
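To illustrate the binding-order contract above, a small sketch (the helper, names, and values are hypothetical, not part of the library): outputs are returned in the order given by `output_binding_names`, just as inputs are consumed in the order of `input_binding_names`.

```python
from typing import Dict, List

def order_outputs(results: Dict[str, float], output_binding_names: List[str]) -> List[float]:
    # Mirror the documented runtime behavior: return values follow the
    # order of output_binding_names, not the order results were produced in.
    return [results[name] for name in output_binding_names]

# Hypothetical binding names for a two-output engine
results = {"scores": 0.9, "boxes": 1.0}
ordered = order_outputs(results, ["boxes", "scores"])
```

A real module would be constructed from an engine built earlier, e.g. (not executable here) `PythonTorchTensorRTModule(serialized_engine, input_binding_names, output_binding_names, name="trt_block")`; the helper above only demonstrates why the binding-name order matters.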