Yes, the FX path only supports dynamic batch but not dynamic shapes. That is its dilemma, since tracing cannot provide this information.
Goal:
We currently do not support dynamically shaped models that contain aten::size operations.
Two relevant issues:
In this model, aten::size is the input to an aten::reshape layer. For a dynamically shaped input, aten::size just outputs -1 instead of a shape tensor, so shape information is not propagated down the network, resulting in errors.
Torchvision's resnet passes with dynamic shapes. However, an alternate implementation by Nvidia (used by the Model Navigator) fails.
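The failure mode can be illustrated with a small pure-Python sketch (hypothetical helper, not torch_tensorrt code): resolving a reshape target that contains a -1 wildcard requires the concrete element count of the input, and when an upstream aten::size has already collapsed a dynamic dimension to -1 that count is unknown.

```python
def infer_reshape(numel, target):
    """Resolve a reshape target that may contain a single -1 wildcard.

    numel: element count of the input, or -1 if the input shape is dynamic
    (mimicking aten::size reporting -1 for a dynamic dimension).
    """
    known = 1
    wildcard = None
    for i, d in enumerate(target):
        if d == -1:
            wildcard = i
        else:
            known *= d
    if wildcard is None:
        return tuple(target)
    if numel < 0:
        # Upstream size() gave -1, so the wildcard cannot be resolved:
        # shape information stops propagating here.
        raise ValueError("cannot resolve -1: input shape is dynamic")
    resolved = list(target)
    resolved[wildcard] = numel // known
    return tuple(resolved)

# Static input: 2*3*4 = 24 elements, reshape to (2, -1) -> (2, 12)
print(infer_reshape(24, (2, -1)))
```

With a static input the wildcard resolves cleanly; with a dynamic input the resolution fails, which is exactly the error the reshape converter hits.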
Torchvision implementation of final avg pool and FC layers: https://github.com/pytorch/vision/blob/main/torchvision/models/resnet.py#L279
Nvidia's implementation: https://github.com/NVIDIA/DeepLearningExamples/blob/c2bb3fea797403612a5ea8e359eb31e7e750374f/PyTorch/Classification/ConvNets/image_classification/models/resnet.py#L312
The latter implementation explicitly uses a `size()` call, which doesn't work with dynamic shapes. Both of these issues use dynamic batch, not dynamic shapes.
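The difference can be seen in miniature with plain PyTorch (illustrative only, not the actual model code): both classifier heads compute the same values, but the second one threads `x.size(0)` into `view`, and under tracing that value is captured as a plain Python int, freezing the traced batch size.

```python
import torch
import torch.nn as nn

x = torch.randn(2, 512, 7, 7)
pool = nn.AdaptiveAvgPool2d((1, 1))

# torchvision-style head: flatten via torch.flatten (trace-friendly)
a = torch.flatten(pool(x), 1)

# NVIDIA-style head: explicit size() call bakes the traced batch dim in
b = pool(x).view(x.size(0), -1)

assert torch.allclose(a, b)  # numerically identical on a static input
```

Numerically the two are interchangeable; only their behavior under tracing with dynamic batch differs.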
Both of these issues fail with the Torchscript backend. However, with some tweaks, FX backend compilation succeeds.
FX reshape converters already support accepting ITensors as a second input.
The main change required for successful compilation of the above models is to add dynamic shape support in FX.
Dynamic shape support in FX:
Currently FX has a `dynamic_batch=True` option, but the support is not fully implemented: https://github.com/pytorch/TensorRT/blob/main/py/torch_tensorrt/fx/input_tensor_spec.py#L21-L29 hardcodes the batch size ranges for dynamic batch inputs. For the dynamic batch case, if we modify this code to provide min, opt, and max batch sizes, 🐛 [Bug] Compilation failure for SSD300 model with dynamic batch #1555 runs fine.
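One possible shape of that change (a hypothetical helper, not the actual `input_tensor_spec` API) is to derive the optimization-profile shape ranges from user-supplied min/opt/max batch sizes rather than hardcoded ones:

```python
def batch_shape_ranges(shape, batch_dim=0, batch_range=(1, 8, 32)):
    """Build (min, opt, max) shapes for a TensorRT optimization profile
    from a single shape whose batch dimension is dynamic.

    batch_range holds the user-provided (min, opt, max) batch sizes.
    """
    ranges = []
    for b in batch_range:
        s = list(shape)
        s[batch_dim] = b  # only the batch dimension varies
        ranges.append(tuple(s))
    return tuple(ranges)

print(batch_shape_ranges((1, 3, 224, 224)))
# ((1, 3, 224, 224), (8, 3, 224, 224), (32, 3, 224, 224))
```

Extending the same idea to vary arbitrary dimensions, not just `batch_dim`, is essentially what full dynamic shape support would require.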
For dynamic shapes, verify the toy positional embedding bug, which has dynamic-shape inputs. The Torchscript backend fails on it; verify it with FX.
Proposal:
Our current Input class uses C++ code via the `._C` binding. It is called as `torch_tensorrt.Input()` and passed to `torch_tensorrt.compile`. On the Torchscript side, we convert this Input into a `_C.Input()` by calling `_to_internal` from `compile_spec.py`.
Make `torch_tensorrt.Input` compatible with FX. Use the same constructors and methods as the current Torchscript front end. This allows us to maintain consistency in how dynamically shaped inputs are provided between the two backends.
On the Torchscript side, in `compile_spec.py`, we could define a new `torch_tensorrt.ts.Input()` which consumes this `torch_tensorrt.Input` and handles the conversion to `_C.Input`.
On the FX side, we already have an `InputTensorSpec` object which defines an input spec for a model. Replace this to use the `torch_tensorrt.Input` class directly, which should also enable dynamic batch support.
The `torch_tensorrt.Input` class should be pure Python. If users install the fx-only path, we should ensure the `_C.Input()` dependency is not involved.
Although our current `torch_tensorrt.Input` class is flexible enough to accept both dynamic shapes and dynamic batch, we should issue a warning to users of the FX path that dynamic shapes are not supported.
Milestones:
MVP: (S-M)
Phase 2: (S-M)
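A minimal pure-Python sketch of the proposed `Input` class (names, defaults, and methods are illustrative, not the final API): it accepts either a static shape or a min/opt/max range, carries no `_C` dependency, and warns FX users when full dynamic shapes are requested.

```python
import warnings

class Input:
    """Pure-Python input spec usable by both TS and FX front ends (sketch)."""

    def __init__(self, shape=None, min_shape=None, opt_shape=None, max_shape=None):
        if shape is not None:
            self.shape = tuple(shape)
            self.shape_mode = "static"
        elif None not in (min_shape, opt_shape, max_shape):
            self.shape = {
                "min_shape": tuple(min_shape),
                "opt_shape": tuple(opt_shape),
                "max_shape": tuple(max_shape),
            }
            self.shape_mode = "dynamic"
        else:
            raise ValueError("provide shape, or all of min/opt/max_shape")

    def is_batch_only_dynamic(self):
        """True if only dim 0 varies across the min/opt/max range."""
        if self.shape_mode == "static":
            return False
        mins, opts, maxs = (
            self.shape[k] for k in ("min_shape", "opt_shape", "max_shape")
        )
        return all(a == b == c for a, b, c in zip(mins[1:], opts[1:], maxs[1:]))

    def warn_if_unsupported_on_fx(self):
        # FX path currently supports dynamic batch only, not dynamic shapes.
        if self.shape_mode == "dynamic" and not self.is_batch_only_dynamic():
            warnings.warn("FX path supports dynamic batch only, not dynamic shapes")
```

Under this sketch, the Torchscript front end would map a dynamic `Input` to `_C.Input` inside `compile_spec.py`, while the FX front end would consume it in place of `InputTensorSpec` and call something like `warn_if_unsupported_on_fx` before building engines.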