NXP backend: Add support for optimizing Conv+BN during QAT #16246
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16246
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 3 Unrelated Failures as of commit 8503477 with merge base b081123.
NEW FAILURE - The following job has failed.
BROKEN TRUNK - The following jobs failed but were present on the merge base. 👉 Rebase onto the viable/strict branch to avoid these failures.
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
This PR enables optimization of Conv+BatchNorm patterns during Quantization-Aware Training (QAT) in the NXP backend. The implementation leverages TorchAO's native Conv+BN fusion by conditionally skipping output quantization on Conv operations that are followed by BatchNorm in QAT mode. The key mechanism is disabling ExecuTorch's FuseBatchNormWithConvPass in QAT mode and introducing a new BatchNormPattern to preserve quantization for subsequent layers.
Key Changes:
- Added --use_qat command-line argument to enable QAT mode during model compilation
- Implemented conditional logic to skip Conv output quantization when followed by BatchNorm in QAT mode (see the sketch after this list)
- Added BatchNormPattern to quantize BatchNorm outputs while leaving inputs unquantized (for Conv+BN fusion)
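As a rough illustration of the skip-output-quantization check described above, here is a minimal sketch; the helper name conv_output_feeds_batch_norm and the plain torch.fx traversal are assumptions for illustration, not the PR's actual code.

import torch.fx

def conv_output_feeds_batch_norm(conv_node: torch.fx.Node) -> bool:
    # Illustrative helper: returns True when the Conv node's single consumer
    # is a batch-norm op. In QAT mode the Conv output can then be left
    # unquantized so that TorchAO's Conv+BN fusion can fold the pair.
    users = list(conv_node.users)
    if len(users) != 1:
        return False
    user = users[0]
    return user.op == "call_function" and "batch_norm" in str(user.target)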
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| examples/nxp/aot_neutron_compile.py | Adds --use_qat CLI argument and passes it to NeutronQuantizer and calibrate_and_quantize |
| backends/nxp/quantizer/neutron_quantizer.py | Registers BatchNormPattern and filters out FuseBatchNormWithConvPass when in QAT mode |
| backends/nxp/quantizer/patterns.py | Adds BatchNormPattern class and conditional output quantization logic in Conv patterns for QAT Conv+BN fusion |
| backends/nxp/tests/models.py | Adds ConvBNModule test model supporting Conv1d/2d/Transpose + BatchNorm combinations |
| backends/nxp/tests/test_quantizer.py | Adds parameterized test for Conv+BN fusion in QAT mode across different conv types, bias, and affine configurations |
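For orientation, a minimal sketch of what a Conv2d + BatchNorm2d test model like the ConvBNModule listed above could look like; the constructor arguments and channel sizes are assumptions, and the real module reportedly also covers Conv1d and transposed variants.

import torch

class ConvBNModule(torch.nn.Module):
    # Illustrative Conv2d + BatchNorm2d pair; bias/affine mirror the
    # configurations the parameterized tests are said to exercise.
    def __init__(self, bias: bool = True, affine: bool = True):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 8, kernel_size=3, bias=bias)
        self.bn = torch.nn.BatchNorm2d(8, affine=affine)

    def forward(self, x):
        return self.bn(self.conv(x))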
Comments suppressed due to low confidence (3)
backends/nxp/quantizer/patterns.py:444
- The output_specs variable is set conditionally in the QAT block (lines 432-438) but is never used in the return statement on line 444, which hardcodes [(conv_node,)] instead. This means the Conv+BatchNorm fusion logic for QAT mode has no effect for Conv1dPattern and ConvTranspose1dPattern, which inherit from this class. The variable should be initialized before the conditional block as output_specs = [(conv_node,)], and the return statement should use output=output_specs to match the pattern used in Conv2dPattern.
if self.is_qat:
    conv_users = conv_node.users
    possibly_bn = list(conv_users.keys())[0] if len(conv_users) == 1 else None
    if possibly_bn and _is_batch_norm(possibly_bn):
        output_specs = []
    else:
        output_specs = [(conv_node,)]
return PartitionAnchors(
    inputs=[(conv_node, NodeArgsIdx(0))],
    weights=[(conv_node, NodeArgsIdx(1), weight_quantization_spec)],
    biases=bias,
    output=[(conv_node,)],
)
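For reference, a minimal sketch of the fix the comment above describes, reusing the names from the quoted snippet (conv_node, bias, weight_quantization_spec, PartitionAnchors, NodeArgsIdx, _is_batch_norm):

# Sketch of the suggested fix: initialize output_specs up front, override it
# only when the Conv feeds a BatchNorm in QAT mode, and return it.
output_specs = [(conv_node,)]
if self.is_qat:
    conv_users = conv_node.users
    possibly_bn = list(conv_users.keys())[0] if len(conv_users) == 1 else None
    if possibly_bn and _is_batch_norm(possibly_bn):
        output_specs = []
return PartitionAnchors(
    inputs=[(conv_node, NodeArgsIdx(0))],
    weights=[(conv_node, NodeArgsIdx(1), weight_quantization_spec)],
    biases=bias,
    output=output_specs,
)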
backends/nxp/quantizer/patterns.py:436
- Variable output_specs is not used.
output_specs = []
backends/nxp/quantizer/patterns.py:438
- Variable output_specs is not used.
output_specs = [(conv_node,)]
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
    action="store_true",
    required=False,
    default=False,
    help="Use QAT mode for quantization (does not include QAT training)",
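To place those lines in context, here is a hedged sketch of how the flag might be registered and forwarded in examples/nxp/aot_neutron_compile.py; the parser setup and the keyword names used to forward the flag are assumptions, not the script's exact code.

import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--use_qat",
    action="store_true",
    required=False,
    default=False,
    help="Use QAT mode for quantization (does not include QAT training)",
)
args = parser.parse_args()
# The flag would then be forwarded to the quantization entry points, e.g.:
#   NeutronQuantizer(..., is_qat=args.use_qat)         # hypothetical keyword
#   calibrate_and_quantize(..., use_qat=args.use_qat)  # hypothetical keyword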
If quantization-aware training is not possible using this module, why include it? Just to show how it can be triggered? If so, perhaps a separate example module, or even just a README, might be better in my opinion.
The failing unit tests are related to the XNNPACK backend. Although I have not seen such errors in other PRs, the implementation in this PR should not interfere with that backend.
Summary
Enables optimization of Conv+BatchNorm during QAT. This involves:
- Disabling FuseBatchNormWithConvPass when in QAT mode

Test plan
Test cases for Conv+BatchNorm fusion and quantization were added.
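As a rough outline of the QAT flow such tests would exercise (the NeutronQuantizer import path, its default constructor, and the new QAT flag are assumptions, and the prepare_qat_pt2e/convert_pt2e import paths vary across PyTorch/TorchAO versions):

import torch
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_qat_pt2e
from executorch.backends.nxp.quantizer.neutron_quantizer import NeutronQuantizer

model = ConvBNModule(bias=True, affine=True)       # the ConvBNModule sketched earlier
example_inputs = (torch.randn(1, 3, 32, 32),)

exported = torch.export.export_for_training(model, example_inputs).module()
quantizer = NeutronQuantizer()                     # the PR's QAT flag is omitted here; exact argument is an assumption
prepared = prepare_qat_pt2e(exported, quantizer)   # TorchAO's native Conv+BN handling applies during QAT preparation
prepared(*example_inputs)                          # one fake-quant forward pass stands in for actual QAT training
converted = convert_pt2e(prepared)                 # the fused, quantized graph the tests would inspect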
cc @robert-kalmar @JakeStevens @digantdesai