
Conversation

@StrycekSimon
Collaborator

@StrycekSimon commented Dec 15, 2025

Summary

Enables optimization of Conv+BatchNorm during QAT. This involves:

  • Enabling TorchAO's native Conv+BatchNorm fusion (by disabling our FuseBatchNormWithConvPass when in QAT mode)
  • Removing output quantization of the convolution when it is followed by BatchNorm, so the native Conv+BN fusion can properly match the pattern to be replaced (see the sketch below)
  • Adding a BatchNorm output quantization pattern implementation
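
A minimal sketch of the second point, assuming FX graph nodes with a .users mapping; _is_batch_norm is the helper this PR uses, while _conv_output_should_stay_unquantized is a hypothetical name used only for illustration:

    def _conv_output_should_stay_unquantized(conv_node, is_qat: bool) -> bool:
        # Hypothetical helper illustrating the rule; not the PR's literal code.
        if not is_qat:
            return False
        users = list(conv_node.users)
        # Skip output quantization only when BatchNorm is the sole consumer of the
        # convolution, so the native Conv+BN fusion can match the unquantized pattern.
        return len(users) == 1 and _is_batch_norm(users[0])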

Test plan

Test cases for Conv+BatchNorm fusion and quantization were added.
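
For illustration, a hedged sketch of a Conv + BatchNorm test module in the spirit of the ConvBNModule mentioned in the review below; the actual model in backends/nxp/tests/models.py covers Conv1d/2d/Transpose variants and may differ in shapes and structure:

    import torch

    class ConvBNSketch(torch.nn.Module):
        """Small Conv2d followed by BatchNorm2d, parameterized over bias/affine."""

        def __init__(self, bias: bool = True, affine: bool = True):
            super().__init__()
            self.conv = torch.nn.Conv2d(3, 8, kernel_size=3, bias=bias)
            self.bn = torch.nn.BatchNorm2d(8, affine=affine)

        def forward(self, x):
            return self.bn(self.conv(x))

    # Example input matching the Conv2d above.
    example_inputs = (torch.randn(1, 3, 32, 32),)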

cc @robert-kalmar @JakeStevens @digantdesai

Copilot AI review requested due to automatic review settings December 15, 2025 11:44
@pytorch-bot

pytorch-bot bot commented Dec 15, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16246

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 3 Unrelated Failures

As of commit 8503477 with merge base b081123:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the CLA Signed label Dec 15, 2025
Contributor

Copilot AI left a comment

Pull request overview

This PR enables optimization of Conv+BatchNorm patterns during Quantization-Aware Training (QAT) in the NXP backend. The implementation leverages TorchAO's native Conv+BN fusion by conditionally skipping output quantization on Conv operations that are followed by BatchNorm in QAT mode. The key mechanism is disabling ExecuTorch's FuseBatchNormWithConvPass in QAT mode and introducing a new BatchNormPattern to preserve quantization for subsequent layers.

Key Changes:

  • Added --use_qat command-line argument to enable QAT mode during model compilation
  • Implemented conditional logic to skip Conv output quantization when followed by BatchNorm in QAT mode
  • Added BatchNormPattern to quantize BatchNorm outputs while leaving inputs unquantized for Conv+BN fusion (see the sketch below)
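
A very rough sketch of the shape such a pattern might take, reusing the PartitionAnchors fields visible in the review excerpt further below; the base class, method signature, matched op, and node lookup are assumptions, not this PR's literal code:

    class BatchNormPattern(QuantizationPattern):  # base class name assumed
        def partition_types(self):
            # The exact op targeted by the PR may differ.
            return [torch.ops.aten.batch_norm.default]

        def get_anchors(self, gm, fused_partition):
            bn_node = fused_partition[0].nodes[-1]  # node lookup assumed
            return PartitionAnchors(
                inputs=[],            # leave the input (the Conv output) unquantized
                weights=[],
                biases=[],
                output=[(bn_node,)],  # quantize only the BatchNorm output
            )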

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Summary per file:
  • examples/nxp/aot_neutron_compile.py: Adds the --use_qat CLI argument and passes it to NeutronQuantizer and calibrate_and_quantize
  • backends/nxp/quantizer/neutron_quantizer.py: Registers BatchNormPattern and filters out FuseBatchNormWithConvPass when in QAT mode
  • backends/nxp/quantizer/patterns.py: Adds the BatchNormPattern class and conditional output quantization logic in the Conv patterns for QAT Conv+BN fusion
  • backends/nxp/tests/models.py: Adds the ConvBNModule test model supporting Conv1d/2d/Transpose + BatchNorm combinations
  • backends/nxp/tests/test_quantizer.py: Adds a parameterized test for Conv+BN fusion in QAT mode across different conv types, bias, and affine configurations
Comments suppressed due to low confidence (3)

backends/nxp/quantizer/patterns.py:444

  • The output_specs variable is set conditionally in the QAT block (lines 432-438) but is never used in the return statement on line 444, which hardcodes [(conv_node,)] instead. This means the Conv+BatchNorm fusion logic for QAT mode has no effect for Conv1dPattern and ConvTranspose1dPattern, which inherit from this class. The variable should be initialized before the conditional block as output_specs = [(conv_node,)], and the return statement should use output=output_specs, matching the pattern used in Conv2dPattern.
        if self.is_qat:
            conv_users = conv_node.users
            possibly_bn = list(conv_users.keys())[0] if len(conv_users) == 1 else None
            if possibly_bn and _is_batch_norm(possibly_bn):
                output_specs = []
            else:
                output_specs = [(conv_node,)]

        return PartitionAnchors(
            inputs=[(conv_node, NodeArgsIdx(0))],
            weights=[(conv_node, NodeArgsIdx(1), weight_quantization_spec)],
            biases=bias,
            output=[(conv_node,)],
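
A sketch of the suggested fix, using the names from the excerpt above: initialize output_specs before the branch and return it, mirroring Conv2dPattern (the surrounding method is omitted):

        output_specs = [(conv_node,)]
        if self.is_qat:
            conv_users = conv_node.users
            possibly_bn = list(conv_users.keys())[0] if len(conv_users) == 1 else None
            if possibly_bn and _is_batch_norm(possibly_bn):
                # Leave the Conv output unquantized so the Conv+BN fusion can match.
                output_specs = []

        return PartitionAnchors(
            inputs=[(conv_node, NodeArgsIdx(0))],
            weights=[(conv_node, NodeArgsIdx(1), weight_quantization_spec)],
            biases=bias,
            output=output_specs,
        )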

backends/nxp/quantizer/patterns.py:436

  • Variable output_specs is not used.
                output_specs = []

backends/nxp/quantizer/patterns.py:438

  • Variable output_specs is not used.
                output_specs = [(conv_node,)]


@StrycekSimon marked this pull request as draft December 15, 2025 12:35
@StrycekSimon force-pushed the EIEX-650-fix-native-conv-batchnorm-fusing-leaving-artefacts branch from 6bb4c2d to 33cb6e4 on December 15, 2025 13:31
@StrycekSimon requested a review from Copilot December 15, 2025 13:43
@StrycekSimon force-pushed the EIEX-650-fix-native-conv-batchnorm-fusing-leaving-artefacts branch from 33cb6e4 to 0675a79 on December 15, 2025 13:48
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.



@StrycekSimon force-pushed the EIEX-650-fix-native-conv-batchnorm-fusing-leaving-artefacts branch from 0675a79 to 2e873bc on December 15, 2025 15:35
@StrycekSimon force-pushed the EIEX-650-fix-native-conv-batchnorm-fusing-leaving-artefacts branch from 2e873bc to 8503477 on December 15, 2025 16:17
@robert-kalmar added the module: nxp and release notes: nxp labels Dec 16, 2025
@robert-kalmar marked this pull request as ready for review December 17, 2025 07:52
action="store_true",
required=False,
default=False,
help="Use QAT mode for quantization (does not include QAT training)",
Collaborator

If the quantization aware training is not possible using this module, why include it? Just to show how it can be triggered? If so, perhaps a separate example module, or even just a README might be better in my opinion.

@StrycekSimon
Collaborator Author

The failing unittest jobs are related to the XNNPACK backend. Although I have not seen this error in other PRs, the implementation in this PR should not interfere with it.
