NXP backend: Per-channel quantization of convolution layer #14061
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14061
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 1 Cancelled Job. As of commit 1a3d9e5 with merge base 7e228ee:
NEW FAILURE - The following job has failed:
CANCELLED JOB - The following job was cancelled. Please retry:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Based on a conversation from this PR, I added a feature that uses the implementation. The old PR can be declined.
@pytorchbot label "release notes: nxp"
)
assert nodes[10].name == "aten_convolution_default"

@classmethod
Minor: I would move this method up to be 1st, to align with other tests.
This is pretty chaotic across our unittest tests. I will make it consistent.
Done ✅
"QDQDequantizeConverter",
"QDQPerTensorDequantizeConverter",
"QDQPerChannelDequantizeConverter",
"QDQQuantizeConverter",
only QDQDequantizer needs to be updated, not QDQQuantizeConverter too?
Correct, as there are no changes to QDQQuantizeConverter. The per-channel quantization scheme is used only for weights and biases, which are inputs (dequantize nodes).
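For illustration (this is not the PR's code, just a minimal numpy sketch), the difference is that per-channel quantization keeps one scale per output channel of the conv weight, whereas per-tensor uses a single scale for the whole tensor; the dequantize node then reconstructs the float weight from the integer data and those scales:

```python
import numpy as np

# Hedged sketch of symmetric per-channel weight quantization; helper names
# are assumptions for illustration, not the Neutron Quantizer implementation.
def quantize_per_channel(w, axis=0, qmax=127):
    # One scale per channel along `axis`, derived from that channel's max |value|.
    reduce_axes = tuple(i for i in range(w.ndim) if i != axis)
    scales = np.abs(w).max(axis=reduce_axes) / qmax
    shape = [1] * w.ndim
    shape[axis] = -1
    q = np.clip(np.round(w / scales.reshape(shape)), -qmax - 1, qmax).astype(np.int8)
    return q, scales

def dequantize_per_channel(q, scales, axis=0):
    # What a per-channel dequantize node computes: int data * per-channel scale.
    shape = [1] * q.ndim
    shape[axis] = -1
    return q.astype(np.float32) * scales.reshape(shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3)).astype(np.float32)  # [out_ch, in_ch, kH, kW]
q, scales = quantize_per_channel(w)        # axis 0 = one scale per output channel
w_hat = dequantize_per_channel(q, scales)  # reconstruction seen by the conv
```

Because each filter gets its own scale, a channel with a small dynamic range is not forced onto the grid of the largest channel, which is why this scheme is attractive for conv weights.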
list[tuple[fx.Node, NodeArgsIdx]]
| list[tuple[fx.Node, NodeArgsIdx, DerivedQuantizationSpec]]
),
spec: QuantizationSpec | None,
just curious, why switch from Optional to | None?
It's part of the move to Python 3.10 type hints, which lets us drop the imports from typing.
# pyre-ignore[6]: incompatible parameter type
annotate_inputs(anchors.inputs, input_act_qspec)
annotate_weights_or_biases(anchors.weights, weight_qspec)
is this function no longer used at all now and can be removed entirely?
Yes, it is replaced by annotate_inputs().
The error in Samsung's tests seems to be caused by a problem with downloading resources and is most likely unrelated. Update: I double-checked, and this error is also present in other PRs...
Summary
Adds per-channel quantization for the convolution layer and introduces a NodeArgsIdx class to the Neutron Quantizer for better handling of indices into a quantized node's args list.
NodeArgsIdx allows selection of nested objects, e.g. an object inside a list in a node's args list. It also simplifies the NeutronAtenQuantizer annotation process by using annotate_inputs() for inputs, weights, and biases.
Test plan
The implementation should be covered by either existing or newly added unit tests.
cc @digantdesai @JakeStevens @robert-kalmar @skywall @roman-janik-nxp