Fix Qwen3 recipe and update autoquant example cmd #749

meenchen · 2026-01-08T20:20:47Z

What does this PR do?

Type of change: Bug fix

Overview: ?

Usage

# Add a code snippet demonstrating how to use this

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed.
Is this change backward compatible?: Yes/No
Did you write any new necessary tests?: Yes/No
Did you add or update any necessary documentation?: Yes/No
Did you update Changelog?: Yes/No

Additional Information

Signed-off-by: weimingc <[email protected]>

copy-pr-bot · 2026-01-08T20:20:51Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

codecov · 2026-01-08T20:31:29Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.66%. Comparing base (68d604d) to head (6935660).
⚠️ Report is 20 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #749   +/-   ##
=======================================
  Coverage   74.65%   74.66%           
=======================================
  Files         192      192           
  Lines       18969    18975    +6     
=======================================
+ Hits        14162    14167    +5     
- Misses       4807     4808    +1

realAsma · 2026-01-14T18:30:23Z

examples/llm_ptq/example_utils.py

+    if model_type in ["qwen3moe", "qwen3next"] and qformat == "nvfp4":
+        # Disable the attention projection layers to retain accuracy
+        quant_cfg["quant_cfg"]["model*.*attn*in_proj*"] = {"enable": False}
+        quant_cfg["quant_cfg"]["model*.*attn*q_proj*"] = {"enable": False}
+        quant_cfg["quant_cfg"]["model*.*attn*k_proj*"] = {"enable": False}
+        quant_cfg["quant_cfg"]["model*.*attn*v_proj*"] = {"enable": False}
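
To make the reach of these wildcard keys concrete, here is a minimal sketch. It assumes the keys are matched against quantizer module names with shell-style wildcards (Python's `fnmatch`), and the module names listed are hypothetical illustrations, not names read from an actual Qwen3 checkpoint.

```python
from fnmatch import fnmatch

# The four wildcard keys added in the diff above.
ATTN_PATTERNS = [
    "model*.*attn*in_proj*",
    "model*.*attn*q_proj*",
    "model*.*attn*k_proj*",
    "model*.*attn*v_proj*",
]

# Hypothetical quantizer names, for illustration only.
NAMES = [
    "model.layers.0.self_attn.q_proj.weight_quantizer",
    "model.layers.0.self_attn.k_proj.input_quantizer",
    "model.layers.0.self_attn.o_proj.weight_quantizer",
    "model.layers.0.mlp.experts.3.up_proj.weight_quantizer",
    "model.layers.1.linear_attn.in_proj_qkvz.weight_quantizer",
]


def is_disabled(name: str) -> bool:
    """Return True if any attention-projection pattern matches the name."""
    return any(fnmatch(name, pattern) for pattern in ATTN_PATTERNS)


for name in NAMES:
    print(f"{name}: {'disabled' if is_disabled(name) else 'quantized'}")
```

Under that assumption, the q/k/v and `in_proj` entries are matched, while the `o_proj` and MLP expert entries are left alone, because none of the four patterns names them.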


Is there an option to skip this setting? As written, this hardcodes disabling quantization of the attention projections.

Cc @shengliangxu: a config system and model-based config examples would help improve the overall experience.

Sure. For now, feel free to add an additional flag for auto quant.

We can refactor this part once the config system is ready.
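
One way the flag suggested in this thread could look is sketched below. This is not the code merged in this PR: the function name `maybe_disable_attn_proj` and the parameter `disable_attn_proj` are hypothetical, and the config is treated as a plain nested dict.

```python
import copy

# Wildcard keys taken from the diff under review.
ATTN_PROJ_PATTERNS = (
    "model*.*attn*in_proj*",
    "model*.*attn*q_proj*",
    "model*.*attn*k_proj*",
    "model*.*attn*v_proj*",
)


def maybe_disable_attn_proj(quant_cfg, model_type, qformat, disable_attn_proj=True):
    """Return a copy of quant_cfg, optionally disabling attention projections.

    `disable_attn_proj` is a hypothetical opt-out flag; the default keeps the
    behavior shown in the diff.
    """
    cfg = copy.deepcopy(quant_cfg)  # leave the caller's config untouched
    applies = model_type in ("qwen3moe", "qwen3next") and qformat == "nvfp4"
    if disable_attn_proj and applies:
        for pattern in ATTN_PROJ_PATTERNS:
            cfg["quant_cfg"][pattern] = {"enable": False}
    return cfg


# Placeholder base config, for illustration only.
base = {"quant_cfg": {}, "algorithm": "max"}
default_cfg = maybe_disable_attn_proj(base, "qwen3moe", "nvfp4")
opt_out_cfg = maybe_disable_attn_proj(base, "qwen3moe", "nvfp4", disable_attn_proj=False)
```

Defaulting the flag to `True` preserves the accuracy-oriented behavior from the diff, while a caller such as an auto quant path could pass `False` to keep the attention projections in the search space.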

minor fix, update autoquant example cmd

6935660

Signed-off-by: weimingc <[email protected]>

meenchen self-assigned this Jan 14, 2026

meenchen marked this pull request as ready for review January 14, 2026 17:51

meenchen requested a review from a team as a code owner January 14, 2026 17:51

meenchen requested review from Edwardf0t1, cjluo-nv and realAsma January 14, 2026 17:51

realAsma reviewed Jan 14, 2026

realAsma requested a review from shengliangxu January 14, 2026 18:30

cjluo-nv approved these changes Jan 14, 2026

meenchen merged commit 6038451 into main Jan 14, 2026
35 checks passed

meenchen deleted the weimingc/fix_ptq branch January 14, 2026 19:57
