WNA16 does not apply optimized RTN for moe layers by default #1245

wenhuach21 · 2026-01-08T09:21:24Z

No description provided.

Copilot

Pull request overview

This PR addresses an issue where optimized RTN (Round-to-Nearest) quantization was not disabled by default for MoE (Mixture of Experts) layers in WNA16 configurations. The change adds MoE model detection and automatically disables optimized RTN for expert layers, where it is slow (see #1150), while still letting users override this behavior with `--enable_opt_rtn`.

Key Changes:

- Added an `is_moe_model()` utility function that detects MoE models by examining config keys and module names.
- Changed the default value of `disable_opt_rtn` from `True` to `None`, so the setting can be decided automatically when the user leaves it unset.
- Implemented logic that automatically disables optimized RTN for MoE expert layers unless the user explicitly enables it.
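The detection step described above can be sketched roughly as follows. This is not the code from `auto_round/utils/model.py`, which is not shown in this conversation; the function name `looks_like_moe`, the config keys in `MOE_CONFIG_KEYS`, and the `"expert"` substring test are all assumptions chosen for illustration.

```python
from typing import Any, Iterable, Mapping

# Assumed config keys that commonly report an expert count.
# The real is_moe_model() may check a different set.
MOE_CONFIG_KEYS = ("num_experts", "num_local_experts", "n_routed_experts")


def looks_like_moe(config: Mapping[str, Any], module_names: Iterable[str]) -> bool:
    """Hypothetical stand-in for is_moe_model(): config keys first, then names."""
    # A config key reporting more than one expert is the strongest signal.
    for key in MOE_CONFIG_KEYS:
        value = config.get(key)
        if isinstance(value, int) and value > 1:
            return True
    # Otherwise fall back to a name heuristic over the module names.
    return any("expert" in name.lower() for name in module_names)
```

With a real model, `config` could be the model config converted to a dict and `module_names` the names yielded by `model.named_modules()`; both are passed in as plain values here so the sketch runs without any framework installed.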

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `auto_round/utils/model.py` | Adds `is_moe_model()` function to detect MoE models through config inspection and module name checking |
| `auto_round/compressors/config.py` | Changes `disable_opt_rtn` default from `True` to `None` to allow automatic optimization |
| `auto_round/compressors/base.py` | Implements MoE detection and automatic optimized RTN disabling for expert layers, with improved logging and user override support |
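One way to read the `None` default is as a third state: `True` and `False` are explicit user choices, and `None` means "decide per layer". The sketch below illustrates that reading only. The function name, the `".experts."` name test, and the assumption that non-expert layers keep optimized RTN when the flag is unset are all hypothetical; the actual logic in `auto_round/compressors/base.py` is not shown in this conversation.

```python
from typing import Optional


def opt_rtn_disabled_for_layer(
    disable_opt_rtn: Optional[bool],
    layer_name: str,
    model_is_moe: bool,
) -> bool:
    """Return True if optimized RTN should be skipped for this layer (sketch)."""
    # An explicit user choice (True or False) always wins.
    if disable_opt_rtn is not None:
        return disable_opt_rtn
    # Unset (None): assumed automatic mode, which skips only MoE expert layers.
    # The ".experts." substring test is a hypothetical naming heuristic.
    return model_is_moe and ".experts." in layer_name
```

Under these assumptions, a user passing `--enable_opt_rtn` would correspond to `disable_opt_rtn=False`, which keeps optimized RTN on every layer, including experts.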


auto_round/compressors/base.py

yiliu30

Others LGTM.

auto_round/utils/model.py


wenhuach21 and others added 2 commits January 8, 2026 17:20

update

c13805f

[pre-commit.ci] auto fixes from pre-commit.com hooks

33b23d8

for more information, see https://pre-commit.ci

wenhuach21 marked this pull request as draft January 8, 2026 09:22

wenhuach21 changed the title ~~update~~ reproduce bug Jan 8, 2026

wenhuach21 marked this pull request as ready for review January 8, 2026 09:25

WeiweiZhang1 and others added 9 commits January 8, 2026 04:59

update fix

fce2600

Signed-off-by: Zhang, Weiwei1 <[email protected]>

revert

ae46a3d

Merge branch 'main' into fix_0108

9d6a864

[pre-commit.ci] auto fixes from pre-commit.com hooks

b166138

for more information, see https://pre-commit.ci

fix bug

e49d382

update

a3cc6d0

Merge branch 'fix_0108' of https://github.com/intel/auto-round into f…

f9ec5f8

…ix_0108 # Conflicts: # auto_round/compressors/base.py

fix

f59c6d1

[pre-commit.ci] auto fixes from pre-commit.com hooks

7d2a0d8

for more information, see https://pre-commit.ci

wenhuach21 changed the title ~~reproduce bug~~ WNA16 does not apply optimized RTN for moe layers by default Jan 9, 2026

wenhuach21 and others added 2 commits January 9, 2026 10:44

refine log info

7a3d77a

[pre-commit.ci] auto fixes from pre-commit.com hooks

bcf4121

for more information, see https://pre-commit.ci

wenhuach21 mentioned this pull request Jan 9, 2026

opt rtn/low_cpu_mem_usage is too slow for moe models #1150

Closed

wenhuach21 and others added 5 commits January 9, 2026 10:50

refine log info

a25272d

[pre-commit.ci] auto fixes from pre-commit.com hooks

190bb4c

for more information, see https://pre-commit.ci

refine

716e19a

refine

b361258

Merge branch 'fix_0108' of https://github.com/intel/auto-round into f…

8c7381b

…ix_0108

wenhuach21 requested a review from Copilot January 9, 2026 02:54

Copilot AI reviewed Jan 9, 2026

auto_round/compressors/base.py (resolved)

auto_round/compressors/base.py (resolved)

wenhuach21 requested review from WeiweiZhang1, n1ck-guo and yiliu30 January 9, 2026 02:55

yiliu30 approved these changes Jan 9, 2026

auto_round/utils/model.py (resolved)

fix

d2ad46b

wenhuach21 and others added 3 commits January 9, 2026 13:08

fix

244f4d8

Merge branch 'main' into fix_0108

f506043

[pre-commit.ci] auto fixes from pre-commit.com hooks

e7ed1db

for more information, see https://pre-commit.ci

wenhuach21 merged commit 9588bf9 into main Jan 9, 2026
28 checks passed

wenhuach21 deleted the fix_0108 branch January 9, 2026 05:45
