[fix] Handle router logits in Qwen 3 moe and Qwen 3 omni moe for aux loss #98

ngquangtrung57 · 2025-11-27T10:49:11Z

Motivation

Modifications

Commit Message Convention

Please follow our standardized commit message format:

[feat] - New features or functionality
[fix] - Bug fixes
[docs] - Documentation changes only
[style] - Code style changes (formatting, missing semicolons, etc.)
[refactor] - Code refactoring without changing functionality
[perf] - Performance improvements
[test] - Adding or updating tests
[chore] - Maintenance tasks, dependency updates, etc.
[ci] - CI/CD configuration changes

Examples:

[feat] add qwen omni iterable dataset support
[fix] resolve bagel model configuration error
[docs] update training guide with YAML examples

See CONTRIBUTING.md for more details.

CI/CD Checks

Your PR will automatically run the following checks:

Linting: Code formatting with black (line-length=120) and import sorting with isort
Run pre-commit run --all-files locally to verify before pushing

Checklist

Follow commit message convention (see above)
Run pre-commit run --all-files and ensure all checks pass
Format your code with black (line-length=120) and isort
Add unit tests for new functionality
Update documentation as needed, including docstrings or example tutorials
Ensure all CI/CD checks pass


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.


src/lmms_engine/models/qwen3_moe/qwen3_moe_liger.py

src/lmms_engine/models/qwen3_omni_moe/qwen3_omni_moe_liger.py

kcz358 · 2025-11-28T02:43:36Z

src/lmms_engine/models/qwen3_moe/qwen3_moe_ops.py

+    output_router_logits = (
+        output_router_logits
+        if output_router_logits is not None
+        else getattr(self.config, "output_router_logits", False)
+    )
+    all_router_logits = () if output_router_logits else None
+


Is output_router_logits always False by default?

https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507/blob/main/config.json

Do you think we should change it to True for training, since this patch will only be used in training mode?
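To make the concern concrete, here is a minimal sketch of how the flag could be resolved so that training defaults to True while an explicit argument still wins. The helper name `resolve_output_router_logits` and the `training` parameter are hypothetical and are not taken from the patch; the config object is a stand-in for a model config whose `output_router_logits` is False.

```python
from types import SimpleNamespace


def resolve_output_router_logits(explicit, config, training):
    """Decide whether router logits should be collected (hypothetical helper)."""
    # An explicit caller argument always takes precedence.
    if explicit is not None:
        return explicit
    # Assumption: in training mode we want the logits so the aux loss can be computed.
    if training:
        return True
    # Otherwise fall back to the config value, as the diff above does.
    return getattr(config, "output_router_logits", False)


# Stand-in for a config that ships with output_router_logits set to False.
config = SimpleNamespace(output_router_logits=False)

flag_eval = resolve_output_router_logits(None, config, training=False)
flag_train = resolve_output_router_logits(None, config, training=True)
flag_explicit = resolve_output_router_logits(False, config, training=True)

# Mirrors the accumulator initialisation from the diff.
all_router_logits = () if flag_train else None
```

With the config fallback alone, `flag_eval` stays False and no router logits are collected; the training branch is what turns collection on without requiring a config change.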

kcz358 · 2025-11-28T02:45:58Z

src/lmms_engine/models/qwen3_omni_moe/qwen3_omni_moe_ops.py

+    output_router_logits = (
+        output_router_logits
+        if output_router_logits is not None
+        else getattr(self.config, "output_router_logits", False)
+    )


Same problem here. I also checked the config for Qwen3VLMoe, and the same problem seems to exist in Qwen3 VL MoE. Do you think we should simply set this to True in training mode?

ngquangtrung57 · 2025-11-28

Sure, will fix this.
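For context on why the flag matters for the auxiliary loss, below is a simplified, pure-Python sketch of a Switch-Transformer-style load-balancing loss computed from router logits. It is an illustration under stated assumptions, not the implementation used in this repository: the function names are hypothetical, and returning 0.0 when no logits are available is an assumption made here to show how the loss can silently disappear when router logits are not collected.

```python
import math


def softmax(row):
    """Numerically stable softmax over one token's expert logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def load_balancing_loss(router_logits, num_experts, top_k):
    """Simplified aux loss: num_experts * sum_e(routed_fraction_e * mean_prob_e)."""
    # Assumption: with no router logits there is nothing to balance, so the loss is 0.
    if router_logits is None:
        return 0.0
    probs = [softmax(row) for row in router_logits]
    num_tokens = len(probs)
    # Fraction of tokens whose top-k selection includes each expert.
    routed = [0.0] * num_experts
    for p in probs:
        chosen = sorted(range(num_experts), key=lambda e: p[e], reverse=True)[:top_k]
        for e in chosen:
            routed[e] += 1.0 / num_tokens
    # Mean router probability assigned to each expert.
    mean_prob = [sum(p[e] for p in probs) / num_tokens for e in range(num_experts)]
    return num_experts * sum(f * q for f, q in zip(routed, mean_prob))


# Two tokens, two experts: one case spreads tokens evenly, the other collapses onto expert 0.
balanced = load_balancing_loss([[10.0, 0.0], [0.0, 10.0]], num_experts=2, top_k=1)
collapsed = load_balancing_loss([[10.0, 0.0], [10.0, 0.0]], num_experts=2, top_k=1)
missing = load_balancing_loss(None, num_experts=2, top_k=1)
```

The collapsed routing yields a larger loss than the balanced one, which is the signal the auxiliary loss is meant to provide; in this sketch, missing logits yield no signal at all.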

fix router logits for correct auxiliary loss handling on qwen 3 moe and qwen 3 omni moe

1fc6d76

chatgpt-codex-connector bot reviewed Nov 27, 2025

src/lmms_engine/models/qwen3_moe/qwen3_moe_liger.py (outdated)

src/lmms_engine/models/qwen3_omni_moe/qwen3_omni_moe_liger.py (outdated)

fix shape mismatch with rmpad

eaefeac

kcz358 reviewed Nov 28, 2025


ngquangtrung57 added 2 commits November 28, 2025 20:05

change default router logit to True

c80204e

change default to True in ops

db0c0da
