Conversation

@sychen52 (Contributor) commented Jan 9, 2026

What does this PR do?

Type of change: new feature

Overview:

Usage

cd ./examples/llm_ptq/
python hf_ptq.py \
    --pyt_ckpt_path Qwen/Qwen3-4B \
    --export_path /home/scratch.shiychen_coreai/quantized_models/Qwen3-4B-svdq \
    --qformat nvfp4_awq_svdquant --kv_cache_qformat none --sparsity_fmt dense --calib_size 8

Testing

Exported the quantized checkpoint and loaded it back.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes/No
  • Did you write any new necessary tests?: Yes/No
  • Did you add or update any necessary documentation?: Yes/No
  • Did you update Changelog?: Yes/No

Additional Information

codecov bot commented Jan 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.63%. Comparing base (fe52b2a) to head (2141906).
⚠️ Report is 9 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #754      +/-   ##
==========================================
- Coverage   74.68%   74.63%   -0.06%     
==========================================
  Files         192      192              
  Lines       18950    18995      +45     
==========================================
+ Hits        14153    14177      +24     
- Misses       4797     4818      +21     


@jingyu-ml (Contributor) left a comment:

LGTM overall, including the approach for fusing the QKV and FFN layers. The current resmooth + refusion process means the resulting model is not exactly identical to the original, but this appears to be the only viable option at the moment unless we can fuse these layers during calibration...
Thank you for your work!
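
For context on why the resmoothed model drifts: fused QKV (or gate/up) projections share one input activation, so each branch's per-channel AWQ smoothing scale must be collapsed into a single shared scale before fusion, and the weights requantize slightly differently afterwards. A minimal sketch of that merge step, with a hypothetical helper name and a geometric-mean merge rule assumed purely for illustration (not the code in this PR):

import torch

def resmooth_for_fusion(weights, scales):
    # AWQ smoothing stores each linear as W_i * diag(s_i). After QKV/FFN
    # fusion all branches must share one scale s, so every weight is
    # rescaled by s / s_i and then requantized; the requantized values
    # differ slightly from the per-layer calibration, hence the
    # "not exactly identical" model mentioned above.
    # Assumed merge rule (illustrative): elementwise geometric mean.
    shared = torch.stack([s.log() for s in scales]).mean(dim=0).exp()
    rescaled = [w * (shared / s) for w, s in zip(weights, scales)]
    return rescaled, shared

Whatever merge rule is used (max, mean, geometric mean), each layer's per-layer optimum is lost, which is why the fused model cannot be bit-identical to the original.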

QUANTIZATION_NONE,
QUANTIZATION_NVFP4,
QUANTIZATION_NVFP4_AWQ,
QUANTIZATION_NVFP4_SVDQUANT,
Contributor commented on the diff above:

Could you add a check to ensure the model is running on a single GPU and not in a distributed setup? I’m not sure that our current SVDQ implementation works correctly with multiple GPUs. We can remove this check later once we verify that SVDQ calibration functions properly in a multi-GPU setting.
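
A hedged sketch of what such a guard could look like, using standard torch.distributed queries (the function name, placement, and error message are assumptions, not part of this PR):

import torch.distributed as dist

def assert_single_gpu_for_svdquant():
    # SVDQuant calibration is unverified under multi-GPU, so fail fast
    # when a distributed process group with more than one rank is active.
    # Checking is_available() first keeps the guard safe on builds
    # compiled without distributed support.
    if dist.is_available() and dist.is_initialized() and dist.get_world_size() > 1:
        raise RuntimeError(
            "SVDQuant calibration currently supports a single GPU only; "
            f"got world_size={dist.get_world_size()}."
        )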

import torch

def svd(weight, rank):
    # Remember the original placement/precision so results can be cast back.
    original_device = weight.device
    original_dtype = weight.dtype
    # Upcast to float64 before the SVD (see the question below).
    weight_f64 = weight.to(dtype=torch.float64, device=original_device)
Collaborator commented:

Do we need f64?
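
One way to answer that empirically (a standalone experiment, not code from this PR): compare the relative rank-r reconstruction error when the SVD runs in float32 versus float64. If the two agree to several digits for representative weight shapes, the float64 round trip is probably unnecessary.

import torch

def svd_dtype_gap(weight, rank):
    # Relative rank-`rank` reconstruction error of the SVD in f32 vs f64.
    errs = {}
    for dtype in (torch.float32, torch.float64):
        w = weight.to(dtype)
        U, S, Vh = torch.linalg.svd(w, full_matrices=False)
        low_rank = (U[:, :rank] * S[:rank]) @ Vh[:rank, :]
        errs[str(dtype)] = ((w - low_rank).norm() / w.norm()).item()
    return errs

# Illustrative shape only; if the two errors agree to several digits,
# float32 is likely sufficient and avoids the f64 round trip on GPU.
print(svd_dtype_gap(torch.randn(4096, 4096), rank=32))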
