feat: hybrid modeling #125

tianhaox · 2025-11-24T13:46:37Z

Overview:

Rename sol mode of the database to silicon mode.
Add empirical mode, where latency = sol / empirical factor. The empirical factor is currently a constant and could become more sophisticated in the future; each op should define its own empirical factor.
Add hybrid mode, which falls back to empirical mode when silicon data is missing.

So in the future, when you add a new op, you can run a configuration without actual data by using sol, empirical, or hybrid mode. Hybrid mode should give you the best guess of the overall model performance.
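The relationship between the modes can be sketched as follows. This is a minimal illustration, not the code in perf_database.py: the `DatabaseMode` enum, the `estimate_latency` function, and the factor value 0.5 are all hypothetical names and numbers chosen for the example.

```python
from enum import Enum
from typing import Optional


class DatabaseMode(Enum):
    # Mode names follow the PR description; the enum itself is hypothetical.
    SILICON = "silicon"
    SOL = "sol"
    EMPIRICAL = "empirical"
    HYBRID = "hybrid"


# Hypothetical constant factor; the PR notes each op should have its own.
EMPIRICAL_FACTOR = 0.5


def estimate_latency(mode: DatabaseMode, sol_ms: float,
                     silicon_ms: Optional[float]) -> float:
    """Return a latency estimate in ms for a single op.

    sol_ms: the sol (theoretical) latency of the op.
    silicon_ms: the measured latency, or None when no data was collected.
    """
    if mode is DatabaseMode.SOL:
        return sol_ms
    if mode is DatabaseMode.EMPIRICAL:
        # latency = sol / empirical factor
        return sol_ms / EMPIRICAL_FACTOR
    if mode is DatabaseMode.SILICON:
        if silicon_ms is None:
            raise KeyError("no silicon data for this op")
        return silicon_ms
    # HYBRID: prefer measured data, fall back to the empirical estimate.
    if silicon_ms is not None:
        return silicon_ms
    return sol_ms / EMPIRICAL_FACTOR
```

With these assumptions, an op that has no measured data raises an error in silicon mode but still yields an estimate in hybrid mode.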

This is the agg mode of running qwen3 32b on h200. Compared with silicon mode, pure empirical mode has a much smaller diff than sol mode. If silicon data is missing for part of the model, hybrid mode can still give a very close result, unless the missing part accounts for the major portion of the latency.
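Why the size of the missing part matters can be illustrated with a small sketch. The op names, latencies, and the factor 0.5 below are made-up numbers for illustration only, not measurements from this PR, and `hybrid_relative_error` is a hypothetical helper.

```python
# Made-up per-op numbers (ms); "true_ms" stands in for the silicon measurement.
OPS = [
    {"name": "gemm", "sol_ms": 20.0, "true_ms": 32.0},
    {"name": "attention", "sol_ms": 6.0, "true_ms": 10.0},
    {"name": "allreduce", "sol_ms": 1.0, "true_ms": 3.0},
]


def hybrid_relative_error(ops, missing, empirical_factor=0.5):
    """Relative error of the hybrid total when `missing` ops lack silicon data.

    Ops named in `missing` use sol / empirical_factor; all others use the
    measured value, as in the hybrid fallback described above.
    """
    true_total = sum(op["true_ms"] for op in ops)
    hybrid_total = 0.0
    for op in ops:
        if op["name"] in missing:
            hybrid_total += op["sol_ms"] / empirical_factor
        else:
            hybrid_total += op["true_ms"]
    return abs(hybrid_total - true_total) / true_total


# A small op missing barely moves the total; a dominant op missing moves it a lot.
small_gap = hybrid_relative_error(OPS, {"allreduce"})
large_gap = hybrid_relative_error(OPS, {"gemm"})
print(f"allreduce missing: {small_gap:.1%}, gemm missing: {large_gap:.1%}")
```

In this toy setup the error from the fallback is weighted by how much of the total latency the missing op contributes, which is the intuition behind the statement above.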

Signed-off-by: Tianhao Xu <[email protected]>

copy-pr-bot · 2025-11-24T13:46:41Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

src/aiconfigurator/sdk/perf_database.py

Arsene12358 · 2025-11-28T07:46:36Z

Also, it seems we only enable this via webapp, what about users using the CLI?

tianhaox · 2025-11-28T08:22:31Z

Get the hybrid time, hybrid math and hybrid mem

It should be enabled later, when we do some refactoring of the config.

src/aiconfigurator/sdk/perf_database.py

davilu-nvidia

I'll approve first to unblock merging.

tianhaox · 2025-12-03T11:16:13Z

Also, it seems we only enable this via webapp, what about users using the CLI?

Added CLI support:
aiconfigurator cli default --model QWEN3_480B --total_gpus 32 --system b200_sxm --database_mode HYBRID
This command should work in hybrid mode, whereas it cannot run in silicon mode.

tianhaox added 2 commits November 23, 2025 19:57

call fa3 to do mla instead of flashmla

6dea47e

Signed-off-by: Tianhao Xu <[email protected]>

add emprical and hybrid mode for sdk and webapp

4b5881f

Signed-off-by: Tianhao Xu <[email protected]>

tianhaox requested review from AichenF, Arsene12358, YijiaZhao, ilyasher, jasonqinzhou, saturley-hall, simone-chen and xutizhou as code owners November 24, 2025 13:46

github-actions bot added the feat label Nov 24, 2025

merge main; enable debug log for hybrid fallback

eb7cd7f

Signed-off-by: Tianhao Xu <[email protected]>

Arsene12358 reviewed Nov 28, 2025

src/aiconfigurator/sdk/perf_database.py

fix docstring

3528133

Signed-off-by: Tianhao Xu <[email protected]>

Arsene12358 reviewed Nov 28, 2025

src/aiconfigurator/sdk/perf_database.py

davilu-nvidia reviewed Dec 1, 2025

src/aiconfigurator/sdk/perf_database.py

davilu-nvidia approved these changes Dec 3, 2025

tianhaox added 4 commits December 3, 2025 18:06

merge main to resolve conflict

d175928

Signed-off-by: Tianhao Xu <[email protected]>

fix webapp db mode visibility; add suggestion when failed

d48dcff

Signed-off-by: Tianhao Xu <[email protected]>

add database mode to cli

fdd1650

Signed-off-by: Tianhao Xu <[email protected]>

add readme for database mode

cf0cf76

Signed-off-by: Tianhao Xu <[email protected]>

tianhaox requested review from Ethan-ES and Harrilee as code owners December 3, 2025 11:07

fix build test

7b8df17

Signed-off-by: Tianhao Xu <[email protected]>

Arsene12358 approved these changes Dec 3, 2025

tianhaox merged commit 8c399e5 into ai-dynamo:main Dec 3, 2025
6 checks passed
