[webgpu] Support `Split-K` in more situations #26806

Jiawei-Shao · 2025-12-16T05:44:17Z

Description

This patch extends Split-K to larger `dim_inner` values (up to 3072) so that more models can benefit from it.
This patch also enables Split-K on gen-12lp.

Motivation and Context

With this PR, we see about a 30% improvement on jina-clip-v1-text-fp16 and about a 20% improvement on jina-embeddings-v2-base-code-fp16 on Lunar Lake iGPUs.

This patch adds the support of more `dim_inner` (up to 3072) for Split-K to optimize more models.
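
For readers unfamiliar with the technique, here is a minimal CPU-side sketch of what Split-K means, not the code in this patch. It splits the inner (K) dimension into chunks, computes a partial product per chunk, and accumulates the partials into the output. The names `kMaxDimInnerForSplitK`, `CanUseSplitK`, and `SplitKMatMul` are hypothetical and chosen only for illustration; the 3072 bound comes from the description above, and the actual heuristic in `webgpu_utils.cc` also depends on the GPU architecture and other dimensions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical constant: the upper bound on dim_inner quoted in the PR description.
constexpr std::size_t kMaxDimInnerForSplitK = 3072;

// Hypothetical predicate. The real check also considers the GPU architecture
// and the other matrix dimensions; this only models the dim_inner bound.
bool CanUseSplitK(std::size_t dim_inner) {
  return dim_inner > 0 && dim_inner <= kMaxDimInnerForSplitK;
}

// Reference Split-K matmul: C[m x n] = A[m x k] * B[k x n], all row-major.
// The K dimension is processed in chunks of `split_size`. On a GPU each chunk
// would be handled by separate workgroups and the partial sums would have to
// be combined (for example with atomic adds); here the chunks run sequentially.
std::vector<float> SplitKMatMul(const std::vector<float>& a,
                                const std::vector<float>& b,
                                std::size_t m, std::size_t n, std::size_t k,
                                std::size_t split_size) {
  assert(split_size > 0);
  assert(a.size() == m * k && b.size() == k * n);

  std::vector<float> c(m * n, 0.0f);
  for (std::size_t k_begin = 0; k_begin < k; k_begin += split_size) {
    const std::size_t k_end = std::min(k_begin + split_size, k);
    for (std::size_t i = 0; i < m; ++i) {
      for (std::size_t j = 0; j < n; ++j) {
        // Partial dot product over this chunk of the inner dimension.
        float partial = 0.0f;
        for (std::size_t kk = k_begin; kk < k_end; ++kk) {
          partial += a[i * k + kk] * b[kk * n + j];
        }
        // Accumulate the chunk's contribution into the output element.
        c[i * n + j] += partial;
      }
    }
  }
  return c;
}
```

Splitting K exposes more parallelism when the output matrix is small relative to `dim_inner`, at the cost of an extra accumulation step, which is why a threshold on `dim_inner` is a plausible tuning knob per architecture.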

Jiawei-Shao · 2025-12-16T06:37:06Z

@qjia7 @jchen10 PTAL, thanks!

onnxruntime/core/providers/webgpu/webgpu_utils.cc

guschmue · 2025-12-17T15:35:08Z

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

azure-pipelines · 2025-12-17T15:35:30Z

Azure Pipelines successfully started running 4 pipeline(s).

guschmue · 2026-01-06T16:09:14Z

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

azure-pipelines · 2026-01-06T16:09:35Z

Azure Pipelines successfully started running 4 pipeline(s).

guschmue · 2026-01-06T17:43:18Z

CI is nagging on linux and macos:

Jiawei-Shao · 2026-01-07T03:02:18Z

> CI is nagging on linux and macos:

Hi @guschmue, sorry about that! I've fixed this failure, PTAL, thanks!

guschmue · 2026-01-07T16:22:06Z

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

azure-pipelines · 2026-01-07T16:22:25Z

Azure Pipelines successfully started running 4 pipeline(s).

[webgpu] Support dim_inner <= 3072 for Split-K

daccef8

This patch adds the support of more `dim_inner` (up to 3072) for Split-K to optimize more models.

Jiawei-Shao marked this pull request as draft December 16, 2025 05:44

Jiawei-Shao marked this pull request as ready for review December 16, 2025 06:36

Support xe-3lpg

7c88d23

Jiawei-Shao changed the title ~~[webgpu] Support dim_inner <= 3072 for Split-K~~ [webgpu] Support Split-K in more situations Dec 16, 2025

Update comment

08c11b0

qjia7 reviewed Dec 17, 2025

onnxruntime/core/providers/webgpu/webgpu_utils.cc (outdated, resolved)

onnxruntime/core/providers/webgpu/webgpu_utils.cc (outdated, resolved)

guschmue added the ep:WebGPU ort-web webgpu provider label Dec 18, 2025

qjia7 requested review from fs-eire and guschmue December 23, 2025 07:02

Jiawei-Shao marked this pull request as draft December 29, 2025 02:10

Jiawei-Shao added 5 commits December 29, 2025 11:23

Support Gen12_LP

0367365

Fix thresholds on xe-lpg and remove xe-3lpg

466645e

Merge branch 'main' into more-dim-inner-for-splitk

2ff21dd

Use the values on gen-12lp 32EU as default ones on Intel

19fe69b

Merge branch 'main' into more-dim-inner-for-splitk

e5dee6d

Jiawei-Shao marked this pull request as ready for review January 4, 2026 08:30

Jiawei-Shao requested a review from qjia7 January 5, 2026 05:45

qjia7 previously approved these changes Jan 5, 2026

guschmue previously approved these changes Jan 6, 2026

View reviewed changes

guschmue enabled auto-merge (squash) January 6, 2026 16:08

Jiawei-Shao added 2 commits January 7, 2026 09:07

Fix compilation error on Linux and MacOS

c6e6fd0

Merge branch 'main' into more-dim-inner-for-splitk

8da3a33

auto-merge was automatically disabled January 7, 2026 01:09
Head branch was pushed to by a user without write access

Jiawei-Shao dismissed stale reviews from guschmue and qjia7 via 8da3a33 January 7, 2026 01:09

Jiawei-Shao requested a review from guschmue January 7, 2026 03:01

guschmue approved these changes Jan 7, 2026

guschmue merged commit 4a858a8 into microsoft:main Jan 7, 2026
105 of 178 checks passed
