[js/webgpu] Optimize InstanceNorm in some shapes #22637

Merged (1 commit) on Oct 30, 2024

qjia7 commented Oct 29, 2024

BUG #22031

This PR optimizes InstanceNorm in the following two situations:

  1. Increase workgroupSize if only one workgroup is dispatched.
  2. Avoid transpose if not necessary.
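
The two checks above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names `pickWorkgroupSize` and `isTransposeNoOp`, the sizes 64 and 256, and the exact condition used to decide that a transpose is unnecessary are all assumptions made for the example.

```typescript
// Hypothetical helpers for illustration only; names and constants are not from the PR.

/** Pick a workgroup size for a shader dispatched over `workgroupCount` workgroups. */
function pickWorkgroupSize(workgroupCount: number, defaultSize = 64, maxSize = 256): number {
  // If only one workgroup is dispatched, a small workgroup leaves most of the GPU idle,
  // so use a larger one to expose more parallelism inside that single group.
  return workgroupCount === 1 ? maxSize : defaultSize;
}

/** True when permuting `shape` by `perm` leaves the linear memory order unchanged. */
function isTransposeNoOp(shape: readonly number[], perm: readonly number[]): boolean {
  // Axes of size 1 do not affect memory order, so only the relative order
  // of the remaining axes matters.
  let lastAxis = -1;
  for (const axis of perm) {
    if (shape[axis] === 1) {
      continue;
    }
    if (axis < lastAxis) {
      return false;
    }
    lastAxis = axis;
  }
  return true;
}

// A single dispatched workgroup gets the larger size; several workgroups keep the default.
console.log(pickWorkgroupSize(1), pickWorkgroupSize(8));
// Moving a size-1 axis is only a reshape, so the transpose can be skipped.
console.log(isTransposeNoOp([2, 1, 5], [0, 2, 1]));
```

The value 256 matches the default WebGPU limit for invocations per workgroup; a real implementation would pick sizes based on the tensor shape and device limits.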

With this PR and PR #22577 combined, the overall time of the demucs model drops from 154.60 ms to 106.36 ms on my dGPUs.
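
For reference, the two timings above can be turned into a speedup and a percentage reduction as below. The variable names are illustrative, and the numbers cover both PRs together, so this does not isolate the contribution of this PR alone.

```typescript
// Timings quoted in the description (this PR and PR #22577 combined).
const beforeMs = 154.60;
const afterMs = 106.36;

// Speedup factor and relative reduction in overall time.
const speedup = beforeMs / afterMs;
const reductionPct = ((beforeMs - afterMs) / beforeMs) * 100;

console.log(`${speedup.toFixed(2)}x faster, ${reductionPct.toFixed(1)}% less time`);
// prints "1.45x faster, 31.2% less time"
```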

qjia7 commented Oct 29, 2024

@guschmue @fs-eire Please take a look, thanks.

fs-eire commented Oct 29, 2024

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline

fs-eire commented Oct 29, 2024

/azp run Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline

fs-eire commented Oct 29, 2024

/azp run Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline,CoreML CI Pipeline,Linux DNNL CI Pipeline,Linux MIGraphX CI Pipeline,Linux ROCm CI Pipeline

Azure Pipelines successfully started running 1 pipeline(s).

guschmue added the ep:WebGPU label (ort-web webgpu provider) on Oct 30, 2024
guschmue merged commit 04e696d into microsoft:main on Oct 30, 2024
60 checks passed
qjia7 deleted the opt-instance-norm branch on October 30, 2024 at 02:32
ishwar-raut1 pushed a commit to ishwar-raut1/onnxruntime that referenced this pull request Nov 19, 2024