
[js/webgpu] Fix issue to run model demucs #22074

Merged (2 commits) Sep 17, 2024
Conversation

@gyagp commented Sep 12, 2024

This fixes issue #22031 so that the demucs model can run.
For ConvTranspose, outputPadding.length can be 1 while spatialRank is 2; the fix appends enough 0s to outputPadding to cover every spatial dimension. Conv has a similar issue: kernelShape.length can sometimes be 1 while inputs[1].dims.length is 4. The fix likewise appends enough 0s to kernelShape.
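The padding logic described above can be sketched as a small helper (hypothetical name and shape; the actual fix lives in the Conv / ConvTranspose attribute handling in js/web):

```typescript
// Hypothetical helper mirroring the fix: pad a per-spatial-axis attribute
// (outputPadding for ConvTranspose, kernelShape for Conv) with trailing
// zeros until it has one entry per spatial dimension.
const padAttributeToRank = (attr: readonly number[], spatialRank: number): number[] =>
  attr.length < spatialRank
    ? [...attr, ...new Array(spatialRank - attr.length).fill(0)]
    : [...attr];

// e.g. a ConvTranspose with spatialRank 2 but a 1-element outputPadding:
// padAttributeToRank([1], 2) yields [1, 0]
```

An attribute that already covers all spatial dimensions is returned unchanged, so the helper is safe to apply unconditionally.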

@gyagp (Author) commented Sep 12, 2024

@qjia7, @fs-eire @guschmue PTAL
After the fix, model demucs can run, but performance is still not good: on my machine, wasm takes 5936 ms while webgpu takes 9557 ms.

@guschmue (Contributor)

/azp run ONNX Runtime Web CI Pipeline,Windows GPU CI Pipeline,Linux Android Emulator QNN CI Pipeline

@guschmue (Contributor)

/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,Windows ARM64 QNN CI Pipeline,Windows CPU CI Pipeline


Azure Pipelines successfully started running 1 pipeline(s).

@guschmue (Contributor)

/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,Windows x64 QNN CI Pipeline,Big Models


Azure Pipelines could not run because the pipeline triggers exclude this branch/path.


Azure Pipelines successfully started running 1 pipeline(s).

@guschmue (Contributor)

/azp run Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline


Azure Pipelines could not run because the pipeline triggers exclude this branch/path.

@gyagp (Author) commented Sep 13, 2024

run web CI

@fs-eire (Contributor) commented Sep 13, 2024

/azp run ONNX Runtime Web CI Pipeline


Azure Pipelines successfully started running 1 pipeline(s).

@fs-eire (Contributor) commented Sep 15, 2024

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CUDA CI Pipeline,Windows GPU DML CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows GPU TensorRT CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline

@fs-eire (Contributor) commented Sep 15, 2024

/azp run Linux GPU TensorRT CI Pipeline,Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models

@fs-eire (Contributor) commented Sep 15, 2024

/azp run Linux Android Emulator QNN CI Pipeline,Android CI Pipeline,iOS CI Pipeline,ONNX Runtime React Native CI Pipeline


Azure Pipelines could not run because the pipeline triggers exclude this branch/path.


Azure Pipelines successfully started running 1 pipeline(s).

1 similar comment

Azure Pipelines successfully started running 1 pipeline(s).

@loretoparisi

> @qjia7, @fs-eire @guschmue PTAL After the fix, model demucs can run, but performance is still not good: on my machine, wasm takes 5936 ms while webgpu takes 9557 ms.

@gyagp do you think we could run any test that can help investigate this issue? Wondering if this could be related to a CPU-to-GPU copy, as happened in #21618.
Thank you for your help!

@fs-eire fs-eire merged commit 2db6b73 into microsoft:main Sep 17, 2024
53 checks passed
@gyagp (Author) commented Sep 19, 2024

There is no copy between CPU and GPU. The profiling data is below (top 10 time-consuming ops); MatMul takes half of the total time. We will try to optimize it.
[image: profiling table of the top 10 time-consuming ops]
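For anyone wanting to reproduce per-kernel numbers like the table above: onnxruntime-web exposes a WebGPU profiling flag that logs per-kernel timings. A minimal sketch, assuming a browser with WebGPU, the `onnxruntime-web/webgpu` bundle, and a placeholder model URL and input shape of your own (the real demucs input shape depends on the export):

```typescript
import * as ort from 'onnxruntime-web/webgpu';

// Enable per-kernel WebGPU timing output (logged to the devtools console).
ort.env.webgpu.profiling = { mode: 'default' };

async function profileModel(modelUrl: string): Promise<void> {
  const session = await ort.InferenceSession.create(modelUrl, {
    executionProviders: ['webgpu'],
  });
  // Illustrative dummy input; substitute the shape your model expects.
  const feeds: Record<string, ort.Tensor> = {
    [session.inputNames[0]]: new ort.Tensor(
      'float32',
      new Float32Array(1 * 2 * 44100),
      [1, 2, 44100],
    ),
  };
  await session.run(feeds); // kernel timings appear in the console
}
```

This is configuration of a real flag (`env.webgpu.profiling`), but the surrounding harness (function name, shapes) is illustrative only.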

4 participants