
[Web] BiRefNet_T not working on webgpu #21968

Open
guschmue opened this issue Sep 3, 2024 · 12 comments
@guschmue
Contributor

guschmue commented Sep 3, 2024

Describe the issue

https://huggingface.co/onnx-community/BiRefNet_T does not work on webgpu

To reproduce

See https://huggingface.co/onnx-community/BiRefNet_T

Urgency

No response

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1.19

Execution Provider

'webgpu' (WebGPU)

@guschmue guschmue added platform:web issues related to ONNX Runtime web; typically submitted using template ep:WebGPU ort-web webgpu provider labels Sep 3, 2024
@prathikr prathikr self-assigned this Sep 3, 2024
@guschmue
Contributor Author

guschmue commented Sep 13, 2024

I looked briefly at this: the model takes a lot of memory for activations and will not work with wasm32.
If I run it on webgpu I see it using close to 7GB of GPU memory. In theory, if your GPU has that kind of memory it should work, BUT the model uses GatherND and ScatterND, which we have not implemented for webgpu, so those nodes fall back to wasm and then run out of memory on the wasm side.
We could implement GatherND and ScatterND, which should make the model work on high-end GPUs, but most users won't have those high-end GPUs.
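
For anyone trying to confirm the fallback locally, here is a minimal sketch (the model path is illustrative) that turns on verbose logging so the node-to-EP placement, including any wasm fallback for GatherND/ScatterND, shows up in the console:

```js
// Illustrative sketch: enable verbose logging to see which nodes land on
// the WebGPU EP and which ones fall back to wasm/CPU.
import * as ort from 'onnxruntime-web/webgpu';

ort.env.debug = true;
ort.env.logLevel = 'verbose';

// Top-level await: assumes this runs in an ES module.
const session = await ort.InferenceSession.create('./birefnet_t.onnx', {
  executionProviders: ['webgpu'],
  logSeverityLevel: 0, // 0 = verbose for this session as well
});
```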

@tidus2102

Hi @guschmue, thanks for your support. It would be great if you could make it work on webgpu (at least with high-end GPUs). About the memory: do you mean we can only run this model with wasm64?

@fs-eire
Contributor

fs-eire commented Oct 8, 2024

If we can get GatherND and ScatterND working on WebGPU, then by using external data side-loading it should be possible to make it work in wasm32. And there is an ongoing effort to support a wasm64 build.
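
For reference, a rough sketch of what external data side-loading looks like with the onnxruntime-web session options (the file names are illustrative and must match the external-data location recorded in the model):

```js
import * as ort from 'onnxruntime-web/webgpu';

// Side-load the weights so they never have to fit inside the wasm32 heap;
// ORT fetches the external-data file separately.
const session = await ort.InferenceSession.create('./model.onnx', {
  executionProviders: ['webgpu'],
  externalData: [
    { path: 'model.onnx_data', data: './model.onnx_data' },
  ],
});
```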

@xenova
Contributor

xenova commented Oct 9, 2024

Possibly related: https://huggingface.co/onnx-community/DepthPro-ONNX also throws an error for me:

An uncaught WebGPU validation error was raised: The number of storage buffers (36) in the Compute stage exceeds the maximum per-stage limit (8).

  • While validating binding counts
  • While validating [BindGroupLayoutDescriptor]
  • While calling [Device].CreateComputePipeline([ComputePipelineDescriptor "Concat"]).

Another error:

Uncaught (in promise) Error: [WebGPU] Kernel "[Concat] /encoder/Concat_23" failed. Error: non concat dimensions must match


The model works correctly in Node.js (CPU)
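
For context, the per-stage limit in that first error can be inspected directly with the WebGPU API, independent of ORT (a small sketch):

```js
// The default WebGPU device limit for storage buffers per shader stage is 8;
// the adapter may support more, but a device only gets a higher limit if it
// is explicitly requested when the device is created.
const adapter = await navigator.gpu.requestAdapter();
console.log(
  'maxStorageBuffersPerShaderStage:',
  adapter.limits.maxStorageBuffersPerShaderStage,
);
```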

@xenova
Contributor

xenova commented Nov 14, 2024

https://huggingface.co/briaai/RMBG-2.0 is a new birefnet-based model for state-of-the-art background removal. Would be useful to test too!

@tidus2102

Hi, just kindly checking whether there have been any updates or progress regarding this issue. Thank you!

@fs-eire
Contributor

fs-eire commented Dec 2, 2024

The "The number of storage buffers exceeds the maximum per-stage limit" error is a known issue for Concat. This issue is being tracked.

@tidus2102

If we can get GatherND and ScatterND working on WebGPU, then by using external data side-loading it should be possible to make it work in wasm32. And there is an ongoing effort to support a wasm64 build.

Hi, any update on this?

@fs-eire
Contributor

fs-eire commented Dec 4, 2024

#22847 and #22755 introduced implementations of GatherND and ScatterND. Please allow a day or two for the pipeline to publish a nightly package.
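
Assuming the usual channel for ORT web nightlies (the `dev` dist-tag on npm), picking up the new ops would look roughly like this once the package is published:

```js
// Assumed install command for the nightly build:
//   npm install onnxruntime-web@dev
import * as ort from 'onnxruntime-web/webgpu';

// With the GatherND/ScatterND implementations in, these nodes should stay
// on the WebGPU EP instead of falling back to wasm.
const session = await ort.InferenceSession.create('./birefnet_t.onnx', {
  executionProviders: ['webgpu'],
});
```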

@tidus2102

tidus2102 commented Dec 27, 2024

Hi, I've just tested the latest ORT nightly dev build, but I still get the memory error when running inference with the BiRefNet_lite ONNX model on Chrome 131.0.6778.205 arm64 - macOS 15.2 (24C101).
Here is the sample code.

Please help to check again. Thank you!

@xenova
Contributor

xenova commented Mar 2, 2025

the model uses GatherND and ScatterND, which we have not implemented for webgpu, so those nodes fall back to wasm and then run out of memory on the wasm side.
We could implement GatherND and ScatterND, which should make the model work on high-end GPUs, but most users won't have those high-end GPUs.

Now that we have GatherND and ScatterND ops implemented, should this be working?

@xenova
Contributor

xenova commented Mar 7, 2025

I've released a few more birefnet models we can use for testing: https://huggingface.co/models?library=transformers.js&other=birefnet. Unfortunately, it's still an issue today 😅
