@fs-eire Can you help here please? 🙏
Thank you for using onnxruntime-web. The questions are really good and this is going to be a long answer.
Description:
I am working with ONNX Runtime using the WebGPU backend, which requires importing and running a `.wasm` file. While optimizing the build, I observed the following:
`./build.sh --config MinSizeRel --build_wasm --skip_tests --disable_wasm_exception_catching --use_jsep --enable_wasm_simd --enable_wasm_threads --include_ops_by_config $ORT_MODELS_CONFIG --enable_reduced_operator_type_support --allow_running_as_root --parallel --target onnxruntime_webassembly`
Adding `--minimal_build extended` reduces the `.wasm` file further to ~3 MB, but this causes some operations to fall back to the CPU (observed when using `.ort` models, whereas `.onnx` models run fully on GPU).
Key Questions:
… `.wasm` file. Could you explain the key architectural differences between WebGPU and WebGL in ONNX Runtime that necessitate a WASM dependency for WebGPU?
Additional Context:
The `.ort` format appears to reintroduce CPU dependencies in minimal builds, whereas `.onnx` avoids this. Is this expected behavior? In my experience, some operations have started to use WASM kernels instead of WebGPU. I will provide examples of such behavior later.
Request:
A detailed explanation of:
Thank you for your insights!
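For anyone reproducing the two builds, the command from the description can be wrapped in a small script. This is only a sketch: it prints the command instead of running it, the flags are copied from the description, and the function name `ort_build_cmd` and the default config path `./required_operators.config` are placeholders made up for illustration, not something taken from the ONNX Runtime repository.

```shell
#!/bin/sh
# Sketch only: assembles the build command from the description and prints it.
# Nothing is compiled here. Run the printed command from the onnxruntime
# source root if you want to perform the actual build.
set -eu

# ort_build_cmd CONFIG_PATH [minimal]
#   CONFIG_PATH  operator config passed to --include_ops_by_config
#   minimal      optional; appends the flag that gave the ~3 MB build
# (Hypothetical helper name, introduced only for this example.)
ort_build_cmd() {
  config_path="$1"
  variant="${2:-}"

  cmd="./build.sh --config MinSizeRel --build_wasm --skip_tests"
  cmd="$cmd --disable_wasm_exception_catching"
  cmd="$cmd --use_jsep --enable_wasm_simd --enable_wasm_threads"
  cmd="$cmd --include_ops_by_config $config_path"
  cmd="$cmd --enable_reduced_operator_type_support"
  cmd="$cmd --allow_running_as_root --parallel"
  cmd="$cmd --target onnxruntime_webassembly"

  if [ "$variant" = "minimal" ]; then
    cmd="$cmd --minimal_build extended"
  fi

  echo "$cmd"
}

# Default path is a placeholder (assumption); override via ORT_MODELS_CONFIG.
ort_build_cmd "${ORT_MODELS_CONFIG:-./required_operators.config}"
```

Calling `ort_build_cmd ./my_ops.config minimal` prints the same command with `--minimal_build extended` appended, which makes it easier to build both variants and compare the resulting `.wasm` sizes.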