Description
I really love the idea behind this project, but I'm struggling to get vision models to work properly. Judging from the other issues, my experience doesn't seem to be unique.
I'll explain everything I did and link to related issues in the hope of consolidating some information. See the questions at the bottom.
Apologies in advance for the length.
What I've tried
For each example, I cleared the cache between runs using:
(await window.caches.keys()).forEach(async k => k.includes("webllm") && await window.caches.delete(k))

1) Basic usage
Following the example outlined in the docs.
Code
const modelName = "Phi-3.5-vision-instruct-q4f16_1-MLC";
const engine = new MLCEngine({
initProgressCallback: (p) => {
console.log(p);
}
});
await engine.reload(modelName);

The cache will populate fine, but then the following error occurs.
Error
ValueError: Cannot find parameter in cache: vision_embed_tokens.img_processor.vision_model.embeddings.position_embedding.q_weight
01b7e676:0x160422 [FATAL] /Users/cfruan/Documents/tvm/web/../src/runtime/relax_vm/ndarray_cache_support.cc:333: ValueError: Cannot find parameter in cache: vision_embed_tokens.img_processor.vision_model.embeddings.position_embedding.q_weight
overrideMethod @ hook.js:608
put_char @ @mlc-ai_web-llm.js?v=0336b5a7:4635
write @ @mlc-ai_web-llm.js?v=0336b5a7:4604
write @ @mlc-ai_web-llm.js?v=0336b5a7:5723
doWritev @ @mlc-ai_web-llm.js?v=0336b5a7:6275
_fd_write @ @mlc-ai_web-llm.js?v=0336b5a7:6287
$func1677 @ 01b7e676:0x160422
$func1684 @ 01b7e676:0x16099f
$func1685 @ 01b7e676:0x160a4c
$func1911 @ 01b7e676:0x16de87
$func1781 @ 01b7e676:0x16a360
$func1798 @ 01b7e676:0x16aca1
$func1799 @ 01b7e676:0x16acf3
$_ZN3tvm7runtime6detail12LogFatalImplERKNSt3__212basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEEiSA_ @ 01b7e676:0x20a86
$func20 @ 01b7e676:0x204bd
$func459 @ 01b7e676:0x6964b
$func458 @ 01b7e676:0x69269
$func1208 @ 01b7e676:0xe3df7
$TVMFuncCall @ 01b7e676:0x271f6
packedFunc @ @mlc-ai_web-llm.js?v=0336b5a7:8528
getParamsFromCacheByName @ @mlc-ai_web-llm.js?v=0336b5a7:7618
LLMChatPipeline @ @mlc-ai_web-llm.js?v=0336b5a7:13990
(anonymous) @ @mlc-ai_web-llm.js?v=0336b5a7:20167
fulfilled @ @mlc-ai_web-llm.js?v=0336b5a7:2240
index.tsx:98 ExitStatus {name: 'ExitStatus', message: 'Program terminated with exit(1)', status: 1}
overrideMethod @ hook.js:608
#init_model @ index.tsx:9
See also #640 for the same error. The error also occurs on the https://chat.webllm.ai/ site.
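As a sanity check on whether the published weights and the default model_lib have drifted apart, I also dumped the ModelRecord that web-llm resolves for this model from its built-in prebuiltAppConfig. This is just a small lookup sketch using the exported prebuiltAppConfig; the field names are whatever my installed version exposes:

import { prebuiltAppConfig } from "@mlc-ai/web-llm";

// Find the ModelRecord that engine.reload() will use for this model_id,
// to see which prebuilt wasm the weights are expected to pair with.
const record = prebuiltAppConfig.model_list.find(
  (m) => m.model_id === "Phi-3.5-vision-instruct-q4f16_1-MLC"
);
console.log("weights:", record?.model);
console.log("model_lib:", record?.model_lib);

In my case the resolved model_lib points at a v0_2_48 lib, which is what prompted attempt 2 below.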
2) Updating model_lib
The next attempt was to override the model_lib, pointing to v0_2_80 instead of the v0_2_48 used in the default config.
Code
const modelName = "Phi-3.5-vision-instruct-q4f16_1-MLC";
const engine = new MLCEngine({
appConfig: {
model_list: [{
model: "https://huggingface.co/mlc-ai/Phi-3.5-vision-instruct-q4f16_1-MLC",
model_id: modelName,
model_type: ModelType.VLM,
model_lib: "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/refs/heads/main/web-llm-models/v0_2_80/Phi-3.5-vision-instruct-q4f16_1-ctx4k_cs2k-webgpu.wasm",
}, ],
},
initProgressCallback: (p) => {
console.log(p);
},
});
await engine.reload(modelName);

Error
This error seems to be an issue with the wasm file itself:
LinkError: WebAssembly.instantiate(): Import #2 "env" "TVMFFIWasmSafeCall": function import requires a callable

See also #700 for the same error.
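To understand the LinkError a little better, I also listed the imports the wasm itself declares. This is plain WebAssembly API, nothing web-llm specific; the URL is the same model_lib as above:

// Fetch the model_lib and list the "env" functions it expects the host to
// provide. If the wasm was compiled against a newer TVM runtime than the one
// bundled with the installed web-llm package, imports such as
// TVMFFIWasmSafeCall have nothing to bind to and instantiation fails with the
// LinkError above.
const libUrl =
  "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/refs/heads/main/web-llm-models/v0_2_80/Phi-3.5-vision-instruct-q4f16_1-ctx4k_cs2k-webgpu.wasm";
const bytes = await (await fetch(libUrl)).arrayBuffer();
const module = await WebAssembly.compile(bytes);
for (const imp of WebAssembly.Module.imports(module)) {
  if (imp.module === "env") console.log(imp.kind, imp.name);
}

That at least shows which runtime symbols the lib expects, though I don't know how to map those back to a specific web-llm release.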
3) Using Gemma-3
I also tried to use Gemma-3.
Code
const modelName = "gemma-3-1b-it-q4f16_1-MLC";
const engine = new MLCEngine({
appConfig: {
model_list: [{
model: "https://huggingface.co/mlc-ai/gemma-3-1b-it-q4f16_1-MLC",
model_id: modelName,
model_type: ModelType.VLM,
model_lib: "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/refs/heads/main/web-llm-models/v0_2_80/gemma3-1b-it-q4f16_1-ctx4k_cs1k-webgpu.wasm",
}, ],
},
initProgressCallback: (p) => {
console.log(p);
},
});
await engine.reload(modelName);

Error
This is the same error as above:
LinkError: WebAssembly.instantiate(): Import #2 "env" "TVMFFIWasmSafeCall": function import requires a callable

4) Compile models
Per the comment in #700, I had to compile the models myself, which involved building tvm and mlc-llm from source.
- See the lib file on GitHub
- See the model on Hugging Face
I tried pointing to my GitHub file and Hugging Face repo:
const modelName = "gemma-3-1b-it-q4f16_1-MLC";
const engine = new MLCEngine({
appConfig: {
model_list: [{
model: "https://huggingface.co/charlesLoder/gemma-3-1b-it-q4f16_1-MLC",
model_id: modelName,
model_type: ModelType.VLM,
model_lib: "https://raw.githubusercontent.com/charlesLoder/mlc-compiled-models/refs/heads/main/libs/gemma-3-1b-it-q4f16_1-webgpu.wasm",
}, ],
},
initProgressCallback: (p) => {
console.log(p);
},
});
await engine.reload(modelName);

And got basically the same error as before:
LinkError: WebAssembly.instantiate(): Import #0 "env" "TVMFFIEnvSetStream": function import requires a callable

The only difference is #0 instead of #2.
Questions
- I assumed compiling the models after building tvm and mlc-llm from source would've worked. Did I do something wrong in how I compiled them? (I'm way out of my element doing that.)
- Is the slightly different error significant?
- Is there something else I'm missing?
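For context, this is the kind of call I'm ultimately hoping to make once a vision model loads, if I'm reading the OpenAI-style API correctly (the image URL is just a placeholder):

// An OpenAI-style chat completion with an image part.
const reply = await engine.chat.completions.create({
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Describe this image in one sentence." },
        { type: "image_url", image_url: { url: "https://example.com/cat.png" } },
      ],
    },
  ],
});
console.log(reply.choices[0].message.content);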
I'd really love to get these models working in the browser, and this seems to be the tool for it.
Thanks!