
Getting vision models to work — Phi 3.5, Gemma 3 #727

@charlesLoder

Description


I really love the idea behind this project, but I'm struggling to get vision models to work properly. Judging from the other issues, my experience doesn't seem to be unique.

I'll explain everything I did and try to link to other issues in the hope of consolidating some information. See the questions at the bottom.

Apologies in advance for the length.

What I've tried

For each example, I cleared the cache between runs using the following (the original one-liner fired the deletions inside forEach without awaiting them, so this version awaits them all):

await Promise.all(
  (await window.caches.keys())
    .filter((k) => k.includes("webllm"))
    .map((k) => window.caches.delete(k))
);

1) Basic usage

Following the example outlined in the docs.

Code

import { MLCEngine } from "@mlc-ai/web-llm";

const modelName = "Phi-3.5-vision-instruct-q4f16_1-MLC";
const engine = new MLCEngine({
  initProgressCallback: (p) => {
    console.log(p);
  },
});

await engine.reload(modelName);

The cache populates fine, but then the following error occurs.

Error

ValueError: Cannot find parameter in cache: vision_embed_tokens.img_processor.vision_model.embeddings.position_embedding.q_weight
01b7e676:0x160422 [FATAL] /Users/cfruan/Documents/tvm/web/../src/runtime/relax_vm/ndarray_cache_support.cc:333: ValueError: Cannot find parameter in cache: vision_embed_tokens.img_processor.vision_model.embeddings.position_embedding.q_weight
overrideMethod @ hook.js:608
put_char @ @mlc-ai_web-llm.js?v=0336b5a7:4635
write @ @mlc-ai_web-llm.js?v=0336b5a7:4604
write @ @mlc-ai_web-llm.js?v=0336b5a7:5723
doWritev @ @mlc-ai_web-llm.js?v=0336b5a7:6275
_fd_write @ @mlc-ai_web-llm.js?v=0336b5a7:6287
$func1677 @ 01b7e676:0x160422
$func1684 @ 01b7e676:0x16099f
$func1685 @ 01b7e676:0x160a4c
$func1911 @ 01b7e676:0x16de87
$func1781 @ 01b7e676:0x16a360
$func1798 @ 01b7e676:0x16aca1
$func1799 @ 01b7e676:0x16acf3
$_ZN3tvm7runtime6detail12LogFatalImplERKNSt3__212basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEEiSA_ @ 01b7e676:0x20a86
$func20 @ 01b7e676:0x204bd
$func459 @ 01b7e676:0x6964b
$func458 @ 01b7e676:0x69269
$func1208 @ 01b7e676:0xe3df7
$TVMFuncCall @ 01b7e676:0x271f6
packedFunc @ @mlc-ai_web-llm.js?v=0336b5a7:8528
getParamsFromCacheByName @ @mlc-ai_web-llm.js?v=0336b5a7:7618
LLMChatPipeline @ @mlc-ai_web-llm.js?v=0336b5a7:13990
(anonymous) @ @mlc-ai_web-llm.js?v=0336b5a7:20167
fulfilled @ @mlc-ai_web-llm.js?v=0336b5a7:2240

index.tsx:98 ExitStatus {name: 'ExitStatus', message: 'Program terminated with exit(1)', status: 1}
overrideMethod @ hook.js:608
#init_model @ index.tsx:9


See also #640 for this same error. It also occurs on the https://chat.webllm.ai/ site.
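As a sanity check, the model_lib pinned in the config can be looked up and compared against what's cached. The ModelRecord shape and findModelLib helper below are my own sketch, not a web-llm API; with the real package you would pass prebuiltAppConfig.model_list from "@mlc-ai/web-llm" instead of the mock list.

```typescript
// Hypothetical helper: look up the model_lib URL registered for a
// model_id, so the wasm version (v0_2_48 vs v0_2_80) can be compared
// against the weight shards actually sitting in the cache.
interface ModelRecord {
  model_id: string;
  model_lib: string;
}

function findModelLib(list: ModelRecord[], id: string): string | undefined {
  return list.find((m) => m.model_id === id)?.model_lib;
}

// Mock entry standing in for the real prebuilt config:
const mockList: ModelRecord[] = [
  {
    model_id: "Phi-3.5-vision-instruct-q4f16_1-MLC",
    model_lib:
      "https://example.com/v0_2_48/Phi-3.5-vision-instruct-q4f16_1-ctx4k_cs2k-webgpu.wasm",
  },
];

console.log(findModelLib(mockList, "Phi-3.5-vision-instruct-q4f16_1-MLC"));
```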

2) Updating model_lib

My next attempt was to modify the model_lib, pointing to the v0_2_80 wasm instead of the v0_2_48 one used in the prebuilt config.

Code

import { MLCEngine, ModelType } from "@mlc-ai/web-llm";

const modelName = "Phi-3.5-vision-instruct-q4f16_1-MLC";

const engine = new MLCEngine({
  appConfig: {
    model_list: [
      {
        model: "https://huggingface.co/mlc-ai/Phi-3.5-vision-instruct-q4f16_1-MLC",
        model_id: modelName,
        model_type: ModelType.VLM,
        model_lib: "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/refs/heads/main/web-llm-models/v0_2_80/Phi-3.5-vision-instruct-q4f16_1-ctx4k_cs2k-webgpu.wasm",
      },
    ],
  },
  initProgressCallback: (p) => {
    console.log(p);
  },
});

await engine.reload(modelName);

Error

This error seems to be an issue with the wasm file itself:

LinkError: WebAssembly.instantiate(): Import #2 "env" "TVMFFIWasmSafeCall": function import requires a callable

See also #700 for this same error.
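For anyone debugging the same thing: the standard WebAssembly.Module.imports API lists the symbols a wasm binary expects, which makes it easy to see whether a model_lib wants TVMFFIWasmSafeCall and friends that the installed web-llm runtime doesn't provide. The sketch below uses a minimal empty module just to demonstrate the API; in practice you'd fetch the model_lib .wasm and pass its bytes.

```javascript
// Sketch: list the imports a wasm binary expects. A LinkError like
// 'env.TVMFFIWasmSafeCall ... requires a callable' suggests the wasm was
// compiled against a newer TVM runtime than the one bundled with the
// installed web-llm package.
async function listWasmImports(bytes) {
  const mod = await WebAssembly.compile(bytes);
  return WebAssembly.Module.imports(mod).map((i) => `${i.module}.${i.name}`);
}

// Minimal empty wasm module (magic number + version) with no imports;
// replace with the fetched model_lib bytes to inspect the real thing.
const emptyWasm = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
listWasmImports(emptyWasm).then((imports) => console.log(imports)); // → []
```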

3) Using Gemma-3

I also tried to use Gemma-3.

Code

import { MLCEngine, ModelType } from "@mlc-ai/web-llm";

const modelName = "gemma-3-1b-it-q4f16_1-MLC";

const engine = new MLCEngine({
  appConfig: {
    model_list: [
      {
        model: "https://huggingface.co/mlc-ai/gemma-3-1b-it-q4f16_1-MLC",
        model_id: modelName,
        model_type: ModelType.VLM,
        model_lib: "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/refs/heads/main/web-llm-models/v0_2_80/gemma3-1b-it-q4f16_1-ctx4k_cs1k-webgpu.wasm",
      },
    ],
  },
  initProgressCallback: (p) => {
    console.log(p);
  },
});

await engine.reload(modelName);

Error

This is the same error as above:

LinkError: WebAssembly.instantiate(): Import #2 "env" "TVMFFIWasmSafeCall": function import requires a callable

4) Compile models

Per the comment in #700, I compiled the models myself, which involved building tvm and mlc-llm from source.
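Roughly, the compile flow followed the usual three mlc_llm steps. The paths are placeholders and the --conv-template value is model-specific, so treat this as a sketch rather than the exact commands I ran:

```shell
# Sketch of the mlc_llm flow (placeholder paths; the --conv-template
# value depends on the model).
mlc_llm convert_weight ./gemma-3-1b-it --quantization q4f16_1 \
  -o ./dist/gemma-3-1b-it-q4f16_1-MLC
mlc_llm gen_config ./gemma-3-1b-it --quantization q4f16_1 \
  --conv-template <template> -o ./dist/gemma-3-1b-it-q4f16_1-MLC
mlc_llm compile ./dist/gemma-3-1b-it-q4f16_1-MLC/mlc-chat-config.json \
  --device webgpu -o ./libs/gemma-3-1b-it-q4f16_1-webgpu.wasm
```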

I tried pointing to my wasm on GitHub and my Hugging Face repo:

import { MLCEngine, ModelType } from "@mlc-ai/web-llm";

const modelName = "gemma-3-1b-it-q4f16_1-MLC";

const engine = new MLCEngine({
  appConfig: {
    model_list: [
      {
        model: "https://huggingface.co/charlesLoder/gemma-3-1b-it-q4f16_1-MLC",
        model_id: modelName,
        model_type: ModelType.VLM,
        model_lib: "https://raw.githubusercontent.com/charlesLoder/mlc-compiled-models/refs/heads/main/libs/gemma-3-1b-it-q4f16_1-webgpu.wasm",
      },
    ],
  },
  initProgressCallback: (p) => {
    console.log(p);
  },
});

await engine.reload(modelName);

And got basically the same error as before:

LinkError: WebAssembly.instantiate(): Import #0 "env" "TVMFFIEnvSetStream": function import requires a callable

The only differences are the import index (#0 instead of #2) and the symbol (TVMFFIEnvSetStream instead of TVMFFIWasmSafeCall).

Questions

  • I assumed that compiling the models after building tvm and mlc-llm from source would work. Did I do something wrong in how I compiled them? (I'm way out of my element doing that.)
  • Is the slightly different error significant?
  • Is there something else I'm missing?

I'd really love to get these models working in the browser, and this seems to be the tool for it.

Thanks!
