Description
I really love the idea behind this project, but I'm struggling to get vision models to work properly. Judging from the other issues, my experience doesn't seem to be unique.
I'll explain everything I did and link to related issues in the hope of consolidating some information. See the questions at the bottom.
Apologies in advance for the length.
What I've tried
For each example, I cleared the cache between runs using:
(await window.caches.keys()).forEach(async k => k.includes("webllm") && await window.caches.delete(k))

1) Basic usage
Following the example outlined in the docs.
Code
const modelName = "Phi-3.5-vision-instruct-q4f16_1-MLC";
const engine = new MLCEngine({
initProgressCallback: (p) => {
console.log(p);
}
});
await engine.reload(modelName);

The cache will populate fine, but then the following error occurs.
Error
ValueError: Cannot find parameter in cache: vision_embed_tokens.img_processor.vision_model.embeddings.position_embedding.q_weight
01b7e676:0x160422 [FATAL] /Users/cfruan/Documents/tvm/web/../src/runtime/relax_vm/ndarray_cache_support.cc:333: ValueError: Cannot find parameter in cache: vision_embed_tokens.img_processor.vision_model.embeddings.position_embedding.q_weight
overrideMethod @ hook.js:608
put_char @ @mlc-ai_web-llm.js?v=0336b5a7:4635
write @ @mlc-ai_web-llm.js?v=0336b5a7:4604
write @ @mlc-ai_web-llm.js?v=0336b5a7:5723
doWritev @ @mlc-ai_web-llm.js?v=0336b5a7:6275
_fd_write @ @mlc-ai_web-llm.js?v=0336b5a7:6287
$func1677 @ 01b7e676:0x160422
$func1684 @ 01b7e676:0x16099f
$func1685 @ 01b7e676:0x160a4c
$func1911 @ 01b7e676:0x16de87
$func1781 @ 01b7e676:0x16a360
$func1798 @ 01b7e676:0x16aca1
$func1799 @ 01b7e676:0x16acf3
$_ZN3tvm7runtime6detail12LogFatalImplERKNSt3__212basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEEiSA_ @ 01b7e676:0x20a86
$func20 @ 01b7e676:0x204bd
$func459 @ 01b7e676:0x6964b
$func458 @ 01b7e676:0x69269
$func1208 @ 01b7e676:0xe3df7
$TVMFuncCall @ 01b7e676:0x271f6
packedFunc @ @mlc-ai_web-llm.js?v=0336b5a7:8528
getParamsFromCacheByName @ @mlc-ai_web-llm.js?v=0336b5a7:7618
LLMChatPipeline @ @mlc-ai_web-llm.js?v=0336b5a7:13990
(anonymous) @ @mlc-ai_web-llm.js?v=0336b5a7:20167
fulfilled @ @mlc-ai_web-llm.js?v=0336b5a7:2240
index.tsx:98 ExitStatus {name: 'ExitStatus', message: 'Program terminated with exit(1)', status: 1}
overrideMethod @ hook.js:608
#init_model @ index.tsx:9
See also #640 for the same error. The error also occurs on the https://chat.webllm.ai/ site.
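As a sanity check on whether the published weights and the default model_lib have drifted apart, I also dumped the ModelRecord that web-llm resolves for this model from its built-in prebuiltAppConfig. This is just a small lookup sketch using the exported prebuiltAppConfig; the field names are whatever my installed version exposes:

import { prebuiltAppConfig } from "@mlc-ai/web-llm";

// Find the ModelRecord that engine.reload() will use for this model_id,
// to see which prebuilt wasm the weights are expected to pair with.
const record = prebuiltAppConfig.model_list.find(
  (m) => m.model_id === "Phi-3.5-vision-instruct-q4f16_1-MLC"
);
console.log("weights:", record?.model);
console.log("model_lib:", record?.model_lib);

In my case the resolved model_lib points at a v0_2_48 lib, which is what prompted attempt 2 below.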
2) Updating model_lib
The next attempt was to override the model_lib, pointing to v0_2_80 instead of the v0_2_48 used in the default config.
Code
const modelName = "Phi-3.5-vision-instruct-q4f16_1-MLC";
const engine = new MLCEngine({
appConfig: {
model_list: [{
model: "https://huggingface.co/mlc-ai/Phi-3.5-vision-instruct-q4f16_1-MLC",
model_id: modelName,
model_type: ModelType.VLM,
model_lib: "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/refs/heads/main/web-llm-models/v0_2_80/Phi-3.5-vision-instruct-q4f16_1-ctx4k_cs2k-webgpu.wasm",
}, ],
},
initProgressCallback: (p) => {
console.log(p);
},
});
await engine.reload(modelName);

Error
This error seems to be an issue with the wasm file itself:
LinkError: WebAssembly.instantiate(): Import #2 "env" "TVMFFIWasmSafeCall": function import requires a callable

See also #700 for the same error.
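To understand the LinkError a little better, I also listed the imports the wasm itself declares. This is plain WebAssembly API, nothing web-llm specific; the URL is the same model_lib as above:

// Fetch the model_lib and list the "env" functions it expects the host to
// provide. If the wasm was compiled against a newer TVM runtime than the one
// bundled with the installed web-llm package, imports such as
// TVMFFIWasmSafeCall have nothing to bind to and instantiation fails with the
// LinkError above.
const libUrl =
  "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/refs/heads/main/web-llm-models/v0_2_80/Phi-3.5-vision-instruct-q4f16_1-ctx4k_cs2k-webgpu.wasm";
const bytes = await (await fetch(libUrl)).arrayBuffer();
const module = await WebAssembly.compile(bytes);
for (const imp of WebAssembly.Module.imports(module)) {
  if (imp.module === "env") console.log(imp.kind, imp.name);
}

That at least shows which runtime symbols the lib expects, though I don't know how to map those back to a specific web-llm release.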
3) Using Gemma-3
I also tried to use Gemma-3.
Code
const modelName = "gemma-3-1b-it-q4f16_1-MLC";
const engine = new MLCEngine({
appConfig: {
model_list: [{
model: "https://huggingface.co/mlc-ai/gemma-3-1b-it-q4f16_1-MLC",
model_id: modelName,
model_type: ModelType.VLM,
model_lib: "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/refs/heads/main/web-llm-models/v0_2_80/gemma3-1b-it-q4f16_1-ctx4k_cs1k-webgpu.wasm",
}, ],
},
initProgressCallback: (p) => {
console.log(p);
},
});
await engine.reload(modelName);

Error
This is the same error as above:
LinkError: WebAssembly.instantiate(): Import #2 "env" "TVMFFIWasmSafeCall": function import requires a callable

4) Compile models
Per the comment in #700, I had to compile the models myself, which involved building tvm and mlc-llm from source.
- See the lib file on GitHub
- See the model on Hugging Face
I tried pointing to my GitHub file and Hugging Face repo:
const modelName = "gemma-3-1b-it-q4f16_1-MLC";
const engine = new MLCEngine({
appConfig: {
model_list: [{
model: "https://huggingface.co/charlesLoder/gemma-3-1b-it-q4f16_1-MLC",
model_id: modelName,
model_type: ModelType.VLM,
model_lib: "https://raw.githubusercontent.com/charlesLoder/mlc-compiled-models/refs/heads/main/libs/gemma-3-1b-it-q4f16_1-webgpu.wasm",
}, ],
},
initProgressCallback: (p) => {
console.log(p);
},
});
await engine.reload(modelName);

And got basically the same error as before:
LinkError: WebAssembly.instantiate(): Import #0 "env" "TVMFFIEnvSetStream": function import requires a callable

The only difference is #0 instead of #2.
Questions
- I assumed compiling the models after building tvm and mlc-llm from source would've worked. Did I do something wrong in how I compiled them? (I'm way out of my element doing that.)
- Is the slightly different error significant?
- Is there something else I'm missing?
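For context, this is the kind of call I'm ultimately hoping to make once a vision model loads, if I'm reading the OpenAI-style API correctly (the image URL is just a placeholder):

// An OpenAI-style chat completion with an image part.
const reply = await engine.chat.completions.create({
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Describe this image in one sentence." },
        { type: "image_url", image_url: { url: "https://example.com/cat.png" } },
      ],
    },
  ],
});
console.log(reply.choices[0].message.content);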
I'd really love to get these models working in the browser, and this seems to be the tool for it.
Thanks!