Conversation

@jiminha jiminha commented Oct 14, 2025

This PR optimizes Gemma3 multimodal memory usage and performance.

  • Bucket the vision tower based on the batch bucket to reduce recompilation overhead.
  • Modify merge_multimodal to use torch.where instead of masked_scatter to address a performance issue.
  • Add a multimodal bucket warmup to precompile the vision tower.
  • Port the PT_HPU_SDPA_QKV_SLICE_MODE_FWD feature from vllm-fork v0; this is necessary to reduce memory for longer sequence lengths.
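The torch.where-based merge in the second bullet can be sketched as follows. This is a minimal illustration, not the actual vllm-fork code: the function signature and variable names are assumptions, and `image_embeds` is assumed to be pre-expanded to the full sequence shape.

```python
import torch

def merge_multimodal(inputs_embeds: torch.Tensor,
                     image_embeds: torch.Tensor,
                     is_image_token: torch.Tensor) -> torch.Tensor:
    # Elementwise select: image features at image-token positions,
    # text features everywhere else. Unlike masked_scatter, torch.where
    # keeps shapes static, which avoids data-dependent scatter kernels
    # that are slow on accelerators such as HPU.
    return torch.where(is_image_token.unsqueeze(-1), image_embeds, inputs_embeds)

# Toy usage: a 6-token sequence with hidden size 4; tokens 1-2 are image tokens.
text_embeds = torch.zeros(6, 4)
image_embeds = torch.ones(6, 4)  # assumed pre-expanded to the full sequence shape
mask = torch.tensor([False, True, True, False, False, False])
merged = merge_multimodal(text_embeds, image_embeds, mask)
```

Because torch.where is a plain elementwise select over fixed shapes, it also composes cleanly with the bucketed warmup: every bucket shape can be precompiled once and reused.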

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.

@jiminha jiminha marked this pull request as draft October 14, 2025 14:16
jiminha and others added 6 commits October 15, 2025 11:54
Signed-off-by: Jimin Ha <[email protected]>
Reduces memory usage for long sequences by eliminating dual attention
mask creation. Improves capacity from 150 to 400 images with 8K prompts
by avoiding OOM issues.
Limitation: Only available when block_list is None.
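The memory saving for long sequences comes from the general technique of slicing attention along the query dimension, which can be sketched as below. This is a hedged sketch of the idea only: `sliced_sdpa` and its parameters are illustrative, not the actual PT_HPU_SDPA_QKV_SLICE_MODE_FWD implementation, and it handles only the non-causal, unmasked case.

```python
import torch
import torch.nn.functional as F

def sliced_sdpa(q, k, v, slice_size=128):
    # Process query chunks one at a time against the full K/V so the
    # [seq_len, seq_len] attention-score matrix is never materialized
    # in full; peak memory scales with slice_size rather than seq_len.
    outputs = []
    for start in range(0, q.shape[-2], slice_size):
        q_chunk = q[..., start:start + slice_size, :]
        outputs.append(F.scaled_dot_product_attention(q_chunk, k, v))
    return torch.cat(outputs, dim=-2)

# Usage: matches unsliced SDPA for the non-causal, unmasked case.
torch.manual_seed(0)
q = torch.randn(1, 2, 256, 64)
k = torch.randn(1, 2, 256, 64)
v = torch.randn(1, 2, 256, 64)
out = sliced_sdpa(q, k, v, slice_size=64)
ref = F.scaled_dot_product_attention(q, k, v)
```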

Signed-off-by: Jimin Ha <[email protected]>

🚧 CI Blocked

The main CI workflow was not started for the following reason:

This is a Draft PR. Please mark it as 'Ready for Review' to trigger the CI.

@jiminha jiminha marked this pull request as ready for review October 15, 2025 20:17
jiminha commented Oct 15, 2025

@xuechendi Could you review this? It includes model-file and utils changes that are necessary for the Gemma3 model optimization.
@adobrzyn Could you review this? It includes the multimodal warmup and adds the vision bucket to bucketing; it also ports the PT_HPU_SDPA_QKV_SLICE_MODE_FWD feature from V0.

✅ CI Passed

All checks passed successfully against the following vllm commit:
f57438338d819c8e3e7e70293281c575ebd77411

🚧 CI Blocked

The main CI workflow was not started for the following reason:

Your branch is behind the base branch. Please merge or rebase to get the latest changes.
