[DRAFT]: Adding save to gguf support for qwen2_vl #1904

Open · wants to merge 1 commit into main

Conversation

Captain-T2004

[DRAFT] GGUF Support for Qwen2 Vision Models

Feature Overview

This draft aims to provide direct GGUF export capability for vision fine-tunes, supporting all available Qwen2 Vision Models.

Expectations

  • Enables direct export of vision fine-tunes to GGUF format
  • Compatible with the complete range of Qwen2 Vision Models

Current Progress

  • Modifications to the save.py logic allow it to export two GGUF files for vision models directly by running the save_pretrained_to_gguf method: one file for the LLM part and one for the vision encoder (the mmproj file). A usage sketch is shown after this list.
  • Our qwen2-vl-surgery.py is a modified version of the original script from llama.cpp; it runs on the GPU instead of the CPU and generates the vision encoder (mmproj) file.
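
For reference, this is roughly how we expect the export to be invoked. This is a minimal sketch only: the fine-tune path, output directory, and quantization value are placeholders, and the exact signature of save_pretrained_to_gguf may differ from what this PR finally exposes.

```python
# Minimal usage sketch (paths and quantization are placeholders; the exact
# signature of save_pretrained_to_gguf is an assumption, not final API).
from unsloth import FastVisionModel

model, tokenizer = FastVisionModel.from_pretrained(
    "path/to/qwen2-vl-finetune",      # hypothetical local fine-tune
    load_in_4bit = False,
)

# Expected result: two GGUF files in the output directory, one for the
# LLM part and one for the vision encoder (mmproj).
model.save_pretrained_to_gguf(
    "qwen2_vl_gguf",                  # hypothetical output directory
    tokenizer,
    quantization_method = "f16",
)
```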

Current Issues

  • The LLM part, when tested with the original model's mmproj file, works correctly, suggesting that the LLM part is exported successfully.
  • When the LLM part is used with the extracted vision encoder (mmproj), it produces degenerate output such as "GGGGGGGGGGGGGGG........".
  • The original qwen2-vl-surgery.py runs out of RAM when executed directly; the custom qwen2-vl-surgery.py we have added works with the original models' safetensors.

What We Have Tried

  • Optimizing the original qwen2-vl-surgery.py to run on the GPU instead of the CPU to avoid running out of memory (see the sketch after this list).
  • Running qwen2-vl-surgery.py on different model formats such as .bin and safetensors.
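
Below is a rough sketch of the GPU change in our qwen2-vl-surgery.py variant: the model is loaded with its weights on the GPU, and each tensor is moved back to the CPU only as it is written out. The tensor renaming and the actual GGUF writing are omitted, so this is a simplification of the real script rather than a drop-in replacement.

```python
# Simplified sketch of the GPU-based surgery (renaming tensors to the
# mmproj naming convention and writing the GGUF file are omitted).
import torch
from transformers import Qwen2VLForConditionalGeneration

model_path = "Qwen/Qwen2-VL-2B-Instruct"  # or a local fine-tune directory
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_path,
    torch_dtype = torch.float16,
    device_map = "cuda",   # keep the full model on GPU instead of CPU RAM
)

# Only the vision tower is needed for the mmproj file.
visual_state = model.visual.state_dict()
for name, tensor in visual_state.items():
    # Move tensors back to CPU one at a time when writing them out,
    # so peak host RAM stays low.
    cpu_tensor = tensor.to("cpu", torch.float32)
    # ... rename `name` and add `cpu_tensor` to the GGUF writer ...
```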

Contributors

adityaghai07, Captain-T2004

@adityaghai07
Contributor

vdonchev helped out a lot in clearing up doubts about the vision encoders and the splitting of VLMs. I believe we have followed the correct approach for exporting VLMs to GGUF format.

The vision encoder (mmproj) file, when extracted directly from the original model (just run the surgery file without passing a model path, and it uses the original Qwen2-VL 2B model from Hugging Face), works well with the fine-tuned LLM part in GGUF format.

The possible issue: when the saved vision encoder is used to run the model, it produces warnings that many tensor weights are missing, and the outputs are garbage.

[Screenshot 2025-03-03 195153: warnings about missing tensor weights]
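
To narrow down which weights are missing, one quick check (not part of this PR) is to diff the tensor names of the original mmproj file against the one our surgery script produces, using the gguf Python package; the file paths below are placeholders.

```python
# Diagnostic sketch: compare tensor names between the original mmproj file
# and the one extracted by our surgery script (paths are placeholders).
from gguf import GGUFReader

def tensor_names(path: str) -> set[str]:
    return {t.name for t in GGUFReader(path).tensors}

original  = tensor_names("mmproj-original.gguf")
extracted = tensor_names("mmproj-extracted.gguf")

print("Missing from extracted:", sorted(original - extracted))
print("Unexpected in extracted:", sorted(extracted - original))
```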
