Open
Labels
⚡ PEFT: Related to PEFT
❓ question: Seeking clarification or more information
Description
Reproduction
I read the cookbook at https://huggingface.co/learn/cookbook/fine_tuning_vlm_trl and tried to implement it myself. However, the model usually seems to converge to a local optimum.
The cookbook uses this data format:
return {
    "images": [sample["image"]],
    "messages": [
        {
            "role": "system",
            "content": [{"type": "text", "text": system_message}],
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "image": sample["image"],
                },
                {
                    "type": "text",
                    "text": sample["query"],
                },
            ],
        },
        {
            "role": "assistant",
            "content": [{"type": "text", "text": sample["label"][0]}],
        },
    ],
}
I then found the SFT trainer documentation (https://huggingface.co/docs/trl/en/sft_trainer#training-vision-language-models) and realized that the prompt and completion apparently need to be in two separate fields for training to work. I switched to the following format, and it worked:
images = [sample["image"].resize((512, 512))]
prompt = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": sample["image"],
            },
            {
                "type": "text",
                "text": sample['prompt'],
            }
        ],
    },
]
completion = [{
    "role": "assistant",
    "content": sample["completion"],
}]
return {
    "images": images,
    "prompt": prompt,
    "completion": completion,
}
Is this intended, or is it just a version difference? I am training Qwen-2.5-VL-3B.
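For reference, the difference between the two layouts can be sketched as a small conversion step. This is only an illustration, not TRL code: `split_conversation` is a hypothetical helper name, the sample values are made up, and a placeholder string stands in for the PIL image so the snippet runs without extra dependencies. It assumes the conversation ends with a single assistant turn and treats everything before that turn (including the system message) as the prompt.

```python
from typing import Any


def split_conversation(example: dict[str, Any]) -> dict[str, Any]:
    """Split a "messages"-style example into "prompt" and "completion".

    Hypothetical helper (not part of TRL): every turn before the final
    assistant turn becomes the prompt, and the final assistant turn
    becomes the completion. The "images" field is passed through as-is.
    """
    messages = example["messages"]
    if not messages or messages[-1]["role"] != "assistant":
        raise ValueError("expected the conversation to end with an assistant turn")
    return {
        "images": example["images"],
        "prompt": messages[:-1],
        "completion": messages[-1:],
    }


# Made-up sample; "<image>" is a placeholder for a real PIL image.
example = {
    "images": ["<image>"],
    "messages": [
        {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
        {
            "role": "user",
            "content": [
                {"type": "image", "image": "<image>"},
                {"type": "text", "text": "What is shown in the chart?"},
            ],
        },
        {"role": "assistant", "content": [{"type": "text", "text": "A bar chart."}]},
    ],
}

converted = split_conversation(example)
print([m["role"] for m in converted["prompt"]])
print([m["role"] for m in converted["completion"]])
```

With this split, the first format and the second format carry the same turns; the only difference is whether the assistant answer sits in the same list as the prompt or in its own field.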
System Info
- Platform: Linux-6.1.123+-x86_64-with-glibc2.35
- Python version: 3.12.11
- TRL version: 0.24.0.dev0
- PyTorch version: 2.8.0+cu126
- accelerator(s): NVIDIA A100-SXM4-40GB
- Transformers version: 4.56.1
- Accelerate version: 1.10.1
- Accelerate config: not found
- Datasets version: 4.0.0
- HF Hub version: 0.34.4
- bitsandbytes version: 0.47.0
- DeepSpeed version: not installed
- Diffusers version: 0.35.1
- Liger-Kernel version: not installed
- LLM-Blender version: not installed
- OpenAI version: 1.106.1
- PEFT version: 0.17.1
- vLLM version: not installed
Checklist
- I have checked that my issue isn't already filed (see open issues)
- I have included my system information
- Any code provided is minimal, complete, and reproducible (more on MREs)
- Any code provided is properly formatted in code blocks, (no screenshot, more on code blocks)
- Any traceback provided is complete