Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/support qwenvl glm4-v (tested) #4377

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

marko1616
Copy link
Contributor

@marko1616 marko1616 commented Jun 19, 2024

What does this PR do?

Fixes #4375

Before submitting

@hiyouga hiyouga added the pending This problem is yet to be addressed label Jun 19, 2024
@marko1616
Copy link
Contributor Author

终于还差一个image的padding处理就能做好训练支持了。

@marko1616
Copy link
Contributor Author

@hiyouga 改的比较多捏,有空帮忙看看这个实现思路行不行。谢谢。

Image.fromarray(image).convert("RGB").save(image_path)
messages[-1]["content"] = template.format_image.apply(content=os.fspath(image_path))[0] + messages[-1]["content"]
elif image is not None and model_args.visual_inputs_type == "vision_message_embed":
messages[-1]["content"] = template.format_image.apply()[0] + messages[-1]["content"]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

如果不是内嵌在文本的image url默认放在最后一个的开头(Qwenvl如果不是开头效果不好)

if model_args.visual_inputs_type == "vision_message_embed":
dataset = dataset.rename_column("image_inputs","images")
print(dataset["images"])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dataset.map不能重用删除的column_name

transforms.Normalize((0.48145466, 0.4578275, 0.40821073), (0.26862954, 0.26130258, 0.27577711)),
]
)
return transform(images[0]) if len(images) != 0 else transform(Image.new("RGB", (1120, 1120), (255, 255, 255)))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

数据集加载

if model_args.visual_inputs and finetuning_args.freeze_vision_tower:
target_modules = "^(?!.*vision_tower).*(?:{}).*".format("|".join(target_modules))
if model_args.visual_inputs and finetuning_args.freeze_vision:
target_modules = f"^(?!.*{VISION_FREEZE_MAP[model_args.visual_inputs_type]})."+"*(?:{}).*".format("|".join(target_modules))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

其实还有点小问题,可能会把GLM4的视觉模块附加了

@marko1616
Copy link
Contributor Author

成功跑了训练。

@marko1616 marko1616 changed the title Feature/support qwenvl glm4-v *WORKING DO NOT MERGE* Feature/support qwenvl glm4-v (tested) Jun 23, 2024
@BUAADreamer BUAADreamer self-requested a review June 28, 2024 17:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pending This problem is yet to be addressed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature request] 支持Qwen-VL
3 participants