Add multimodal preprocessing pipeline #2

Merged
Adaozuishuai merged 1 commit into main from codex/fine-tune-qwen-7b-with-lora-and-rag
Nov 16, 2025

@Adaozuishuai
Owner

Summary

  • add reusable configuration and preprocessing modules for images, text, and structured data
  • provide a CLI (preprocess_dataset.py) to convert JSONL manifests into normalized tensors and prompts
  • document the alignment workflow, configuration, and outputs in the README
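For readers without the diff open, the manifest-to-tensors-and-prompts flow described above could look roughly like the sketch below. It is illustrative only, not the code in this PR: `PreprocessConfig`, `normalize_pixels`, `build_prompt`, `preprocess_manifest`, and the manifest keys (`pixels`, `text`, `structured`) are all hypothetical names, and an inline pixel list stands in for a decoded image so the example runs with the standard library alone.

```python
import json
from dataclasses import dataclass


@dataclass
class PreprocessConfig:
    # Hypothetical settings; the PR's real configuration fields are not shown here.
    pixel_mean: float = 0.5
    pixel_std: float = 0.5
    prompt_template: str = "{text}\nAttributes: {attributes}"


def normalize_pixels(pixels, config):
    """Scale 0-255 values to [0, 1], then standardize with the configured mean and std."""
    return [((p / 255.0) - config.pixel_mean) / config.pixel_std for p in pixels]


def build_prompt(text, structured, config):
    """Combine free text and structured fields into one prompt string."""
    # Sort keys so the prompt is deterministic regardless of JSON key order.
    attributes = ", ".join(f"{key}={structured[key]}" for key in sorted(structured))
    return config.prompt_template.format(text=text.strip(), attributes=attributes)


def preprocess_manifest(lines, config):
    """Turn JSONL manifest lines into normalized pixel values plus prompts."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the manifest
        entry = json.loads(line)
        records.append(
            {
                "pixels": normalize_pixels(entry["pixels"], config),
                "prompt": build_prompt(entry["text"], entry.get("structured", {}), config),
            }
        )
    return records


# Assumed manifest shape: one JSON object per line.
manifest = [
    '{"pixels": [0, 255], "text": " a red cube ", "structured": {"size": 2, "color": "red"}}',
    "",
]
records = preprocess_manifest(manifest, PreprocessConfig())
```

Keeping the normalization constants and the prompt template in one config object is the design choice this sketch is meant to show: the image, text, and structured-data steps read the same settings, so a change there applies to every modality.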

Testing

  • python -m compileall preprocess_dataset.py preprocessing
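Note that `compileall` only confirms the files byte-compile, so it catches syntax errors but not failures that surface at import or run time. The sketch below shows that distinction using the standard library's `compileall.compile_file` and `compileall.compile_dir`; the helper `compiles_cleanly` and the temporary files are hypothetical and are not part of this PR.

```python
import compileall
import pathlib
import tempfile


def compiles_cleanly(path):
    """Return True if a file, or every file in a directory, byte-compiles."""
    path = pathlib.Path(path)
    if path.is_dir():
        return bool(compileall.compile_dir(str(path), quiet=1))
    return bool(compileall.compile_file(str(path), quiet=1))


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = pathlib.Path(tmp)

    good = tmp_path / "good.py"
    good.write_text("x = 1\n")

    # A syntax error is caught by byte-compilation.
    bad = tmp_path / "bad.py"
    bad.write_text("def broken(:\n")

    # A missing dependency is NOT caught: the file compiles, but importing it would fail.
    missing_dep = tmp_path / "missing_dep.py"
    missing_dep.write_text("import module_that_does_not_exist\n")

    good_ok = compiles_cleanly(good)
    bad_ok = compiles_cleanly(bad)
    missing_dep_ok = compiles_cleanly(missing_dep)
```

Here `good_ok` and `missing_dep_ok` are true while `bad_ok` is false, which is why a compile-only check is best read as a syntax smoke test rather than functional coverage.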

@Adaozuishuai Adaozuishuai merged commit ddffec0 into main Nov 16, 2025
2 checks passed