
Conversation

mlinmg
Contributor

@mlinmg mlinmg commented Mar 31, 2025

FIX #13251
FIX #13317
FIX #13441
FIX #14346

With this PR I want to add the Ovis architecture to vLLM, continuing the discussion at AIDC-AI/Ovis#70.


👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small, essential subset of tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

@mergify mergify bot added the frontend label Mar 31, 2025
@mlinmg mlinmg force-pushed the Ovis-model-addition branch from 5859227 to 96f0c10 Compare March 31, 2025 15:50
@mergify mergify bot added documentation Improvements or additions to documentation ci/build v1 labels Mar 31, 2025
@mlinmg mlinmg force-pushed the Ovis-model-addition branch from 96f0c10 to 1976b0d Compare March 31, 2025 15:55
@Isotr0py
Member

I think you can create an aimv2.py file under vllm/model_executor/models and put them there (just like clip.py, since this is the ViT part).

@mlinmg
Contributor Author

mlinmg commented Mar 31, 2025

OK, cool. Also, since they use the Qwen2 image preprocessor and I can't modify it to accept new kwargs, how would you expose the mm_kwargs to the image_processor call? @JumpingRain, how have you implemented it?

@mlinmg
Contributor Author

mlinmg commented Mar 31, 2025

I've added the config and the processor classes to the modeling file, since they are not present in transformers.

@Isotr0py
Member

It seems some of these changes have already landed on main. Can you try to rebase the PR onto the main branch?

@JumpingRain

@mlinmg @Isotr0py Hello, I'm currently using the following method to set max_partition as an initialization parameter for OvisProcessor:

import PIL.Image
from transformers.processing_utils import ProcessorMixin


class OvisProcessor(ProcessorMixin):
    attributes = ["image_processor", "tokenizer"]
    valid_kwargs = ["chat_template"]

    image_processor_class = "AutoImageProcessor"
    tokenizer_class = ("Qwen2Tokenizer", "Qwen2TokenizerFast")

    def __init__(self, image_processor=None, tokenizer=None, chat_template=None, **kwargs):
        self.image_token = "<|image_pad|>" if not hasattr(tokenizer, "image_token") else tokenizer.image_token
        self.video_token = "<|video_pad|>" if not hasattr(tokenizer, "video_token") else tokenizer.video_token
        # Instance-level defaults, overridable at init time via kwargs.
        self.max_partition = kwargs.get('max_partition', 9)
        self.covering_threshold = kwargs.get('covering_threshold', 0.9)
        self.convert_to_rgb = kwargs.get('convert_to_rgb', True)
        self.return_tensors = kwargs.get('return_tensors', 'pt')
        super().__init__(image_processor, tokenizer, chat_template=chat_template, **kwargs)

    def preprocess_image(self, image: PIL.Image.Image, max_partition=None, covering_threshold=None, convert_to_rgb=None, return_tensors=None):
        # Per-call arguments take precedence over the instance-level defaults.
        max_partition = max_partition if max_partition is not None else self.max_partition
        covering_threshold = covering_threshold if covering_threshold is not None else self.covering_threshold
        convert_to_rgb = convert_to_rgb if convert_to_rgb is not None else self.convert_to_rgb
        return_tensors = return_tensors if return_tensors is not None else self.return_tensors
        # other code
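
For context, a hypothetical per-call override (assuming processor is an instantiated OvisProcessor; the image path is illustrative) would look like:

from PIL import Image

image = Image.open("example.jpg")
# Override the init-time default (max_partition=9) for this call only.
patches = processor.preprocess_image(image, max_partition=16)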

@runninglsy

Thanks for the valuable contributions to Ovis. I kindly suggest using 'ovis2' or 'Ovis2' for the model_type and in the code, to help differentiate it from previous versions like Ovis, Ovis1.5, and Ovis1.6. This would also facilitate future versioning, such as Ovis2.X or Ovis3.

@mlinmg
Contributor Author

mlinmg commented Apr 1, 2025

Thanks for the valuable contributions to Ovis. I kindly suggest using 'ovis2' or 'Ovis2' for the model_type and in the code, to help differentiate it from previous versions like Ovis, Ovis1.5, and Ovis1.6. This would also facilitate future versioning, such as Ovis2.X or Ovis3.

I've modified it, but you'll also need to modify the configuration/processing files and the auto-config registration to use the corrected naming (Ovis2).
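
For illustration, a minimal sketch of what the renamed auto-config registration could look like (the Ovis2Config class here is a hypothetical stub, not the PR's actual file):

from transformers import AutoConfig, PretrainedConfig

class Ovis2Config(PretrainedConfig):
    # Hypothetical minimal config; the real class carries the full
    # set of Ovis2 model fields.
    model_type = "ovis2"

# Register so AutoConfig.from_pretrained can resolve "ovis2" checkpoints.
AutoConfig.register("ovis2", Ovis2Config)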


mergify bot commented Apr 1, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @mlinmg.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Apr 1, 2025
@mlinmg
Contributor Author

mlinmg commented Apr 2, 2025 via email

@mlinmg mlinmg force-pushed the Ovis-model-addition branch from 23476d8 to 64985c1 Compare April 2, 2025 09:30
@mergify mergify bot removed the needs-rebase label Apr 2, 2025
@JumpingRain

@mlinmg @Isotr0py Thank you for your outstanding work; it seems Ovis is on track to integrate smoothly with vLLM! Is there anything I can assist with on my end?

Additionally, you previously mentioned the need to modify the Ovis Hugging Face files. Since we want the released weights to remain compatible with the historical code, we hope to achieve the Ovis HF file modification with minimal changes. In my local tests, I found this can be accomplished by modifying config.json and the tokenizer config. Could you specify which parts of the Ovis HF code need to be modified to support vLLM? I can make the necessary adjustments quickly.

@Isotr0py
Member

In my local tests, I found this can be accomplished by modifying config.json and the tokenizer config. Could you specify which parts of the Ovis HF code need to be modified to support vLLM? I can make the necessary adjustments quickly.

Currently, this PR requires passing a modified tokenizer hosted on HF (tokenizer = "Isotr0py/Ovis2-tokenizer") when initializing the vLLM instance, so that users don't need to make any modifications locally. But it would be best if it could be upstreamed to the Ovis2 model repos!
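
For example, a minimal sketch of such an initialization (the model checkpoint name is illustrative):

from vllm import LLM

# Point vLLM at the modified tokenizer so no local changes are needed.
llm = LLM(
    model="AIDC-AI/Ovis2-1B",
    tokenizer="Isotr0py/Ovis2-tokenizer",
    trust_remote_code=True,
)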

Signed-off-by: Isotr0py <[email protected]>
Signed-off-by: Isotr0py <[email protected]>

mergify bot commented Apr 24, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @mlinmg.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Apr 24, 2025
@mergify mergify bot removed the needs-rebase label Apr 24, 2025
Signed-off-by: Isotr0py <[email protected]>
@sukkritsharmaofficial

Eagerly waiting for this support, @Isotr0py @DarkLight1337. Let me know if I can help in any way.

Member

@Isotr0py Isotr0py left a comment


Tests still pass locally on my side, so let's move this forward!

@Isotr0py Isotr0py enabled auto-merge (squash) April 29, 2025 16:35
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Apr 29, 2025
@Isotr0py Isotr0py merged commit 54072f3 into vllm-project:main Apr 30, 2025
63 checks passed
radeksm pushed a commit to radeksm/vllm that referenced this pull request May 2, 2025
Signed-off-by: Marco <[email protected]>
Signed-off-by: Isotr0py <[email protected]>
Signed-off-by: isotr0py <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025
Signed-off-by: Marco <[email protected]>
Signed-off-by: Isotr0py <[email protected]>
Signed-off-by: isotr0py <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Signed-off-by: Mu Huai <[email protected]>
zzzyq pushed a commit to zzzyq/vllm that referenced this pull request May 24, 2025
Signed-off-by: Marco <[email protected]>
Signed-off-by: Isotr0py <[email protected]>
Signed-off-by: isotr0py <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Co-authored-by: Isotr0py <[email protected]>
Signed-off-by: Yuqi Zhang <[email protected]>