The embeddings from the Hugging Face model and the local model are not the same. #109

@BenTzuHsien

Description

Hello,

I encountered a discrepancy when comparing the image embeddings extracted with sam2_predictor.get_image_embedding() after running detection with the Hugging Face model versus the local Grounding DINO model.

Specifically:

  • I followed the demo script to load the models
  • Then I ran detection using either:
    • the local model from this repo (GroundingDINO_SwinT_OGC.py)
    • or the Hugging Face model (IDEA-Research/grounding-dino-tiny)
  • After that, I extracted the image embedding using sam2_predictor.get_image_embedding()
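To make the expectation concrete: if the image encoder is a deterministic function of the input image alone, the choice of detector should not influence the embedding at all. A toy sketch of that invariance, using a hypothetical stand-in encoder in pure NumPy rather than the real SAM 2 API:

```python
import numpy as np

def toy_image_encoder(image: np.ndarray) -> np.ndarray:
    """Stand-in for an image encoder (hypothetical, not the real SAM 2 model):
    a fixed linear projection, deterministic in the image only."""
    rng = np.random.default_rng(0)               # fixed "weights"
    w = rng.standard_normal((image.size, 8))
    return image.reshape(-1) @ w

# One image, shape (C, H, W)
image = np.arange(3 * 4 * 4, dtype=np.float64).reshape(3, 4, 4)

# Two different detectors return different boxes for the same image...
boxes_local = [(0, 0, 2, 2)]   # e.g. from the local Grounding DINO
boxes_hf = [(1, 1, 3, 3)]      # e.g. from the Hugging Face model

# ...but the embedding depends only on the image, never on the boxes.
emb_a = toy_image_encoder(image)
emb_b = toy_image_encoder(image)
print(np.allclose(emb_a, emb_b))  # True
```

Under this model, any embedding difference would have to come from the image actually fed to the encoder (e.g. different preprocessing in the two pipelines), not from the boxes themselves.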

Despite using the same image and prompt, the resulting embeddings are significantly different.

Is this expected behavior? Shouldn't the get_image_embedding() function produce the same output as long as the input image remains the same, regardless of which detection model provided the boxes?

Any insights into what may be causing this difference would be appreciated.
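For anyone reproducing this, a small diagnostic helper can quantify how far apart the two embeddings are (hypothetical helper name; assumes both embeddings have been converted to same-shape NumPy arrays):

```python
import numpy as np

def embedding_gap(a, b):
    """Summarize the difference between two embeddings:
    max absolute elementwise difference and cosine similarity."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return {"max_abs_diff": float(np.max(np.abs(a - b))), "cosine": cos}

# Identical embeddings: zero gap, cosine similarity ~1
print(embedding_gap([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

A cosine near 1 with small absolute differences would suggest mere numerical noise (precision, device), while a genuinely low cosine would point to a different image tensor reaching the encoder.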

Thanks!
