The embeddings from the Hugging Face model and the local model are not the same. #109

@BenTzuHsien

Description

Hello,

I encountered a discrepancy when comparing the image embeddings extracted with sam2_predictor.get_image_embedding() after running detection with the Hugging Face model versus the local Grounding DINO model.

Specifically:

  • I followed the demo script to load the models
  • Then I ran detection using either:
    • the local model from this repo (GroundingDINO_SwinT_OGC.py)
    • or the Hugging Face model (IDEA-Research/grounding-dino-tiny)
  • After that, I extracted the image embedding using sam2_predictor.get_image_embedding()
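To make the expectation concrete: if the image encoder is a deterministic function of the input image alone, the choice of detector should not influence the embedding at all. A toy sketch of that invariance, using a hypothetical stand-in encoder in pure NumPy rather than the real SAM 2 API:

```python
import numpy as np

def toy_image_encoder(image: np.ndarray) -> np.ndarray:
    """Stand-in for an image encoder (hypothetical, not the real SAM 2 model):
    a fixed linear projection, deterministic in the image only."""
    rng = np.random.default_rng(0)               # fixed "weights"
    w = rng.standard_normal((image.size, 8))
    return image.reshape(-1) @ w

# One image, shape (C, H, W)
image = np.arange(3 * 4 * 4, dtype=np.float64).reshape(3, 4, 4)

# Two different detectors return different boxes for the same image...
boxes_local = [(0, 0, 2, 2)]   # e.g. from the local Grounding DINO
boxes_hf = [(1, 1, 3, 3)]      # e.g. from the Hugging Face model

# ...but the embedding depends only on the image, never on the boxes.
emb_a = toy_image_encoder(image)
emb_b = toy_image_encoder(image)
print(np.allclose(emb_a, emb_b))  # True
```

Under this model, any embedding difference would have to come from the image actually fed to the encoder (e.g. different preprocessing in the two pipelines), not from the boxes themselves.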

Despite using the same image and prompt, the resulting embeddings are significantly different.

Is this expected behavior? Shouldn't the get_image_embedding() function produce the same output as long as the input image remains the same, regardless of which detection model provided the boxes?

Any insights into what may be causing this difference would be appreciated.
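For anyone reproducing this, a small diagnostic helper can quantify how far apart the two embeddings are (hypothetical helper name; assumes both embeddings have been converted to same-shape NumPy arrays):

```python
import numpy as np

def embedding_gap(a, b):
    """Summarize the difference between two embeddings:
    max absolute elementwise difference and cosine similarity."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return {"max_abs_diff": float(np.max(np.abs(a - b))), "cosine": cos}

# Identical embeddings: zero gap, cosine similarity ~1
print(embedding_gap([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

A cosine near 1 with small absolute differences would suggest mere numerical noise (precision, device), while a genuinely low cosine would point to a different image tensor reaching the encoder.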

Thanks!
