Additional Zero Shot Models #171

Open
mkammes opened this issue Apr 3, 2024 · 1 comment
mkammes commented Apr 3, 2024

Is your feature request related to a problem? Please describe.
No.

Describe the solution you'd like
Additional zero-shot models, such as Grounding DINO, and maybe Detectron2 or Segment Anything. Grounding DINO, which is promptable, would be the most useful.

Describe alternatives you've considered
n/a

Additional context
The Grounding DINO model is promptable and apparently scores higher than CLIP.

mkammes added the enhancement (New feature or request) label Apr 3, 2024
Owner

octimot commented Apr 11, 2024

Hey there!

I think Segment Anything / Grounding DINO create more restrictive embeddings due to their promptable nature (more focused training data). In other words, CLIP on its own lets you search using more "obscure" language, while the others might be restricted to more common words (car, sky, bird, face, etc.).
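
For readers unfamiliar with how CLIP-style search works, here is a minimal sketch of the idea: the text query and the images are embedded into the same vector space, and results are ranked by cosine similarity. This is only an illustration, not the project's actual code. The 3-dimensional vectors, the frame names, and the `search` helper are all made up for the example; real CLIP embeddings have hundreds of dimensions and come from the model's text and image encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical image embeddings (toy values, NOT real CLIP outputs).
frame_embeddings = {
    "frame_001": [0.9, 0.1, 0.0],
    "frame_002": [0.1, 0.9, 0.1],
    "frame_003": [0.0, 0.2, 0.9],
}


def search(query_embedding, frames, top_k=2):
    """Rank frames by similarity to the query embedding, best first."""
    scored = [(name, cosine(query_embedding, emb)) for name, emb in frames.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embedding of a free-text query (toy values as well).
query = [0.2, 0.8, 0.1]
results = search(query, frame_embeddings)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Because the query is just a vector, any phrase the text encoder can embed can be searched for, which is the flexibility described above. A detector restricted to a fixed or narrower vocabulary would not slot into this kind of ranking in the same way.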

We're preparing an update that also supports GPT-Vision and LLaVA-like models, which would let you ingest and prompt directly too.

But, I'll take a look at these too ASAP!

Cheers

octimot self-assigned this Apr 11, 2024