Incorrect docstring of get_anyres_image_grid_shape
#31588
Comments
@DarkLight1337 Would you like to open a PR to fix this? cc @zucchini-nlp to confirm, as I think this was raised elsewhere and there's a double inversion happening (?)
After looking at the code a bit more, I am now more confused. It seems that the LLaVA-NeXT model treats it as [...]
Hey! Yes, this issue has been noticed by several people, and I can confirm that our implementation matches the original LLaVa-NeXT exactly. There are naming discrepancies between the two, which is confusing, but it all comes from the way it's done in the original repo. If we go by what I understand to be the correct convention, though, then there is a "bug" in both implementations, because LLaVa-NeXT treats it as [...] I raised a question with the LLaVa authors a week ago and haven't gotten a reply yet, so I wouldn't change anything in [...] Hope this clarifies it a bit ;)
Thanks for the clarification! Let's wait until the authors respond then.
Upon inspecting the source code, the image_size tuple should be in the form (height, width) instead of (width, height).
transformers/src/transformers/models/llava_next/modeling_llava_next.py
Line 52 in aab0829
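To illustrate why the documented order matters, here is a minimal sketch. It is not the library's code: grid_shape_hw is a hypothetical helper that assumes its input is (height, width) and simply divides by the patch size. A caller who follows a (width, height) docstring would get a transposed grid for any non-square image.

```python
from typing import Tuple


def grid_shape_hw(image_size: Tuple[int, int], patch_size: int) -> Tuple[int, int]:
    """Hypothetical helper (not from transformers).

    Assumes image_size is (height, width) and returns (rows, cols) of patches.
    """
    height, width = image_size
    return height // patch_size, width // patch_size


# An image that is 672 pixels high and 336 pixels wide, with 336-pixel patches.
as_code_expects = grid_shape_hw((672, 336), 336)   # passed as (height, width)
as_doc_suggests = grid_shape_hw((336, 672), 336)   # passed as (width, height)

print(as_code_expects)  # (2, 1): two rows, one column
print(as_doc_suggests)  # (1, 2): transposed grid
```

For square images both orders give the same result, which may be why a mismatch like this is easy to miss.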