
Implementation of FILIP embedding model includes padding vectors in similarity computation #31

Open
hyojeongyunn opened this issue Feb 19, 2025 · 0 comments


Hello, and thank you for your work on this repository.

I have a question regarding the implementation of the FILIP embedding model in this repository.

In the original FILIP paper, it is mentioned that padding vectors are excluded from similarity computation to prevent performance degradation.

"Unlike Khattab & Zaharia (2020), we discard the padded tokens and use average instead of summation of token-wise maximum similarities when computing the image-text alignment, which enhances the cross-modal representation learning and stabilizes training."

However, based on my reading of the code here, padding vectors appear to be included in the similarity calculation.
The implementation performs top-k selection in the `get_weighted_dense_logits` function of the FILIP model.
If the top-k value (an input argument of `get_weighted_dense_logits`) is larger than the number of non-padding tokens in a text/image sample, padding vectors can enter the similarity calculation.
And in general, selecting the top-k vectors is not equivalent to dropping the padded tokens.

https://github.com/Sense-GVT/DeCLIP/blob/main/experiments/filip_experiments/yfcc15m/yfcc15m_vit_filip/config.yaml#L22
https://github.com/Sense-GVT/DeCLIP/blob/main/prototype/model/filip.py#L71-L106
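To make the concern concrete, here is a minimal numpy sketch of the two aggregation strategies. This is not the repository's code; the zero-valued padding embeddings, the hand-written mask, and the choice of `k` are illustrative assumptions, chosen only to show that top-k averaging and pad-masked averaging can disagree once `k` exceeds the number of real tokens.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
img = rng.normal(size=(4, d))        # 4 image patch embeddings
txt = rng.normal(size=(3, d))        # 3 real text token embeddings
pad = np.zeros((2, d))               # 2 padding slots (assumed zero vectors)
txt_padded = np.vstack([txt, pad])   # 5 token slots total

# Token-wise maximum similarity: for each text token slot,
# take the max similarity over all image patches.
sim = txt_padded @ img.T             # shape (5, 4)
max_per_token = sim.max(axis=1)      # shape (5,)

# Top-k averaging (as in the top-k selection the issue refers to):
# with k=4 > 3 real tokens, at least one padding slot is selected.
k = 4
topk_avg = np.sort(max_per_token)[-k:].mean()

# Pad-masked averaging, as described in the FILIP paper:
# discard padded tokens, average only over real tokens.
valid = np.array([True, True, True, False, False])
masked_avg = max_per_token[valid].mean()

print(topk_avg, masked_avg)  # the two aggregates generally differ
```

With random embeddings the two averages almost surely disagree, which is exactly the gap the issue is pointing out: top-k selection only coincides with pad masking when `k` happens to equal the number of real tokens in every sample.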

I would like to confirm whether my understanding is correct. If padding vectors are indeed included in the similarity computation, could you clarify the reason behind this design choice?

Thank you for your time and support!
