
The function in Class: LocalAggregationLossModule #1

Open
sudalvxin opened this issue Nov 20, 2019 · 6 comments

Comments

@sudalvxin

I cannot understand the formula in the function `_softmax`. What is the value 2876934.2?

@chengxuz
Contributor

Hi Zhanxuan,

Thanks for your question. We should have made this clearer in the code. In the `_softmax` function, `Z` serves as the normalization factor of the non-parametric softmax formulation: under that framework, Z = \sum_{i=1}^{N} \exp(v_i^\top v / \tau). Since this framework was first introduced in the Instance Discrimination paper (the IR method in our paper), we followed the source code of that paper for the implementation. In their implementation, `Z` is computed only once, at the beginning, from the initial weights, and then kept fixed. The magic number 2876934.2 is the `Z` computed from those initial weights. Because this number is proportional to the number of data points, we additionally rescale it by `data_len` in the function.
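For concreteness, here is a minimal sketch of that computation (illustrative only, not the exact code from this repo; `memory_bank`, `tau`, and `data_len` are assumed names, and `tau=0.07` follows the usual IR default):

```python
import torch

# Illustrative sketch: non-parametric softmax with a fixed normalization
# constant Z, following the Instance Discrimination implementation.

Z_INIT = 2876934.2      # Z computed once from the initial memory bank
DATA_LEN_REF = 1281167  # dataset size the magic number was computed for

def nonparam_softmax(v, memory_bank, tau=0.07, data_len=DATA_LEN_REF):
    """Return p_i = exp(v_i^T v / tau) / Z for every memory-bank entry v_i.

    Z is never recomputed per batch; the precomputed constant is rescaled
    by data_len because Z grows in proportion to the dataset size.
    """
    sims = memory_bank @ v                # v_i^T v for all i, shape (N,)
    z = Z_INIT / DATA_LEN_REF * data_len  # rescale the fixed Z
    return torch.exp(sims / tau) / z
```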

That said, this `Z` value does not actually influence our loss: because our loss is a conditional probability, `Z` cancels out in the end.
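Sketching the cancellation (notation is illustrative; A and B denote index sets appearing in the loss):

```latex
% P(A | v) sums the non-parametric softmax over a set A, so in a ratio
% of two such probabilities the fixed Z cancels:
\[
\frac{P(A \mid v)}{P(B \mid v)}
  = \frac{\frac{1}{Z}\sum_{i \in A} \exp(v_i^\top v / \tau)}
         {\frac{1}{Z}\sum_{i \in B} \exp(v_i^\top v / \tau)}
  = \frac{\sum_{i \in A} \exp(v_i^\top v / \tau)}
         {\sum_{i \in B} \exp(v_i^\top v / \tau)}.
\]
```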

Please let me know if you still have questions! Sorry for the confusion!

@sudalvxin
Author

Thanks for your reply. I will check the IR source code.

@sudalvxin
Author

Yes! I found that `Z` indeed cancels out in Eq. (3).

@WonderSeven

Hi, a similar question. When I use the `_softmax()` function, the output probability is always too large to optimize (about 1600). Since 2876934.2 does not make sense to me, can I manually set this parameter and 1281167 to make the outputs much smaller?

@chengxuz
Contributor

Hi SuiAn,

That is possible, although this value does not make a difference in our algorithm. You can also check Instance Discrimination's original implementation.

@WonderSeven

WonderSeven commented Dec 13, 2019

Thanks for your reply; the original implementation of Instance Discrimination is here. I will check it.
