
Quantitative results #4

Open

phucty opened this issue Sep 21, 2019 · 2 comments

phucty commented Sep 21, 2019

Thank you very much for taking the time to update the README (#3); I can run your code now.

However, I ran into a problem: the final quantitative results (using literals) are lower than the original ones (e.g., ConvE - https://github.com/TimDettmers/ConvE).
Additionally, the results in Table 3 of the LiteralE paper seem quite low compared to other papers such as [1] [2].
I am sorry for the inconvenience, but am I missing something?

The following are the results I got when running your scripts (the detailed logs are in the attachment):

| File | Model | FB15k-237 MRR |
|---|---|---|
| main_literal | ComplEx | 0.2699 |
| main_literal | DistMult | 0.3154 |
| main_literal | ConvE | 0.2980 |
| main_literal | DistMult_text | 0.3143 |

Phuc

logs_LiteralE.txt

[1] Rudolf Kadlec et al. Knowledge Base Completion: Baselines Strike Back. 2017.
[2] Timothée Lacroix et al. Canonical Tensor Decomposition for Knowledge Base Completion. 2018.

wiseodd (Collaborator) commented Sep 21, 2019

Hi again,

We use early stopping, so we usually don't look at the test results of the last epoch; this is mentioned in our paper. Instead, we look at the validation MRR just before it drops significantly during training (a small difference is fine, since training is stochastic). Once we've found that point, we look at the corresponding test MRR and report it.
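To make the selection concrete, here is a minimal sketch of the idea. This is not code from the repo; `select_test_mrr`, `history`, and the tolerance `tol` are hypothetical names, and the actual tolerance is a judgment call when reading the logs rather than a fixed constant:

```python
# Hypothetical sketch of the early-stopping selection described above:
# pick the last epoch before validation MRR falls significantly below its
# best value so far, then report the test MRR logged at that same epoch.

def select_test_mrr(history, tol=0.005):
    """history: list of (epoch, val_mrr, test_mrr) tuples in training order.
    `tol` is an illustrative tolerance for what counts as "significant"."""
    best_val = float("-inf")
    best_epoch, best_test = None, None
    for epoch, val_mrr, test_mrr in history:
        if val_mrr > best_val:            # new best validation score
            best_val = val_mrr
            best_epoch, best_test = epoch, test_mrr
        elif best_val - val_mrr > tol:    # significant drop: stop looking
            break                         # fluctuations <= tol are ignored
    return best_epoch, best_test

# Example: validation MRR peaks at epoch 200, so the test MRR logged
# at epoch 200 is the number that would be reported.
history = [(100, 0.300, 0.295), (200, 0.310, 0.305), (300, 0.302, 0.298)]
print(select_test_mrr(history))  # -> (200, 0.305)
```

In practice we read these numbers off the training logs rather than automating it, but the logic is the same: the reported test MRR comes from the epoch with roughly the best validation MRR, not from the last epoch.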

Regarding the results being lower than the ConvE paper's: this might be due to the hyperparameter search they did, and the fact that the ConvE code in this repo is quite old (ConvE's authors might have made many changes since then). We believe this doesn't really matter, since we are not interested in getting state-of-the-art results; rather, we are interested in how much improvement incorporating literals can give.

By the way, just to make it clear: our code is written on top of the ConvE codebase without changing anything other than extending the models with LiteralE. So naturally, if there are issues in the ConvE code, we are also affected by them.

phucty (Author) commented Sep 21, 2019

Hi,
Thank you for your answer; I understand that your goal is to measure the degree of improvement from incorporating literals.

Regarding early stopping, could you please let me know what the "drops significantly" threshold is? I rechecked the paper but could not find it.

Thank you very much.
Phuc
