Why is the performance so different from other papers? #4
Comments
Hi, The data splits used in the other papers are most likely different from the one used by us. Neither the MINER paper nor the one you referenced explicitly mentions which split of the MIND dataset they use, so I assume they used the test portion, for which the labels are not publicly available. In contrast, as explained in our paper (Section 2.5), we use the MINDdev portion of the dataset as our test split, and further split the MINDtrain dataset into training and validation portions.
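For clarity, the splitting scheme described above (MINDdev as the held-out test set, MINDtrain further divided into train and validation portions) can be sketched as follows. This is a minimal illustration, not the library's actual implementation; the validation fraction, seed, and function name are assumptions made for the example.

```python
import random

def split_behaviors(train_impressions, dev_impressions, val_frac=0.05, seed=42):
    """Illustrative split: MINDtrain -> train/val, MINDdev -> test.

    NOTE: val_frac, seed, and this interface are assumptions for the sketch;
    the actual split used by the paper may differ (see Section 2.5).
    """
    rng = random.Random(seed)
    shuffled = train_impressions[:]
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_frac)
    val = shuffled[:n_val]          # small held-out portion of MINDtrain
    train = shuffled[n_val:]        # remainder of MINDtrain
    test = dev_impressions          # MINDdev serves as the test split
    return train, val, test

# Usage with dummy impression IDs
train_ids = [f"imp_{i}" for i in range(100)]
dev_ids = [f"dev_{i}" for i in range(20)]
train, val, test = split_behaviors(train_ids, dev_ids)
print(len(train), len(val), len(test))  # 95 5 20
```

The key point is that the test labels here are fully available (they come from MINDdev), whereas papers evaluating on the official MIND test set submit predictions against hidden labels, so the two evaluation settings are not directly comparable.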
Hi, Yes, I understand that different data splits can lead to some variance, but a 10+ point AUC difference is too large. The dev and test sets come from the same dataset and should not show such a dramatic shift.
Hello, I ran into the same problem. I would like to know whether you later validated the reliability of this library by re-partitioning the training and test data, or by hyperparameter tuning using the hyperparameters from the original models such as MINER.
Hi Andreea,
I notice that the model performance reported in your paper is very different from the performance in the original papers.
For example, MINER (Li et al. 2019) got AUC=69.61 on the MIND-small dataset, but your reported performance is only AUC=51.2.
Compared to other work that reproduced the MINER model, this performance is much lower. For example, one paper reported that their reproduced MINER model achieved an AUC of 63.88.
In general, most GeneralRec models in your Table 1 got AUC < 52.00, which differs substantially from the performance reported in other papers.
Could you give any comments on this?