Evaluating DivHGNN #2

Open
emduc opened this issue Jun 18, 2024 · 1 comment

Comments

@emduc

emduc commented Jun 18, 2024

Hello again :)

Having tried different approaches to making small changes to DivHGNN, I wanted to get an idea of exactly how each part of your design contributes to the very good results highlighted in the paper. However, I cannot seem to reproduce them with a simple full_eval() run every epoch: without pruning, I only get up to 66.45% AUC. The paper mentions convergence after 24 epochs on average; how exactly do you choose the model? By simply taking the best-performing epoch (i.e. no validation set)? full_eval() applies the exponential decay, but I noticed that if I run it on only a fraction of the edges, say the first 10,000 sessions, I get much better results than when running on everything. Is that expected?
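
For clarity, here is a rough sketch of what I mean by the model-selection question, i.e. keeping whichever epoch scores best on the test set with no validation split (train_one_epoch, full_eval, model, train_loader and num_epochs are placeholders for your actual code, not names from the repo):

```python
import torch

# Placeholder illustration only: train_one_epoch, full_eval, model,
# train_loader and num_epochs stand in for the real objects in the repo.
best_auc, best_epoch = 0.0, -1
for epoch in range(num_epochs):
    train_one_epoch(model, train_loader)   # one training pass
    auc = full_eval(model)                 # AUC over the full test set
    if auc > best_auc:                     # keep the best epoch so far
        best_auc, best_epoch = auc, epoch
        torch.save(model.state_dict(), "best_divhgnn.pt")
```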

I'm sorry for digging into the model internals, but I greatly appreciate what you have built and I'd love to be able to reproduce it accurately.

@aSeriousCoder
Owner

Thank you for your message and for your continued interest in our work.

The observation you made about achieving better results using the first 10,000 test samples is something we have also encountered in our experiments. This phenomenon might be due to the high temporal relevance in news recommendations, where shorter time intervals often yield better results.

DivHGNN relies on neighbor sampling, and the choice of neighbors significantly impacts the results. Consequently, the training process can be somewhat unstable, and the selection of random seeds can influence the outcomes. Before releasing the code, we conducted several tests to ensure the robustness of the settings across different environments. If you are not achieving the expected results in your setup, you might try using the following seeds, which worked well in our experiments: 2048, 2233, and 2333.
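
For reference, a minimal seeding sketch along these lines, fixing every RNG involved in weight initialization and neighbor sampling before training (the set_seed helper is illustrative, not the exact code in the repo):

```python
import random
import numpy as np
import torch
import dgl

def set_seed(seed: int) -> None:
    """Fix the RNGs that drive weight initialization and DGL's neighbor sampling."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    dgl.seed(seed)

# Seeds that worked well in our experiments
set_seed(2048)  # alternatively 2233 or 2333
```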

The results reported in our paper were based on extensive experimentation and fine-tuning in our specific environment, so reproducing these results might present some challenges.
