Hi @JMMackenzie,
Merging this back into Anserini is in our plan. The implementation is ready in this repo, but some tests fail because of the new changes. I will create a pull request soon and discuss how to fix the tests or create new ones.
I don't have a systematic comparison across all the models, but as one example, with EPIC (topk=400) we observed the indexing time drop from 1 hour to under 15 minutes on our machine.
Hi all,
I just saw your paper this morning, great work! I had a quick look here at the repo and noticed that you have looked into a method to better deal with weighted documents than the Anserini "Fake Words" method.
They have an issue open on this: castorini/anserini#1890
It would be awesome if you could make a PR there, as this is a pain point for indexing huge collections.
We also ran into a super weird corner case bug in the past relating to the jsonvector method, see: castorini/anserini#1843
Anyway, just wanted to point it out because it would be nice to contribute it back (I realise you probably intended to do this anyway, but I may as well mention it while it's on my mind).
Cheers!