Run benchmarking with the supported models on BEIR MSMARCO #177

Open
HAKSOAT opened this issue Jan 12, 2025 · 1 comment
Labels: evaluation, help wanted

Comments

@HAKSOAT (Collaborator) commented Jan 12, 2025

We need to run benchmarking on the BEIR MSMARCO dataset to better understand how the supported models perform on retrieval tasks.

We can use the test split available on the Hugging Face Hub:

QRels
Corpus

Proposed metrics:

  • NDCG@10
  • Precision@10
  • Recall@100

Non-judged documents should be treated as non-relevant; a loading sketch is below.
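
A rough loading sketch for this setup. The Hub repo IDs (`BeIR/msmarco`, `BeIR/msmarco-qrels`), config/split names, and column names are assumptions based on the usual BEIR layout on the Hub, so adjust them to whatever the links above actually point at:

```python
from collections import defaultdict
from datasets import load_dataset

# NOTE: repo IDs, config/split names, and column names are assumptions based
# on the typical BeIR layout on the Hub; adjust to the linked files above.
corpus_ds = load_dataset("BeIR/msmarco", "corpus", split="corpus")
queries_ds = load_dataset("BeIR/msmarco", "queries", split="queries")
qrels_ds = load_dataset("BeIR/msmarco-qrels", split="test")

# {doc_id: text} and {query_id: text}
corpus = {str(r["_id"]): r["text"] for r in corpus_ds}
queries = {str(r["_id"]): r["text"] for r in queries_ds}

# Only judged pairs are stored in the qrels, so any (query, doc) pair that is
# absent from this mapping is implicitly scored as non-relevant.
relevant_docs = defaultdict(set)
for r in qrels_ds:
    if r["score"] > 0:
        relevant_docs[str(r["query-id"])].add(str(r["corpus-id"]))

# Evaluate only queries that have at least one judged relevant document.
queries = {qid: text for qid, text in queries.items() if qid in relevant_docs}
```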

HAKSOAT added the evaluation and help wanted labels on Jan 12, 2025
@HAKSOAT (Collaborator, Author) commented Jan 17, 2025

The InformationRetrievalEvaluator from SentenceTransformers can be helpful for this.
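
A minimal sketch of how it might be wired up, with toy data standing in for the MSMARCO test split (the model name is just a placeholder, and the queries/corpus/relevant_docs dicts would come from the loading step described above):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

# Toy data for illustration; in practice use the queries/corpus/relevant_docs
# dicts built from the BEIR MSMARCO test split as sketched in the issue body.
queries = {"q1": "what is the capital of france"}
corpus = {"d1": "Paris is the capital of France.", "d2": "Berlin is in Germany."}
relevant_docs = {"q1": {"d1"}}  # unjudged docs are simply absent, i.e. non-relevant

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    ndcg_at_k=[10],                   # NDCG@10
    precision_recall_at_k=[10, 100],  # Precision@10/100 and Recall@10/100
    name="beir-msmarco-test",
    show_progress_bar=True,
)

# Placeholder model; rerun this once per supported model being benchmarked.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
scores = evaluator(model)  # recent sentence-transformers releases return a dict of metrics
print(scores)
```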
