
Evaluation of bucc2018 #60

Open
chenQ1114 opened this issue Mar 5, 2021 · 0 comments

chenQ1114 commented Mar 5, 2021

Hi,

I notice that the bucc2018 evaluation needs a threshold before scoring the dev or test sets, and that threshold is determined from the gold file. We can compute the threshold for the dev set, but not for the test set, since I cannot find the bucc2018 test gold file. So all I can do is generate the prediction file without any score filtering and submit it to the leaderboard. Is that right? I suspect this affects the reported bucc2018 performance.

According to the code in third_party/utils_retrieve.py, the threshold has to be determined before generating the prediction file:

```python
def bucc_eval(candidates_file, gold_file, src_file, trg_file, src_id_file, trg_id_file,
              predict_file, mode, threshold=None, encoding='utf-8'):
  candidate2score = read_candidate2score(candidates_file, src_file, trg_file,
                                         src_id_file, trg_id_file, encoding)
  threshold = bucc_optimize(candidate2score, gold)
  bitexts = bucc_extract(candidate2score, threshold, predict_file)
```
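
Is the intended approach something like the following, i.e. tune the cutoff on the dev gold pairs and then reuse it when extracting the test predictions? This is just my own self-contained sketch of the idea, not the repo's helpers, and the sweep over sorted scores below is only my guess at what bucc_optimize does:

```python
def optimize_threshold(candidate2score, gold_pairs):
    """Return the score cutoff that maximizes F1 on a labeled (dev) split.

    candidate2score: dict mapping a candidate pair id (e.g. "src-id\ttrg-id") to a score.
    gold_pairs: iterable of the true pair ids for the same split (assumed non-empty).
    """
    gold = set(gold_pairs)
    # Sweep candidates from highest to lowest score; each prefix corresponds to
    # keeping every pair whose score is at least the current one.
    items = sorted(candidate2score.items(), key=lambda kv: -kv[1])
    best_f1, best_threshold = 0.0, float("inf")
    true_pos = 0
    for rank, (pair, score) in enumerate(items, start=1):
        if pair in gold:
            true_pos += 1
        precision = true_pos / rank
        recall = true_pos / len(gold)
        if precision + recall > 0:
            f1 = 2 * precision * recall / (precision + recall)
            if f1 > best_f1:
                best_f1, best_threshold = f1, score
    return best_threshold


def extract_pairs(candidate2score, threshold):
    """Keep every candidate pair whose score clears the (dev-tuned) cutoff."""
    return [pair for pair, score in candidate2score.items() if score >= threshold]


# Usage: tune on dev (gold available), then apply the same cutoff to test (no gold released).
# dev_threshold = optimize_threshold(dev_candidate2score, dev_gold_pairs)
# test_pairs = extract_pairs(test_candidate2score, dev_threshold)
```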
