Hi,
I notice that a threshold is needed before evaluating bucc2018 on the dev or test sets, and that this threshold is determined from the gold file. We can compute the threshold for the dev set, but not for the test set. So I can only generate the prediction files without the score-filtering step and submit them to the leaderboard, since I cannot find the test gold file for bucc2018. Is that right? I guess this will affect the bucc2018 performance.
According to the code in third_party/utils_retrieve.py, we should determine the threshold before generating the prediction file:
```python
def bucc_eval(candidates_file, gold_file, src_file, trg_file, src_id_file, trg_id_file, predict_file, mode, threshold=None, encoding='utf-8'):
    candidate2score = read_candidate2score(candidates_file, src_file, trg_file, src_id_file, trg_id_file, encoding)
    # `gold` is built from gold_file (its loading is elided in this excerpt),
    # so the test split cannot reach this line without a test gold file
    threshold = bucc_optimize(candidate2score, gold)
    bitexts = bucc_extract(candidate2score, threshold, predict_file)
```
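
For reference, my understanding of what the gold-based step does, as a rough sketch (not the actual code from third_party/utils_retrieve.py): it sweeps the candidate scores in descending order, tracks F1 against the gold pairs, and places the threshold where F1 peaks. The names `optimize_threshold` and `extract_above_threshold` below are just for illustration, assuming `candidate2score` maps candidate pairs to similarity scores and `gold` is the set of gold pairs read from gold_file:

```python
# Sketch of an F1-optimizing threshold search in the spirit of bucc_optimize.
def optimize_threshold(candidate2score, gold):
    items = sorted(candidate2score.items(), key=lambda x: -x[1])  # descending by score
    ngold, ncorrect = len(gold), 0
    best_f1, threshold = 0.0, 0.0
    for i, (pair, score) in enumerate(items):
        if pair in gold:
            ncorrect += 1
        if ncorrect > 0:
            precision = ncorrect / (i + 1)
            recall = ncorrect / ngold
            f1 = 2 * precision * recall / (precision + recall)
            if f1 > best_f1:
                best_f1 = f1
                # place the cut-off between this score and the next one
                threshold = (score + items[i + 1][1]) / 2 if i + 1 < len(items) else score
    return threshold

# The "score filter" step: keep only candidate pairs scoring above the threshold.
def extract_above_threshold(candidate2score, threshold):
    return [pair for pair, score in candidate2score.items() if score >= threshold]
```

On dev, the gold pairs make this sweep possible; on test, with no gold file, there is nothing to optimize against, which is exactly my problem.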