When evaluating model output with nervaluate.Evaluator, I expected the possible value for the label country to be 1, since country occurs only once in the gold data. However, results_per_tag['country']['strict'] reports possible as 2.
Code:
from nervaluate import Evaluator
labels = ['country', 'postcode', 'city']
true = [[{"label": "postcode", "start": 0, "end": 6, "original_string": "529479"}, {"label": "country", "start": 15, "end": 17, "original_string": "中国"}]]
preds = [[{'label': 'postcode', 'start': 0, 'end': 6, 'original_string': '529479'},
          {'label': 'city', 'start': 9, 'end': 15, 'original_string': '漯河市'},
          {'label': 'country', 'start': 15, 'end': 17, 'original_string': '中国'}]]
evaluator = Evaluator(true, preds, tags=labels)
results, results_per_tag, result_indices, result_indices_by_tag = evaluator.evaluate()
print(results_per_tag['country']['strict'])
Observed output:
{'correct': 1,
'incorrect': 1,
'partial': 0,
'missed': 0,
'spurious': 0,
'possible': 2,
'actual': 2,
'precision': 0.5,
'recall': 0.5,
'f1': 0.5}
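As a sanity check, here is a minimal sketch that recomputes possible and actual from the other counts. It assumes nervaluate follows the SemEval-2013 style definitions (possible = correct + incorrect + partial + missed, actual = correct + incorrect + partial + spurious). The helper names are hypothetical and not part of nervaluate. Under that assumption, the reported possible of 2 appears to come from the extra incorrect count rather than from a second country entity in the gold data.

```python
# Counts copied from the observed output for results_per_tag['country']['strict'].
observed = {
    "correct": 1,
    "incorrect": 1,
    "partial": 0,
    "missed": 0,
    "spurious": 0,
    "possible": 2,
    "actual": 2,
}


def possible_from_counts(counts):
    # Assumed definition: everything the gold data is counted as containing.
    return (counts["correct"] + counts["incorrect"]
            + counts["partial"] + counts["missed"])


def actual_from_counts(counts):
    # Assumed definition: everything the system is counted as predicting.
    return (counts["correct"] + counts["incorrect"]
            + counts["partial"] + counts["spurious"])


print(possible_from_counts(observed), actual_from_counts(observed))
```

If these definitions apply, the real question becomes why incorrect is 1 for country when the single predicted country span matches the gold span exactly.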
Please let me know whether I am misunderstanding how possible is calculated for each label, or whether other conditions affect the possible value.