I am comparing the results already in the repo with the results at `/home/turx/EvalBase/results`. Some methods have different results in the two locations.
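For example, this is a segment from `realsumm_abs_summary.txt` in the repo, under `pearsonr`: https://github.com/SigmaWe/DocAsRef_0/blob/de4de4b4275e661621bebf3b2f92d8676e2f81c2/results/realsumm_abs_summary.txt#L8-L13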
But this is the corresponding segment from `/home/turx/EvalBase/results`, also under `pearsonr`:

    trad  bertscore-sentence  P   0.087
                              R   0.308
                              F   0.233
    new   bertscore-sentence  P  -0.067
                              R   0.326
                              F   0.222
I am really worried about the correctness of the code. Please find out what has changed that caused this discrepancy. We have had incidents like this before, and I do not want it to happen again. I do not want to publish a paper based on wrong results.
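One way to see exactly which lines differ between the two copies is a plain text diff. The following is only a rough sketch using Python's standard `difflib`; the helper names `diff_results` and `diff_result_files` are made up for illustration, and it assumes the copy under `/home/turx/EvalBase/results` has the same filename as the one in the repo.

```python
import difflib
from pathlib import Path


def diff_results(old_text: str, new_text: str,
                 old_name: str = "repo", new_name: str = "local") -> list[str]:
    """Return unified-diff lines between two result summaries (hypothetical helper)."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",  # input lines have no trailing newline, so add none to headers
    ))


def diff_result_files(old_path: str, new_path: str) -> list[str]:
    """Read two summary files and diff them line by line (hypothetical helper)."""
    old_text = Path(old_path).read_text()
    new_text = Path(new_path).read_text()
    return diff_results(old_text, new_text, old_path, new_path)
```

Calling `diff_result_files("results/realsumm_abs_summary.txt", "/home/turx/EvalBase/results/realsumm_abs_summary.txt")` and printing the returned lines would list every row that changed. An empty list means the two files are identical. This only shows where the numbers differ, not why, so it is a starting point for checking which commit or dependency change lines up with the affected rows.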
Please check whether spaCy's sentence segmentation output makes sense on the test data. The test data frequently contains lexical noise. Paste some examples here, with both inputs and outputs, so we can investigate, using both `doc.split(".")` and spaCy.
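A side-by-side comparison along these lines could look like the sketch below. It makes several assumptions: spaCy v3 is installed, and the rule-based `sentencizer` on a blank English pipeline stands in for whatever spaCy pipeline the evaluation code actually loads (swap that in to reproduce the real behaviour). The function names are hypothetical.

```python
def split_on_period(doc: str) -> list[str]:
    """Naive segmentation: split on every '.', drop empty pieces."""
    return [piece.strip() for piece in doc.split(".") if piece.strip()]


def split_with_spacy(doc: str) -> list[str]:
    """Segment with spaCy's rule-based sentencizer.

    Assumption: a blank English pipeline plus 'sentencizer' is used here
    only so that no trained model has to be downloaded. The project may use
    a different pipeline, which can segment differently.
    """
    import spacy  # imported lazily so the naive splitter works without spaCy

    nlp = spacy.blank("en")
    nlp.add_pipe("sentencizer")
    return [sent.text.strip() for sent in nlp(doc).sents]


def compare_segmentations(doc: str) -> None:
    """Print the input and both segmentations for pasting into the issue."""
    print("INPUT:", repr(doc))
    for name, splitter in (("split('.')", split_on_period),
                           ("spaCy", split_with_spacy)):
        print(f"--- {name} ---")
        for i, sentence in enumerate(splitter(doc)):
            print(f"{i}: {sentence!r}")
```

Running `compare_segmentations(text)` on a handful of test documents, especially ones with abbreviations, decimals, or stray punctuation, gives output that can be pasted here directly. Note that `doc.split(".")` also breaks on decimals and abbreviations, so a string like `"Price rose 3.5 percent. Dr. Smith agreed."` comes out as four pieces rather than two sentences.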