discrepant result due to sentence segmentation #8

forrestbao · 2022-12-04T07:40:48Z

I am comparing the results already in the repo with the results at /home/turx/EvalBase/results. Some methods have different results in these two locations.

For example, this is a segment from realsumm_abs_summary.txt from the repo, under pearsonr:

https://github.com/SigmaWe/DocAsRef_0/blob/de4de4b4275e661621bebf3b2f92d8676e2f81c2/results/realsumm_abs_summary.txt#L8-L13

But this is the corresponding segment from /home/turx/EvalBase/results, also under pearsonr:

trad      bertscore-sentence       P                       0.087
                                   R                       0.308
                                   F                       0.233
new       bertscore-sentence       P                      -0.067
                                   R                       0.326
                                   F                       0.222

I am really worried about the code accuracy. Please find out what has changed that caused this discrepancy. We have had incidents like this before. I do not want to have it again. I do not want to publish a paper based on wrong results.


TURX · 2022-12-04T07:42:31Z

Previously, we segmented sentences by splitting on "." (i.e., doc.split(".")); now we use spaCy's sentence segmentation for bertscore-sentence.

forrestbao · 2022-12-04T07:49:44Z

Please check whether spaCy's sentence segmentation output makes sense on the test data, which frequently contains lexical noise. Paste some examples here, inputs and outputs, using both `doc.split(".")` and spaCy, so we can investigate.
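A minimal sketch of such a side-by-side check, in case it helps. The helper names (`split_on_period`, `make_spacy_segmenter`, `compare_segmentations`) and the sample sentence are hypothetical and not taken from the repo, and the spaCy model name `en_core_web_sm` is an assumption, since the thread does not say which pipeline is used. Stripping whitespace and dropping empty pieces is also added here for readability; a plain `doc.split(".")` keeps them.

```python
from typing import Callable, Dict, List


def split_on_period(doc: str) -> List[str]:
    """Naive segmentation in the spirit of doc.split(".").

    Assumption: whitespace is stripped and empty pieces are dropped,
    which a bare doc.split(".") would not do.
    """
    return [piece.strip() for piece in doc.split(".") if piece.strip()]


def make_spacy_segmenter(model: str = "en_core_web_sm") -> Callable[[str], List[str]]:
    """Build a spaCy-based segmenter.

    Assumption: the model name is a placeholder; use whichever pipeline
    the evaluation code actually loads. spaCy is imported lazily so the
    rest of this sketch runs without it installed.
    """
    import spacy

    nlp = spacy.load(model)

    def segment(doc: str) -> List[str]:
        return [sent.text.strip() for sent in nlp(doc).sents]

    return segment


def compare_segmentations(
    doc: str, segmenters: Dict[str, Callable[[str], List[str]]]
) -> Dict[str, List[str]]:
    """Run every segmenter on the same input and collect the outputs."""
    return {name: segment(doc) for name, segment in segmenters.items()}


# Hypothetical noisy input: abbreviations and decimals break the naive split.
sample = "The U.S. team scored 3.5 points. It won."
for name, sentences in compare_segmentations(sample, {"split": split_on_period}).items():
    print(name, len(sentences), sentences)
```

To include spaCy in the comparison, add `"spacy": make_spacy_segmenter()` to the dictionary passed to `compare_segmentations`, then run it on a few documents and summaries from the test data. A difference in the number of sentences per document would be one plausible source of the P/R/F shifts above, though that still needs to be confirmed on real examples.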

forrestbao assigned TURX Dec 4, 2022

forrestbao added the experiment label Dec 4, 2022

forrestbao changed the title ~~discrepant result~~ discrepant result due to sentence segmentation Jan 1, 2023
