Preprocessed MIMIC-CXR annotation seems to be different from others #7
Comments
Thanks for your interest. We referred to the official preprocessing code provided for parsing the reports, which contain both the impression and findings sections. Since both sections are crucial for a complete report, we retained them in full. Regarding the discrepancy you observed, could you please share your experimental results and specify which "others" you are comparing against?
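For readers unfamiliar with the section split being discussed, a minimal sketch of pulling the FINDINGS and IMPRESSION sections out of a raw report might look like the following. This is an illustration, not the official preprocessing script; the header names and report layout are assumptions about the raw report format.

```python
import re

# Hypothetical helper (illustration only, not the repository's preprocessing code):
# capture the body of each FINDINGS/IMPRESSION section up to the next all-caps header.
SECTION_RE = re.compile(
    r"(FINDINGS|IMPRESSION):\s*(.*?)(?=\n[A-Z ]+:|\Z)",
    flags=re.DOTALL,
)

def extract_sections(report_text: str) -> dict:
    """Return {'findings': ..., 'impression': ...}; empty string if a section is missing."""
    sections = {"findings": "", "impression": ""}
    for header, body in SECTION_RE.findall(report_text):
        sections[header.lower()] = " ".join(body.split())
    return sections

def build_target(report_text: str, use_impression: bool = True) -> str:
    """Concatenate findings (and optionally impression) into a single training target."""
    s = extract_sections(report_text)
    parts = [s["findings"]] + ([s["impression"]] if use_impression else [])
    return " ".join(p for p in parts if p)
```

Whether `use_impression` is True or False is exactly the preprocessing choice debated in this thread.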
Thanks for your response! For example, R2Gen is a well-known baseline, and I notice you directly adopted the results reported in their article.
This is the result of using your code to predict only the findings section; the CIDEr is a little lower.
Thank you for pointing out this issue. As we mentioned, our work processes the official dataset, which includes both the impression and findings sections. The impression is a crucial part of a diagnostic report, and we should not omit it simply because an earlier work did not use it. We would also like to clarify that, in the results reported in our paper, any method that was not fairly reproduced is marked with a dagger symbol. Many works do not have open-source code, and in our experience even different data preprocessing can cause variations in outcomes, so comparisons with these methods cannot be guaranteed to be absolutely fair. You can refer to the methods without the dagger symbol; these are the ones we replicated ourselves under the same experimental setup, and they can be compared more fairly.

Still, your point is interesting. We trained our model using only the findings section and obtained the following results: {'Bleu_1': 0.404, 'Bleu_2': 0.252, 'Bleu_3': 0.169, 'Bleu_4': 0.121, 'ROUGE_L': 0.277, 'METEOR': 0.155, 'CIDEr': 0.209}. If you prefer to use only findings, you are welcome to use this result.
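For context, the scores above are the standard corpus-level NLG metrics for report generation. A minimal sketch of computing them with the pycocoevalcap package, assuming tokenized reference/hypothesis strings keyed by report id (this is not necessarily the repository's own evaluation code), could look like this:

```python
# Illustrative sketch only: compute BLEU/METEOR/ROUGE-L/CIDEr with pycocoevalcap.
# Note: the METEOR scorer requires a working Java installation.
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.meteor.meteor import Meteor
from pycocoevalcap.rouge.rouge import Rouge
from pycocoevalcap.cider.cider import Cider

def compute_nlg_metrics(references: dict, hypotheses: dict) -> dict:
    """references/hypotheses: {report_id: [single tokenized string]}."""
    metrics = {}
    bleu, _ = Bleu(4).compute_score(references, hypotheses)
    metrics.update({f"Bleu_{i + 1}": b for i, b in enumerate(bleu)})
    metrics["METEOR"], _ = Meteor().compute_score(references, hypotheses)
    metrics["ROUGE_L"], _ = Rouge().compute_score(references, hypotheses)
    metrics["CIDEr"], _ = Cider().compute_score(references, hypotheses)
    return metrics

# Example usage with toy data:
# refs = {"1": ["no acute cardiopulmonary process ."]}
# hyps = {"1": ["no acute cardiopulmonary abnormality ."]}
# print(compute_nlg_metrics(refs, hyps))
```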
Thank you for your attention and for the effort of retraining the model on the findings section. I am happy to use your result, and I agree with you that "different data preprocessing can result in variations in outcomes".
Hi, could you also share the CE (clinical efficacy) results for your model when it predicts only the findings section? I think this result would also differ from the one in your paper, which predicts both the impression and findings. Thanks.
Your preprocessed MIMIC-CXR annotation contains both the impression and findings sections. However, other settings usually use only the findings section of a report, as in the R2Gen model. I ran your model on annotations containing only the findings, and the result is lower than others'. Can you explain? I'm not sure if I made a mistake somewhere.
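For illustration only, a findings-only annotation file could be derived along these lines. The R2Gen-style layout ('train'/'val'/'test' lists with a 'report' field) and the presence of a "FINDINGS:" header in the report text are assumptions, not confirmed by this repository.

```python
import json
import re

# Hypothetical helper: keep only the FINDINGS section of each report in an
# R2Gen-style annotation file. Key names and file layout are assumptions.
FINDINGS_RE = re.compile(r"FINDINGS:\s*(.*?)(?=\n[A-Z ]+:|\Z)", flags=re.DOTALL)

def to_findings_only(in_path: str, out_path: str) -> None:
    with open(in_path) as f:
        ann = json.load(f)
    for split in ("train", "val", "test"):
        for example in ann.get(split, []):
            match = FINDINGS_RE.search(example["report"])
            if match:
                example["report"] = " ".join(match.group(1).split())
    with open(out_path, "w") as f:
        json.dump(ann, f)
```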